Kubernetes: Running Nitro Enclaves with Amazon EKS and getting Insufficient hugepages-2Mi for pods

siotufzp posted 5 months ago in Kubernetes

I am following this article to use Nitro Enclaves on EKS. My pod reports a warning and stays in the Pending state.

0/2 nodes are available: 2 Insufficient aws.ec2.nitro/nitro_enclaves, 2 
Insufficient hugepages-2Mi. preemption: 0/2 nodes are available: 
2 No preemption victims found for incoming pod.

When I inspect the node, I see the following:

kubectl describe node ip-x.us-east-2.compute.internal | grep -A 8 "Allocated resources:"
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                325m (4%)   0 (0%)
  memory             140Mi (0%)  340Mi (2%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)

kubectl describe node ip-x.us-east-2.compute.internal | grep -A 13 "Capacity:"                                                                                                                                                                                          
Capacity:
  cpu:                8
  ephemeral-storage:  83873772Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             15896064Ki
  pods:               29
Allocatable:
  cpu:                7910m
  ephemeral-storage:  76224326324
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             14879232Ki
  pods:               29


The pod definition includes:

"containers": [
      {
        "name": "hello-container",
        "image": "hello-f9c725ee-4d02-4f48-8c3f-f341a754061b:latest",
        "command": [
          "/home/run.sh"
        ],
        "resources": {
          "limits": {
            "aws.ec2.nitro/nitro_enclaves": "1",
            "cpu": "250m",
            "hugepages-2Mi": "100Mi"
          },
          "requests": {
            "aws.ec2.nitro/nitro_enclaves": "1",
            "cpu": "250m",
            "hugepages-2Mi": "100Mi"
          }
        },


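Things I have tried: after reading several other posts, I tried scaling vertically and horizontally and restarting the kubelet service, but without success. The pod still stays in the Pending state.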

41zrol4v1#

I think there may be two separate issues here: one related to the missing hugepages-2Mi, and one related to the missing aws.ec2.nitro/nitro_enclaves.
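For hugepages-2Mi, make sure that the launch template created in step 1 is actually applied to the nodes in the nitro-providing EKS node group, and that the user data is set correctly on that launch template. Note that if you modify the user data to provide a number of MB that is a multiple of 1024, you get hugepages-1Gi instead of hugepages-2Mi, as mentioned under limits in step 5.1.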
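The node output in the question shows hugepages-2Mi: 0 under Capacity, so the scheduler has nothing to hand out. A rough way to confirm this on the node itself is sketched below. The helper names pages_for_mib and hugepages_total are made up for illustration, and the sketch assumes the node's default huge page size is 2 MiB (which is what HugePages_Total in /proc/meminfo counts on a typical x86_64 instance).

```shell
#!/bin/sh
# Hypothetical helpers, not part of the article or of any AWS tooling.

# Number of 2 MiB pages needed to back a hugepages-2Mi request given in MiB,
# rounded up. The pod in the question requests 100Mi.
pages_for_mib() {
    echo $(( ($1 + 1) / 2 ))
}

# Read HugePages_Total from a meminfo-style file (defaults to /proc/meminfo).
# Assumption: the default huge page size on the node is 2 MiB.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Example, to be run on the node (for instance over SSH or SSM):
#   need=$(pages_for_mib 100)
#   have=$(hugepages_total)
#   [ "$have" -ge "$need" ] || echo "node has $have huge pages, pod needs $need"
```

If the count on the node is 0, the user data from the launch template most likely never ran on that instance, which matches the first point above.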
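For aws.ec2.nitro/nitro_enclaves, you need to make sure that the pods of the DaemonSet provided by https://raw.githubusercontent.com/aws/aws-nitro-enclaves-k8s-device-plugin/main/aws-nitro-enclaves-k8s-ds.yaml are running on the nitro-enabled nodes. They might be missing because the DaemonSet was not added to K8s properly, or because the node labels of the nitro-enabled nodes are incorrect (they should be aws-nitro-enclaves-k8s-dp=enabled, which should be visible in kubectl describe node). If the DaemonSet pods are in fact up and running, the plugin itself might be having issues. You can check this with kubectl logs --namespace=kube-system -l name=aws-nitro-enclaves-k8s-dp --tail=1000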
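The label and DaemonSet checks described above could be scripted roughly as follows. The function names are hypothetical; the label, namespace, and pod selector are the ones quoted in this answer, and the sketch assumes kubectl is already pointed at the EKS cluster.

```shell
#!/bin/sh
# Hypothetical wrappers around kubectl, for illustration only.

# Nodes that carry the label the device-plugin DaemonSet is expected to select.
nitro_labeled_nodes() {
    kubectl get nodes -l aws-nitro-enclaves-k8s-dp=enabled -o name
}

# Device-plugin pods, with the node each one is scheduled on.
nitro_plugin_pods() {
    kubectl get pods --namespace=kube-system -l name=aws-nitro-enclaves-k8s-dp -o wide
}

# Usage:
#   nitro_labeled_nodes   # empty output: no node has the label
#   nitro_plugin_pods     # ideally one Running pod per labeled node
# If a nitro-enabled node lacks the label, it can be added with:
#   kubectl label node <node-name> aws-nitro-enclaves-k8s-dp=enabled
```

If the first command prints nothing, the label is the likely culprit. If nodes are listed but no plugin pods show up, the DaemonSet itself is probably missing from the cluster.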
