Fixing a Deployment Stuck at 0/1 on the China Unicom Cloud 7CKP Platform's CSK (K8s)

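The tooling I had deployed earlier lacked some network troubleshooting commands, so I pulled the netshoot image and added a few more network diagnostic tools to it.

I then wrote a Deployment YAML file: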

guoguo@9-会服yaml文件$ cat netshoot.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: netshoot
  namespace: gxhyfw
  labels:
    app: netshoot
    env: prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: netshoot
      env: prod
  template:
    metadata:
      labels:
        app: netshoot
        env: prod
    spec:
      containers:
      - name: netshoot
        image: /xxxxx-hlccr/netshoot:latest
        args:
        - /bin/bash
        - -c
        - >
          while :; do
            echo "[$(date +%F\ %T)] hello"
            sleep 1
          done

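The dry run passes.

In production, always run with --dry-run first, whether or not the change is business-related. Note that a client-side dry run only checks the manifest and creates no Pods, so it cannot surface admission errors such as the quota failure shown later.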
guoguo@9-会服yaml文件$ kubectl apply -f netshoot.yaml --dry-run
W0301 10:45:37.032693    4449 helpers.go:636] --dry-run is deprecated and can be replaced with --dry-run=client.
deployment.apps/netshoot created (dry run)

Create the Deployment:

guoguo@9-会服yaml文件$ kubectl apply -f netshoot.yaml
deployment.apps/netshoot created

However, no Pod was created:

guoguo@9-会服yaml文件$ kubectl get pods -n gxhyfw | grep netshoot
guoguo@9-会服yaml文件$

The Deployment status shows 0/1:

guoguo@9-会服yaml文件$ kubectl get deployment -n gxhyfw netshoot
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
netshoot   0/1     0            0           72s

The China Unicom Cloud CSK web console shows the workload as stopped.

Describing the Deployment shows nothing abnormal either:

guoguo@9-会服yaml文件$ kubectl describe deployment -n gxhyfw netshoot | grep -A100 Events
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  3m42s  deployment-controller  Scaled up replica set netshoot-79b9d974f5 to 1

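Describing the ReplicaSet, however, reveals errors.

The cause: the container does not declare the CPU and memory requests and limits that the namespace's resource quota requires.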
guoguo@9-会服yaml文件$ kubectl describe replicaset netshoot -n gxhyfw | grep -A100 Events
Events:
  Type     Reason        Age                    From                   Message
  ----     ------        ----                   ----                   -------
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-z6fwb" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-qhl5x" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-66xrj" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-qnvtc" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-zbvsx" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-4xjn6" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m1s                   replicaset-controller  Error creating: pods "netshoot-79b9d974f5-lt9zq" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m                     replicaset-controller  Error creating: pods "netshoot-79b9d974f5-p7svw" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  5m                     replicaset-controller  Error creating: pods "netshoot-79b9d974f5-nx9gk" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot
  Warning  FailedCreate  2m17s (x7 over 4m58s)  replicaset-controller  (combined from similar events): Error creating: pods "netshoot-79b9d974f5-gn4b7" is forbidden: failed quota: gxhyfw: must specify limits.cpu for: netshoot; limits.memory for: netshoot; requests.cpu for: netshoot; requests.memory for: netshoot

After adding resource requests and limits and re-applying the manifest, the Deployment succeeds:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: netshoot
  namespace: gxhyfw
  labels:
    app: netshoot
    env: prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: netshoot
      env: prod
  template:
    metadata:
      labels:
        app: netshoot
        env: prod
    spec:
      containers:
      - name: netshoot
        image: xxxxx.xxxx.xxxx/xxxxxx-hlccr/netshoot:latest
        resources:
          limits:
            cpu: "1"
            memory: "512Mi"
          requests:
            cpu: "0.5"
            memory: "256Mi"
        args:
        - /bin/bash
        - -c
        - >
          while :; do
            echo "[$(date +%F\ %T)] hello"
            sleep 1
          done

The dry run passes.

In production, always run with --dry-run first, whether or not the change is business-related.
guoguo@9-会服yaml文件$ kubectl apply -f netshoot.yaml --dry-run
W0301 10:56:24.510461    7745 helpers.go:636] --dry-run is deprecated and can be replaced with --dry-run=client.
deployment.apps/netshoot configured (dry run)

Applying the manifest for real succeeds, and the Pod is created:

guoguo@9-会服yaml文件$ kubectl apply -f netshoot.yaml
deployment.apps/netshoot configured
guoguo@9-会服yaml文件$ kubectl get pods -n gxhyfw | grep netshoot
netshoot-57cbcbfd95-pg9qf                      1/1     Running   0               58s

Success.

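Root cause analysis

ResourceQuota enforcement: a Kubernetes ResourceQuota caps the aggregate resource usage of a namespace. When a quota tracks compute resources such as requests.cpu or limits.memory, every Pod created in that namespace must explicitly declare a value for each tracked resource (or receive one from LimitRange defaults); otherwise the Pod is rejected at creation time.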
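
To make the mechanism concrete, here is a minimal sketch of a ResourceQuota that would produce the error seen earlier. The object name gxhyfw is taken from the "failed quota: gxhyfw" part of the error message; the four amounts are purely hypothetical, since the platform's actual quota values do not appear in this post.

```yaml
# Illustrative only: the amounts below are assumptions, not the
# values actually configured on the platform.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gxhyfw               # quota name as reported in the error message
  namespace: gxhyfw
spec:
  hard:
    requests.cpu: "4"        # total CPU requests allowed in the namespace
    requests.memory: 8Gi     # total memory requests allowed
    limits.cpu: "8"          # total CPU limits allowed
    limits.memory: 16Gi      # total memory limits allowed
```

Because this quota tracks all four compute resources, a Pod that omits any of them is rejected with a "must specify" message. The real values can be inspected with kubectl describe resourcequota -n gxhyfw.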

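The CSK (Container Service Kubernetes) component of the China Unicom Cloud CKP platform enables ResourceQuota and LimitRange. Judging by the error, though, no LimitRange default was applied to this container; otherwise requests and limits would have been filled in before the quota check.

  • In the error message, failed quota: gxhyfw names the ResourceQuota object that rejected the Pod, which shows directly that the namespace is under a quota.
  • The netshoot Pod did not define a resources field for its container, so the quota check failed.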
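
As an alternative to editing every manifest, a namespace administrator can define a LimitRange that injects default requests and limits into containers that omit them. The defaults are applied at admission time, so such Pods then pass the quota check. The sketch below is an assumption about how that could look here: the object name is hypothetical, the values simply mirror the ones used in the fixed Deployment, and whether tenants are allowed to create a LimitRange on CSK is not confirmed by this post.

```yaml
# Hypothetical LimitRange; name and values are assumptions for illustration.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # assumed name
  namespace: gxhyfw
spec:
  limits:
  - type: Container
    default:                 # applied as limits when a container sets none
      cpu: "1"
      memory: 512Mi
    defaultRequest:          # applied as requests when a container sets none
      cpu: 500m
      memory: 256Mi
```

Declaring resources explicitly in each manifest, as done in the fix, remains the more transparent choice; a LimitRange default mainly protects against manifests that forget the field.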