Problem description

While installing Kubernetes with Ansible, the run ends with the error shown below, which indicates that kubelet failed to start.

TASK [kube-node : 轮询等待kubelet启动] ******************************************************************************************************************************
fatal: [192.168.10.52]: FAILED! => {"attempts": 4, "changed": true, "cmd": "systemctl is-active kubelet.service", "delta": "0:00:00.006796", "end": "2023-02-01 22:30:10.756458", "msg": "non-zero return code", "rc": 3, "start": "2023-02-01 22:30:10.749662", "stderr": "", "stderr_lines": [], "stdout": "activating", "stdout_lines": ["activating"]}
fatal: [192.168.10.51]: FAILED! => {"attempts": 4, "changed": true, "cmd": "systemctl is-active kubelet.service", "delta": "0:00:00.010879", "end": "2023-02-01 22:30:10.859450", "msg": "non-zero return code", "rc": 3, "start": "2023-02-01 22:30:10.848571", "stderr": "", "stderr_lines": [], "stdout": "activating", "stdout_lines": ["activating"]}

PLAY RECAP **********************************************************************************************************************************************************
192.168.10.51 : ok=50 changed=30 unreachable=0 failed=1 skipped=1 rescued=0 ignored=0
192.168.10.52 : ok=49 changed=30 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

Troubleshooting

Check the kubelet status: it shows that kubelet has not started successfully.

Use journalctl -u kubelet --no-pager to view the startup error log:

Dec 07 23:50:21 iZ2vc2h2j9l2p8zqnwy6zoZ kubelet[24786]: E1207 23:50:21.347929   24786 remote_runtime.go:168] "Version from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
Dec 07 23:50:21 iZ2vc2h2j9l2p8zqnwy6zoZ kubelet[24786]: E1207 23:50:21.348041 24786 kuberuntime_manager.go:225] "Get runtime version failed" err="get remote runtime typed version failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
Dec 07 23:50:21 iZ2vc2h2j9l2p8zqnwy6zoZ kubelet[24786]: Error: failed to run Kubelet: failed to create kubelet: get remote runtime typed version failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService
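The full journal is usually noisy, so it can help to keep only the error lines. The following is a minimal sketch: the function name kubelet_errors is made up for illustration, and it assumes the line format seen in the log above, that is, a kubelet[PID]: prefix followed by either a klog error marker such as E1207 or a final Error: line.

```shell
#!/bin/sh
# Hypothetical helper: keep only kubelet error lines from journal output on stdin.
# klog error lines start with "E" plus the month and day (e.g. "E1207");
# the fatal line starts with "Error:" right after the "kubelet[PID]:" prefix.
kubelet_errors() {
    grep -E 'kubelet\[[0-9]+\]: (E[0-9]{4} |Error:)'
}

# Usage on a node (not run here):
#   journalctl -u kubelet --no-pager | kubelet_errors | tail -n 20
```

Informational lines (those starting with I plus the date) are dropped, which makes a repeated message such as the runtime.v1alpha2.RuntimeService error easier to spot.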

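According to the errors, the container runtime does not provide the CRI service that kubelet calls (runtime.v1alpha2.RuntimeService), so the problem most likely lies with containerd. Check the containerd status.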

This appears to be related to the setting disabled_plugins = ["cri"] in the configuration file /etc/containerd/config.toml. For details, see https://github.com/containerd/containerd/issues/4581
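To turn the suspicion above into a yes/no check, a small helper along these lines may be useful. It is only a sketch: the function name cri_disabled is made up for illustration, and it assumes the setting sits on a single uncommented line, so a multi-line TOML array would not be detected.

```shell
#!/bin/sh
# Hypothetical helper: report whether a containerd config disables the CRI plugin.
# Exit status: 0 = "cri" is listed in disabled_plugins, 1 = it is not,
#              2 = the file cannot be read.
# Assumption: the setting is on one uncommented line.
cri_disabled() {
    config="${1:-/etc/containerd/config.toml}"
    [ -r "$config" ] || return 2
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config"
}

# Usage on a node (not run here):
#   if cri_disabled; then echo "CRI plugin is disabled in config.toml"; fi
```

A commented-out line such as #disabled_plugins = ["cri"] is deliberately not counted, because containerd ignores it.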

Confirm the setting, then move the /etc/containerd/config.toml configuration file out of the way:

grep "disabled_plugins" /etc/containerd/config.toml
mv /etc/containerd/config.toml /tmp/
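If the file holds other settings worth keeping, a less drastic variant is to edit only the offending line. The sketch below is an alternative to the mv above, not what the original write-up did. The function name enable_cri is hypothetical, and it rewrites only the exact form disabled_plugins = ["cri"], leaving any other layout untouched.

```shell
#!/bin/sh
# Hypothetical helper: re-enable the CRI plugin in place instead of moving
# the whole file away. Only the exact form disabled_plugins = ["cri"] is
# rewritten to an empty list; a copy of the original is kept as <file>.bak.
enable_cri() {
    config="${1:-/etc/containerd/config.toml}"
    [ -w "$config" ] || return 2
    sed -i.bak -E 's/^([[:space:]]*disabled_plugins[[:space:]]*=[[:space:]]*)\["cri"]/\1[]/' "$config"
}

# Usage on a node (not run here):
#   enable_cri /etc/containerd/config.toml
```

As with moving the file, containerd has to be restarted before the change takes effect.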

After the restart, kubelet starts successfully.

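Root cause

The containerd configuration file /etc/containerd/config.toml contains disabled_plugins = ["cri"], which disables the CRI plugin, so kubelet cannot reach the runtime service it needs. See https://github.com/containerd/containerd/issues/4581

Solution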

mv /etc/containerd/config.toml /tmp
systemctl restart containerd
systemctl restart kubelet
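After the restart, the same check the Ansible task performs (systemctl is-active kubelet.service) can be repeated by hand. The sketch below is a generic retry helper. The name wait_until is hypothetical, the 4 attempts mirror the "attempts": 4 in the Ansible output, and the 5-second delay is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND exits 0, otherwise 1 after the last attempt.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only between attempts, not after the final one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage on a node (not run here):
#   wait_until 4 5 systemctl is-active kubelet.service && echo "kubelet is active"
```

systemctl is-active exits with 0 only when the unit is active, so a unit stuck in "activating" (rc 3 in the output above) keeps the loop going until the attempts are exhausted.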