Error description

The containerd service fails to start. Checking the log with the command journalctl -xe -u containerd shows the following:

[root@k8s-dev-node2 /]# journalctl -xe -u containerd
-- Unit containerd.service has begun starting up.
2月 09 15:19:31 k8s-dev-node2 systemd[1]: Started containerd container runtime.
-- Subject: Unit containerd.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit containerd.service has finished starting up.
--
-- The start-up result is done.
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.493003002+08:00" level=info msg="starting containerd" revision=5b46e404f6b9f661a205e28d59c982d3634148f8 version=v1.4.11
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.519295252+08:00" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.519366008+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520666359+08:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"modprobe: FATAL: Module aufs not
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520700319+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520858968+08:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs (xfs) must be a btrfs filesystem
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520876555+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520924961+08:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520939594+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.520962861+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.521050820+08:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.521197935+08:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/containerd/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used w
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.521213914+08:00" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.521245287+08:00" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: time="2023-02-09T15:19:31.521258310+08:00" level=info msg="metadata content store policy set" policy=shared
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: panic: invalid freelist page: 162, page type is leaf
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: goroutine 1 [running]:
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*freelist).read(0xc0003ba500, 0x7f406013d000)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/freelist.go:266 +0x30b
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*DB).loadFreelist.func1()
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:316 +0xd4
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: sync.(*Once).doSlow(0xc0000c6568, 0xc00062b390)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /usr/local/go/src/sync/once.go:68 +0xee
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: sync.(*Once).Do(...)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /usr/local/go/src/sync/once.go:59
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*DB).loadFreelist(0xc0000c6400)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:309 +0x6c
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.Open(0xc000399940, 0x3a, 0x1a4, 0x23e0aa0, 0xc0000d2ab8, 0x1, 0x1)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:286 +0x3af
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/services/server.LoadPlugins.func2(0xc0003ba480, 0xc00003fb00, 0x2, 0x2, 0x1852100)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/services/server/server.go:380 +0x885
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/plugin.(*Registration).Init(0xc00042a240, 0xc0003ba480, 0x1836e60)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/plugin/plugin.go:110 +0x3a
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/services/server.New(0x1a7b010, 0xc000122000, 0xc000001500, 0x1, 0x1, 0xc00041c030)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/services/server/server.go:168 +0xce5
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/cmd/containerd/command.App.func1(0xc0000fa160, 0x0, 0x0)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/cmd/containerd/command/main.go:177 +0x7cc
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/github.com/urfave/cli.HandleAction(0x1803080, 0x1a284a8, 0xc0000fa160, 0x0, 0x0)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:523 +0x107
2月 09 15:19:31 k8s-dev-node2 systemd[1]: containerd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/vendor/github.com/urfave/cli.(*App).Run(0xc00052a000, 0xc00012c000, 0x5, 0x5, 0x0, 0x0)
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: /tmp/tmp.Fbs9SsdNvu/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:285 +0x655
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: main.main()
2月 09 15:19:31 k8s-dev-node2 containerd[2317]: github.com/containerd/containerd/cmd/containerd/main.go:33 +0x51
2月 09 15:19:31 k8s-dev-node2 systemd[1]: Unit containerd.service entered failed state.
2月 09 15:19:31 k8s-dev-node2 systemd[1]: containerd.service failed.

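According to the log, the error "panic: invalid freelist page: 162, page type is leaf" is raised immediately after the message "metadata content store policy set", and the stack trace passes through bbolt.Open while the io.containerd.metadata.v1.bolt plugin is loading. This suggests that the meta.db file is probably corrupted.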
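
The same check can be scripted when several nodes need to be inspected. The following is only a minimal sketch: has_freelist_panic is a hypothetical helper name, and it assumes the log text is piped in on standard input.

```shell
# Hypothetical helper (not part of containerd): reads log text on stdin and
# succeeds (exit status 0) if it contains the bbolt freelist panic shown above.
has_freelist_panic() {
    grep -q 'panic: invalid freelist page'
}

# Usage on a real host (not executed here):
#   journalctl -u containerd --no-pager | has_freelist_panic \
#       && echo "meta.db is probably corrupted"
```

A match only indicates that the bolt database could not be opened; it does not say how the file became corrupted.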

Solution

1. Check the path of the containerd binary

[root@k8s-dev-node2 /]# systemctl cat containerd | grep ExecStart
ExecStartPre=/sbin/modprobe overlay
ExecStart=/usr/bin/containerd

2. Locate the meta.db file

[root@k8s-dev-node2 containerd]# find /var/lib/containerd -type f -size -5M -name '*.db*' | grep -v overlay
/var/lib/containerd/io.containerd.metadata.v1.bolt/meta.db
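
The -size -5M filter in the command above can hide a metadata file that has grown beyond a few MiB. As a hedged alternative, the sketch below drops the size filter and matches only names ending in .db, so earlier .bak copies are not listed. find_db_files is a hypothetical helper name, and the directory to search is passed as an argument.

```shell
# Hypothetical helper: list candidate database files under the directory
# given as $1, skipping any path that contains "overlay".
find_db_files() {
    find "$1" -type f -name '*.db' ! -path '*overlay*' | sort
}

# Usage on a real host (not executed here):
#   find_db_files /var/lib/containerd
```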

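3. Delete or rename the meta.db file (renaming keeps a backup), then restart the containerd service. containerd creates a new, empty meta.db on startup, so images and containers recorded in the old file are no longer listed and may need to be pulled or created again.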

[root@k8s-dev-node2 /]# mv /var/lib/containerd/io.containerd.metadata.v1.bolt/meta.db{,.bak}
[root@k8s-dev-node2 /]# systemctl restart containerd
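
Because meta.db{,.bak} expands to "meta.db meta.db.bak", running the mv command a second time silently overwrites the earlier backup. One way to avoid that is a timestamped backup name. This is a minimal sketch under that assumption; backup_db is a hypothetical helper, not a containerd tool.

```shell
# Hypothetical helper: move the database file given as $1 aside under a
# timestamped name and print the backup path. Nothing is deleted.
backup_db() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage on a real host (not executed here):
#   backup_db /var/lib/containerd/io.containerd.metadata.v1.bolt/meta.db
#   systemctl restart containerd
```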

4. Finally, confirm that the containerd service status is active (running)

[root@k8s-dev-node2 /]# systemctl status containerd
● containerd.service - containerd container runtime
   Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; vendor preset: disabled)
   Active: active (running) since 四 2023-02-09 15:28:32 CST; 5s ago
     Docs: https://containerd.io
  Process: 2454 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 2456 (containerd)
    Tasks: 14
   Memory: 22.0M
   CGroup: /system.slice/containerd.service
           └─2456 /usr/bin/containerd # containerd

With containerd running again, the docker service can now be started.


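If docker itself fails in the same way, the procedure is similar to the one for containerd: locate the db files, rename the suspect file to keep a backup, and restart the service, as in the following example: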

[root@k8s-dev-node1 /]# find /var/lib/docker -type f -size -5M -name '*.db*' | grep -v overlay
/var/lib/docker/volumes/metadata.db
/var/lib/docker/network/files/local-kv.db
/var/lib/docker/buildkit/containerdmeta.db
/var/lib/docker/buildkit/snapshots.db
/var/lib/docker/buildkit/metadata_v2.db
/var/lib/docker/buildkit/cache.db
[root@k8s-dev-node1 /]# mv /var/lib/docker/network/files/local-kv.db{,.bak}
[root@k8s-dev-node1 /]# systemctl restart docker

Try metadata.db and local-kv.db first; these are the two db files to focus on.

