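Pod health can be checked with two kinds of probes: LivenessProbe and ReadinessProbe.
I. Probe types
livenessProbe: determines whether the container is alive (Running state). If the probe finds the container unhealthy, the kubelet kills the container and then handles it according to the restart policy. If a container defines no LivenessProbe, the kubelet treats its LivenessProbe result as always being success.
readinessProbe: determines whether the container has finished starting (Ready state) and can accept requests. If the probe fails, the Pod's status is updated to not ready, and the endpoint controller removes the endpoint of the Pod that holds this container from the Service's endpoints.
II. Three check methods for LivenessProbe:
The kubelet runs the LivenessProbe periodically to diagnose the container's health.
1. exec: runs a command inside the container; if the command's exit code is 0, the container is considered healthy.
2. tcpSocket: performs a TCP check against the container's IP address and port; if a TCP connection can be established, the container is considered healthy.
3. httpGet: issues an HTTP GET request to the container's IP address, port, and path; if the response status code is greater than or equal to 200 and less than 400, the container is considered healthy.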
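As a rough illustration of the three checks listed above, the following Python sketch mimics the success rule of each one. It is not how the kubelet is implemented; the function names (`exec_check`, `tcp_check`, `http_status_is_healthy`) are made up for this example.

```python
import socket
import subprocess
import sys


def exec_check(command):
    """exec: healthy when the command exits with status 0."""
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def tcp_check(host, port, timeout=1.0):
    """tcpSocket: healthy when a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_is_healthy(status_code):
    """httpGet: healthy when 200 <= status code < 400."""
    return 200 <= status_code < 400


if __name__ == "__main__":
    # Stand-in commands run through the current Python interpreter
    # (an assumption so the demo works without a container).
    print(exec_check([sys.executable, "-c", "raise SystemExit(0)"]))
    print(exec_check([sys.executable, "-c", "raise SystemExit(1)"]))
    print(http_status_is_healthy(404))
```

Keeping the HTTP rule as a pure function of the status code makes the boundary cases (199, 200, 399, 400) easy to check in isolation.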
III. Using the liveness probe with exec
[root@kub_master k8s]# mkdir healthy
[root@kub_master k8s]# cd healthy/
1. Create the pod resource
[root@kub_master healthy]# vim pod_nginx_exec.yaml
[root@kub_master healthy]# cat pod_nginx_exec.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
    - name: nginx
      image: 192.168.0.212:5000/nginx:1.13
      ports:
        - containerPort: 80
      args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
      livenessProbe:
        exec:
          command:
            - cat
            - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5
        timeoutSeconds: 1
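The probe runs the command "cat /tmp/healthy" to decide whether the container is working normally. The container creates /tmp/healthy at startup and deletes it 30s later. Probing begins 5s after the container starts and repeats every 5s, so once the file is gone the probe result is fail; after three consecutive failures (the #failure=3 value shown by kubectl describe) the kubelet kills the container and restarts it.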
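The timing can be worked through with a small Python model. This is an idealized sketch, not kubelet code: it assumes probes fire exactly at initialDelaySeconds and then every periodSeconds, that the file check fails from the 30s mark onward, and that the failure threshold is 3. The function name `first_restart_time` is made up for this example.

```python
def first_restart_time(initial_delay, period, failure_threshold, healthy_until):
    """Return the probe time (seconds after container start) at which the
    number of consecutive failures first reaches the threshold."""
    consecutive_failures = 0
    t = initial_delay
    while True:
        if t < healthy_until:
            # The file still exists, so the probe succeeds and resets the count.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return t
        t += period


# Values from pod_nginx_exec.yaml: delay 5s, period 5s, file removed at 30s.
print(first_restart_time(5, 5, 3, 30))  # prints 40
```

In the describe output the first container ran from 23:34:23 to 23:35:37, about 74s. That is consistent with this model if the kill decision is followed by the default 30s termination grace period (exit code 137 indicates SIGKILL), though that reading is an inference rather than something the output states directly.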
2. Test
[root@kub_master healthy]# kubectl create -f pod_nginx_exec.yaml
pod "liveness-exec" created
[root@kub_master healthy]# kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
busybox2                         1/1       Running   4          4h
liveness-exec                    1/1       Running   0          8s
mysql-ms870                      1/1       Running   0          3h
mysql-wp-3651026459-v31gc        1/1       Running   0          5h
myweb-5zpl5                      1/1       Running   0          3h
myweb-g09wf                      1/1       Running   0          3h
wp-deployment-3182043070-2jmkb   1/1       Running   0          5h
wp-deployment-3182043070-r7bmq   1/1       Running   0          5h
[root@kub_master healthy]# kubectl describe pod liveness-exec
Name: liveness-exec
Namespace: default
Node: 192.168.0.184/192.168.0.184
Start Time: Sat, 26 Sep 2020 23:34:22 +0800
Labels: <none>
Status: Running
IP: 172.16.46.4
Controllers: <none>
Containers:
nginx:
Container ID: docker://c12b52e50014546302596a2e235f2d701471e7cf681d993653123fe62ab0e746
Image: 192.168.0.212:5000/nginx:1.13
Image ID: docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
Port: 80/TCP
Args:
/bin/sh
-c
touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
State: Running
Started: Sat, 26 Sep 2020 23:34:23 +0800
Ready: True
Restart Count: 0
Liveness: exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
25s 25s 1 {default-scheduler } Normal Scheduled Successfully assigned liveness-exec to 192.168.0.184
25s 25s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulling pulling image "192.168.0.212:5000/nginx:1.13"
25s 25s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulled Successfully pulled image "192.168.0.212:5000/nginx:1.13"
24s 24s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id c12b52e50014; Security:[seccomp=unconfined]
24s 24s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id c12b52e50014
[root@kub_master healthy]# kubectl describe pod liveness-exec
Name: liveness-exec
Namespace: default
Node: 192.168.0.184/192.168.0.184
Start Time: Sat, 26 Sep 2020 23:34:22 +0800
Labels: <none>
Status: Running
IP: 172.16.46.4
Controllers: <none>
Containers:
nginx:
Container ID: docker://26ad03feb4f89e4a16d7a826adcbae2c433d200aa204d416e71a922472794172
Image: 192.168.0.212:5000/nginx:1.13
Image ID: docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
Port: 80/TCP
Args:
/bin/sh
-c
touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
State: Running
Started: Sat, 26 Sep 2020 23:35:37 +0800
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Sat, 26 Sep 2020 23:34:23 +0800
Finished: Sat, 26 Sep 2020 23:35:37 +0800
Ready: True
Restart Count: 1
Liveness: exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {default-scheduler } Normal Scheduled Successfully assigned liveness-exec to 192.168.0.184
1m 1m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulling pulling image "192.168.0.212:5000/nginx:1.13"
1m 1m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulled Successfully pulled image "192.168.0.212:5000/nginx:1.13"
1m 1m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id c12b52e50014; Security:[seccomp=unconfined]
1m 1m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id c12b52e50014
42s 32s 3 {kubelet 192.168.0.184} spec.containers{nginx} Warning Unhealthy Liveness probe failed: cat: /tmp/healthy: No such file or directory
2s 2s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Killing Killing container with docker id c12b52e50014: pod "liveness-exec_default(c00e9e7c-000d-11eb-8a8e-fa163e38ad0d)" container "nginx" is unhealthy, it will be killed and re-created.
2s 2s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulled Container image "192.168.0.212:5000/nginx:1.13" already present on machine
2s 2s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id 26ad03feb4f8; Security:[seccomp=unconfined]
2s 2s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id 26ad03feb4f8
[root@kub_master healthy]# kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
busybox2                         1/1       Running   4          4h
liveness-exec                    1/1       Running   1          1m
mysql-ms870                      1/1       Running   0          3h
mysql-wp-3651026459-v31gc        1/1       Running   0          5h
myweb-5zpl5                      1/1       Running   0          3h
myweb-g09wf                      1/1       Running   0          3h
wp-deployment-3182043070-2jmkb   1/1       Running   0          5h
wp-deployment-3182043070-r7bmq   1/1       Running   0          5h
IV. Using the liveness probe with httpGet
1. Create the pod resource
[root@kub_master healthy]# vim pod_nginx_httpGet.yaml
[root@kub_master healthy]# cat pod_nginx_httpGet.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget
spec:
  containers:
    - name: nginx
      image: 192.168.0.212:5000/nginx:1.13
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /index.html
          port: 80
        initialDelaySeconds: 3
        timeoutSeconds: 3
[root@kub_master healthy]# kubectl create -f pod_nginx_httpGet.yaml
pod "liveness-httpget" created
[root@kub_master healthy]# kubectl get pods -o wide
NAME               READY     STATUS    RESTARTS   AGE       IP            NODE
liveness-httpget   1/1       Running   0          11s       172.16.46.2   192.168.0.184
[root@kub_master healthy]# kubectl describe pod liveness-httpget
Name: liveness-httpget
Namespace: default
Node: 192.168.0.184/192.168.0.184
Start Time: Sat, 26 Sep 2020 23:48:31 +0800
Labels: <none>
Status: Running
IP: 172.16.46.2
Controllers: <none>
Containers:
nginx:
Container ID: docker://2212178360bc1615e54e4cd0f3901b6a61de29868759af7efd5b0bf2874ea97a
Image: 192.168.0.212:5000/nginx:1.13
Image ID: docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
Port: 80/TCP
State: Running
Started: Sat, 26 Sep 2020 23:48:31 +0800
Ready: True
Restart Count: 0
Liveness: http-get http://:80/index.html delay=3s timeout=3s period=10s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
48s 48s 1 {default-scheduler } Normal Scheduled Successfully assigned liveness-httpget to 192.168.0.184
48s 48s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulled Container image "192.168.0.212:5000/nginx:1.13" already present on machine
48s 48s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id 2212178360bc; Security:[seccomp=unconfined]
48s 48s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id 2212178360bc
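The kubelet periodically sends an HTTP GET request for /index.html to port 80 on the container's IP address to check the health of the application in the container.
2. Test
The output above shows that the container is healthy. Now enter the container, move the index.html file away, and see whether the container is still reported as healthy.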
[root@kub_master healthy]# kubectl exec -it liveness-httpget bash
root@liveness-httpget:/# ls /usr/share/nginx/html/
50x.html index.html
root@liveness-httpget:/# mv /usr/share/nginx/html/index.html /tmp
root@liveness-httpget:/# ls /usr/share/nginx/html/
50x.html
root@liveness-httpget:/# exit
exit
[root@kub_master healthy]# kubectl describe pod liveness-httpget
Name: liveness-httpget
Namespace: default
Node: 192.168.0.184/192.168.0.184
Start Time: Sat, 26 Sep 2020 23:48:31 +0800
Labels: <none>
Status: Running
IP: 172.16.46.2
Controllers: <none>
Containers:
nginx:
Container ID: docker://2e4358ea39f3967afd15988e9d731105b901e353e44f904769cacadf3910ec33
Image: 192.168.0.212:5000/nginx:1.13
Image ID: docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
Port: 80/TCP
State: Running
Started: Sat, 26 Sep 2020 23:55:51 +0800
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 26 Sep 2020 23:48:31 +0800
Finished: Sat, 26 Sep 2020 23:55:51 +0800
Ready: True
Restart Count: 1
Liveness: http-get http://:80/index.html delay=3s timeout=3s period=10s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
8m 8m 1 {default-scheduler } Normal Scheduled Successfully assigned liveness-httpget to 192.168.0.184
8m 8m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id 2212178360bc; Security:[seccomp=unconfined]
8m 8m 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id 2212178360bc
8m 45s 2 {kubelet 192.168.0.184} spec.containers{nginx} Normal Pulled Container image "192.168.0.212:5000/nginx:1.13" already present on machine
1m 45s 3 {kubelet 192.168.0.184} spec.containers{nginx} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 404
45s 45s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Killing Killing container with docker id 2212178360bc: pod "liveness-httpget_default(ba0c49a6-000f-11eb-8a8e-fa163e38ad0d)" container "nginx" is unhealthy, it will be killed and re-created.
45s 45s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Created Created container with docker id 2e4358ea39f3; Security:[seccomp=unconfined]
45s 45s 1 {kubelet 192.168.0.184} spec.containers{nginx} Normal Started Started container with docker id 2e4358ea39f3
[root@kub_master healthy]# kubectl get pods -o wide
NAME               READY     STATUS    RESTARTS   AGE       IP            NODE
liveness-httpget   1/1       Running   1          8m        172.16.46.2   192.168.0.184
V. Using the liveness probe with tcpSocket
1. Create the pod resource
[root@kub_master healthy]# vim pod_nginx_tcpSocket.yaml
[root@kub_master healthy]# cat pod_nginx_tcpSocket.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcpsocket
spec:
  containers:
    - name: nginx
      image: 192.168.0.212:5000/nginx:1.13
      ports:
        - containerPort: 80
      livenessProbe:
        tcpSocket:
          port: 80
        initialDelaySeconds: 3
        timeoutSeconds: 3
[root@kub_master healthy]# kubectl create -f pod_nginx_tcpSocket.yaml
pod "liveness-tcpsocket" created
[root@kub_master healthy]# kubectl get pods -o wide
NAME                 READY     STATUS    RESTARTS   AGE       IP            NODE
liveness-httpget     1/1       Running   1          14m       172.16.46.2   192.168.0.184
liveness-tcpsocket   1/1       Running   0          15s       172.16.81.3   192.168.0.212
[root@kub_master healthy]# kubectl describe pod liveness-tcpsocket
Name: liveness-tcpsocket
Namespace: default
Node: 192.168.0.212/192.168.0.212
Start Time: Sun, 27 Sep 2020 00:03:13 +0800
Labels: <none>
Status: Running
IP: 172.16.81.3
Controllers: <none>
Containers:
nginx:
Container ID: docker://edfb206ebf8036227b960e0bab346c55c065dae10e8c0f1b447ffef0b74fa01e
Image: 192.168.0.212:5000/nginx:1.13
Image ID: docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
Port: 80/TCP
State: Running
Started: Sun, 27 Sep 2020 00:03:13 +0800
Ready: True
Restart Count: 0
Liveness: tcp-socket :80 delay=3s timeout=3s period=10s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
39s 39s 1 {default-scheduler } Normal Scheduled Successfully assigned liveness-tcpsocket to 192.168.0.212
39s 39s 1 {kubelet 192.168.0.212} spec.containers{nginx} Normal Pulling pulling image "192.168.0.212:5000/nginx:1.13"
39s 39s 1 {kubelet 192.168.0.212} spec.containers{nginx} Normal Pulled Successfully pulled image "192.168.0.212:5000/nginx:1.13"
39s 39s 1 {kubelet 192.168.0.212} spec.containers{nginx} Normal Created Created container with docker id edfb206ebf80; Security:[seccomp=unconfined]
39s 39s 1 {kubelet 192.168.0.212} spec.containers{nginx} Normal Started Started container with docker id edfb206ebf80
2. Check port 80
[root@kub_master healthy]# curl -I 172.16.81.3:80
HTTP/1.1 200 OK
Server: nginx/1.13.12
Date: Sat, 26 Sep 2020 16:08:46 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT
Connection: keep-alive
ETag: "5acb8e45-264"
Accept-Ranges: bytes
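The kubelet performs the health check by opening a TCP connection to port 80 on the container's IP address.
3. Parameters
Every probe type accepts the initialDelaySeconds and timeoutSeconds parameters. Both are optional (when omitted they default to 0 and 1 respectively), and they mean the following:
initialDelaySeconds: how long to wait after the container starts before the first health check, in seconds.
timeoutSeconds: how long to wait for a response after the health-check request is sent, in seconds. A timeout counts as a failed check; for a liveness probe, once enough consecutive checks have failed the kubelet considers the container unable to serve and restarts it.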
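To see how these parameters combine with the values that were left unset in the manifests, here is a small Python sketch. The class name `ProbeSettings` and its `summary` method are hypothetical; the default values are the ones visible in the `kubectl describe` output earlier (period=10s, #success=1, #failure=3) plus the 0s delay and 1s timeout defaults.

```python
from dataclasses import dataclass


@dataclass
class ProbeSettings:
    """Probe timing parameters with the defaults assumed in this article."""
    initial_delay_seconds: int = 0
    timeout_seconds: int = 1
    period_seconds: int = 10
    success_threshold: int = 1
    failure_threshold: int = 3

    def summary(self):
        # Mirrors the layout of the "Liveness:" line printed by kubectl describe.
        return (
            f"delay={self.initial_delay_seconds}s "
            f"timeout={self.timeout_seconds}s "
            f"period={self.period_seconds}s "
            f"#success={self.success_threshold} "
            f"#failure={self.failure_threshold}"
        )


# The httpGet and tcpSocket examples set only the delay and the timeout.
print(ProbeSettings(initial_delay_seconds=3, timeout_seconds=3).summary())
```

For the httpGet and tcpSocket pods this reproduces the `delay=3s timeout=3s period=10s #success=1 #failure=3` string from the describe output.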
VI. Using the readiness probe with httpGet
1. Create an RC and its corresponding Service
[root@kub_master healthy]# vim rc_nginx_readiness.yaml
[root@kub_master healthy]# cat rc_nginx_readiness.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: readiness-httpget
spec:
  replicas: 2
  selector:
    app: readiness
  template:
    metadata:
      labels:
        app: readiness
    spec:
      containers:
        - name: readiness
          image: 192.168.0.212:5000/nginx:1.13
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /test.html
              port: 80
            initialDelaySeconds: 3
            timeoutSeconds: 3
[root@kub_master healthy]# kubectl create -f rc_nginx_readiness.yaml
replicationcontroller "readiness-httpget" created
[root@kub_master healthy]# kubectl get rc -o wide
NAME                DESIRED   CURRENT   READY     AGE       CONTAINER(S)   IMAGE(S)                        SELECTOR
readiness-httpget   2         2         0         9s        readiness      192.168.0.212:5000/nginx:1.13   app=readiness
[root@kub_master healthy]# kubectl expose rc readiness-httpget --port=80
service "readiness-httpget" exposed
[root@kub_master healthy]# kubectl get svc readiness-httpget
NAME                CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
readiness-httpget   192.168.99.82   <none>        80/TCP    22s
2. Inspect the Service details
[root@kub_master healthy]# kubectl describe svc readiness-httpget
Name: readiness-httpget
Namespace: default
Labels: app=readiness
Selector: app=readiness
Type: ClusterIP
IP: 192.168.99.82
Port: <unset> 80/TCP
Endpoints:
Session Affinity: None
No events.
[root@kub_master healthy]# kubectl get pods -o wide
NAME                      READY     STATUS    RESTARTS   AGE       IP            NODE
liveness-httpget          1/1       Running   1          40m       172.16.46.2   192.168.0.184
liveness-tcpsocket        1/1       Running   0          25m       172.16.81.3   192.168.0.212
readiness-httpget-5nlxb   0/1       Running   0          3m        172.16.81.4   192.168.0.212
readiness-httpget-xbsv7   0/1       Running   0          3m        172.16.46.3   192.168.0.184
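The readiness-httpget Service has no backend endpoints even though 2 pods were created: /test.html does not exist yet, so the readiness probe fails and both pods show READY 0/1.
# Enter one of the pod containers and create the page that the probe requests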
[root@kub_master healthy]# kubectl exec -it readiness-httpget-xbsv7 bash
root@readiness-httpget-xbsv7:/usr/share/nginx/html# echo "ok">test.html
root@readiness-httpget-xbsv7:/usr/share/nginx/html# exit
exit
# Check the svc again: the backend endpoint IP now appears
[root@kub_master healthy]# kubectl describe svc readiness-httpget
Name: readiness-httpget
Namespace: default
Labels: app=readiness
Selector: app=readiness
Type: ClusterIP
IP: 192.168.99.82
Port: <unset> 80/TCP
Endpoints: 172.16.46.3:80
Session Affinity: None
No events.
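The effect seen in this section, where only the pod that passes its readiness probe shows up under Endpoints, can be modeled in a few lines of Python. This is a simplified sketch of the selection rule, not the endpoint controller's code; the `Pod` class and `service_endpoints` function are hypothetical names, and the pod data is copied from the output above.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict
    ready: bool


def service_endpoints(pods, selector, port):
    """Return "ip:port" for every pod that matches the selector and is ready."""
    return [
        f"{pod.ip}:{port}"
        for pod in pods
        if pod.ready
        and all(pod.labels.get(key) == value for key, value in selector.items())
    ]


# State after test.html was created only in readiness-httpget-xbsv7.
pods = [
    Pod("readiness-httpget-5nlxb", "172.16.81.4", {"app": "readiness"}, ready=False),
    Pod("readiness-httpget-xbsv7", "172.16.46.3", {"app": "readiness"}, ready=True),
    Pod("liveness-httpget", "172.16.46.2", {}, ready=True),
]
print(service_endpoints(pods, {"app": "readiness"}, 80))  # ['172.16.46.3:80']
```

The second pod, readiness-httpget-5nlxb, stays out of the list until /test.html exists in its container as well.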