1. Background


By default, which node a Pod runs on is decided by the Scheduler component using its scheduling algorithms, and this process is not normally meant to be controlled manually. In real environments, however, that is often not enough: it is common to need certain Pods to run on certain nodes, which requires understanding how Kubernetes schedules Pods. Kubernetes currently provides four scheduling approaches:

  • Automatic scheduling: the target node is decided entirely by the Scheduler through a series of calculations;
  • Directed scheduling: NodeName, NodeSelector;
  • Affinity scheduling: NodeAffinity, PodAffinity, PodAntiAffinity;
  • Taints and tolerations: Taints, Tolerations

2. Directed Scheduling


Directed scheduling means declaring nodeName or nodeSelector on a Pod to place it on the desired node. Note that this scheduling is mandatory: even if the target node does not exist, scheduling still "succeeds", and the Pod simply fails to run;

2.1 NodeName

NodeName forcibly binds a Pod to the node with the given name. This approach skips the Scheduler's logic entirely: the Pod is written directly into the node's Pod list. The match is a hard constraint;

#Create the YAML file
[root@master ~]# vim pod-nodename.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodename
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
  nodeName: node2.kubernetes #schedule to node2
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-nodename.yaml
pod/pod-nodename created

#Check which node the Pod was scheduled to
[root@master ~]# kubectl get pods -n dev -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-nodename 1/1 Running 0 64s 10.244.80.81 node2.kubernetes <none> <none>

2.2 NodeSelector

NodeSelector schedules a Pod onto nodes carrying specific labels. It relies on the Kubernetes label-selector mechanism: before the Pod is created, the Scheduler uses the MatchNodeSelector policy to match labels, find the target node, and schedule the Pod there. This match is also a hard constraint;

#Label each node
[root@master ~]# kubectl label nodes node1.kubernetes nodeenv=pro
node/node1.kubernetes labeled
[root@master ~]# kubectl label nodes node2.kubernetes nodeenv=test
node/node2.kubernetes labeled

#Check the labels
[root@master ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master.kubernetes Ready control-plane,master 13d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master.kubernetes,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
node1.kubernetes Ready <none> 13d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1.kubernetes,kubernetes.io/os=linux,nodeenv=pro
node2.kubernetes Ready <none> 13d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2.kubernetes,kubernetes.io/os=linux,nodeenv=test
#Create the YAML file
[root@master ~]# vim pod-nodeselector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeselector
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
  nodeSelector:
    nodeenv: pro #schedule to a node labeled nodeenv=pro
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-nodeselector.yaml
pod/pod-nodeselector created

#Verify the Pod was scheduled to node1 (labeled nodeenv=pro)
[root@master ~]# kubectl get pods -n dev -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-nodeselector 1/1 Running 0 64s 10.244.80.81 node1.kubernetes <none> <none>

#Change the selector label to a value no node has
[root@master ~]# kubectl delete -f pod-nodeselector.yaml
pod "pod-nodeselector" deleted

[root@master ~]# cat pod-nodeselector.yaml | grep nodeenv
nodeenv: pro1
[root@master ~]# kubectl apply -f pod-nodeselector.yaml
pod/pod-nodeselector created

#Check whether the Pod's status is Running
[root@master ~]# kubectl get pod -n dev
NAME READY STATUS RESTARTS AGE
pod-nodeselector 0/1 Pending 0 9s

#Check the failure reason
[root@master ~]# kubectl describe pod pod-nodeselector -n dev
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 41s default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't match Pod's node affinity/selector.
#By default the scheduler counts the master as a node too, hence "0/3 nodes are available"

3. Affinity Scheduling


The two directed-scheduling methods above are very convenient, but they share a limitation: if no node satisfies the condition, the Pod will not run, even when usable nodes remain in the cluster, which restricts where they can be applied. To address this, Kubernetes also provides affinity scheduling (Affinity). It extends NodeSelector: through configuration, scheduling can prefer nodes that satisfy the conditions, and if none exist, fall back to nodes that do not, making scheduling more flexible;

Affinity comes in three kinds:

  • nodeAffinity (node affinity): targets nodes; decides which nodes a Pod may be scheduled to;
  • podAffinity (pod affinity): targets Pods; decides which existing Pods a new Pod may share a topology domain with;
  • podAntiAffinity (pod anti-affinity): targets Pods; decides which existing Pods a new Pod must not share a topology domain with;

Usage falls into two patterns:

  • Affinity: if two applications interact frequently, affinity keeps them as close together as possible, reducing the performance cost of network communication;
  • Anti-affinity: when an application runs multiple replicas, anti-affinity spreads the instances across nodes, improving the service's availability;
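The anti-affinity pattern above can be sketched with a Deployment whose replicas repel each other; the name and app label below are illustrative, not from this tutorial's environment. Each replica may only land on a node where no other Pod with the same label is running:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  #illustrative name
  namespace: dev
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web             #each replica carries this label...
    spec:
      containers:
      - name: nginx
        image: nginx:1.17.1
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app     #...and repels Pods carrying the same label
                operator: In
                values: ["web"]
            topologyKey: kubernetes.io/hostname #so at most one replica per node
```

With the hard limit used here, a fourth replica on a three-node cluster would stay Pending; a soft (preferred) rule would allow it to double up instead.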

3.1 NodeAffinity

The configurable fields of nodeAffinity are as follows:

spec:
  affinity:                  #affinity settings
    nodeAffinity:            #node affinity
      requiredDuringSchedulingIgnoredDuringExecution: #hard limit: a node must match these rules to be eligible
        nodeSelectorTerms:   #list of node selector terms
        - matchFields:       #selector requirements expressed against node fields
          - key:             #the key
            values:          #the values (for operators that take no value, only key and operator are set)
            operator: In,NotIn,Exists,DoesNotExist,Gt,Lt #operator: in / not in / exists / does not exist / greater than / less than
          matchExpressions:  #selector requirements expressed against node labels (recommended)
          - key:
            values:
            operator: In,NotIn,Exists,DoesNotExist,Gt,Lt
      preferredDuringSchedulingIgnoredDuringExecution: #soft limit: prefer nodes matching these rules (recommended)
      - preference:          #a node selector term, paired with a weight
          matchFields:
          matchExpressions:
          - key:
            values:
            operator: In,NotIn,Exists,DoesNotExist,Gt,Lt
        weight:              #weight, 1-100, ranks the matching nodes
    podAffinity:             #pod affinity
    podAntiAffinity:         #pod anti-affinity

Operator usage examples:

matchExpressions:
- key: nodeenv
  operator: Exists #matches all nodes that have a label with key nodeenv
- key: nodeenv
  operator: In
  values: ["xxx","yyy"] #matches nodes whose nodeenv label is xxx or yyy
- key: nodeenv
  operator: Gt
  values: ["xxx"] #matches nodes whose nodeenv label value is greater than xxx

3.1.1 Required (hard limit)

#Create the YAML file
[root@master ~]# cat pod-node-affinity-required.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-required
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: #hard limit
        nodeSelectorTerms:
        - matchExpressions: #match nodes whose nodeenv label is xxx or yyy
          - key: nodeenv
            operator: In
            values: ["xxx","yyy"]
#Apply the YAML file
[root@master ~]# kubectl create -f pod-node-affinity-required.yaml
pod/pod-node-affinity-required created

#Check the Pod's creation status
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-node-affinity-required 0/1 Pending 0 64s <none> <none> <none> <none>

#The Pod is Pending; check the warning events
[root@master ~]# kubectl describe pod pod-node-affinity-required -n dev
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 28s (x3 over 2m37s) default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't match Pod's node affinity/selector. #scheduling failed because no node matched nodeenv in (xxx, yyy), and this is a hard limit

#Delete the Pod
[root@master ~]# kubectl delete -f pod-node-affinity-required.yaml
pod "pod-node-affinity-required" deleted
#Label node1 and node2 with pro and test respectively
[root@master ~]# kubectl label node node1.k8s nodeenv=pro
node/node1.k8s labeled
[root@master ~]# kubectl label node node2.k8s nodeenv=test
node/node2.k8s labeled

#Check the node labels
[root@master ~]# kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master.k8s Ready control-plane,master 22d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master.k8s,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
node1.k8s Ready <none> 22d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1.k8s,kubernetes.io/os=linux,nodeenv=pro
node2.k8s Ready <none> 22d v1.23.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2.k8s,kubernetes.io/os=linux,nodeenv=test

#Change values: ["xxx","yyy"] to values: ["pro","yyy"]
[root@master ~]# cat $_ | grep -w values
values: ["pro","yyy"]

#Apply the YAML file again
[root@master ~]# kubectl create -f pod-node-affinity-required.yaml
pod/pod-node-affinity-required created

#Verify the Pod landed on node1 (labeled nodeenv=pro)
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-node-affinity-required 1/1 Running 0 2m 10.244.112.20 node1.k8s <none> <none>

3.1.2 Preferred (soft limit)

#Create the YAML file
[root@master ~]# cat pod-node-affinity-preferred.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-preferred
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution: #soft limit
      - preference:
          matchExpressions: #prefer nodes whose nodeenv label is xxx or yyy (no such node exists here)
          - key: nodeenv
            operator: In
            values: ["xxx","yyy"]
        weight: 1
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-node-affinity-preferred.yaml
pod/pod-node-affinity-preferred created

#Check the Pod's status (no node matches, but the soft limit lets it schedule anyway)
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-node-affinity-preferred 1/1 Running 0 52s 10.244.112.21 node1.k8s <none> <none>

3.1.3 Notes

  • If both nodeSelector (directed scheduling) and nodeAffinity are defined, both conditions must be satisfied for the Pod to run on a node;
  • If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is enough;
  • If a single nodeSelectorTerms entry contains multiple matchExpressions, a node must satisfy all of them to match;
  • If the labels of the node a Pod is running on change so that the Pod's node affinity is no longer satisfied, the system ignores the change and the Pod keeps running;
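The first two rules can be sketched in one Pod; the disktype label and second term below are illustrative, not part of this tutorial's environment. The nodeSelector and the nodeAffinity must both hold, while inside nodeAffinity matching either nodeSelectorTerms entry suffices:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-combined-sketch   #illustrative name
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  nodeSelector:
    nodeenv: pro              #condition 1: must always hold...
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:    #condition 2: ...and either term below must also hold
        - matchExpressions:
          - key: disktype     #illustrative label
            operator: In
            values: ["ssd"]
        - matchExpressions:   #terms are ORed; expressions inside one term are ANDed
          - key: nodeenv
            operator: In
            values: ["pro","test"]
```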

3.2 PodAffinity

podAffinity uses running Pods as the reference point, placing newly created Pods in the same topology domain as the reference Pods;
The configurable fields of podAffinity are as follows:

spec:
  affinity:                  #affinity settings
    nodeAffinity:            #node affinity
    podAffinity:             #pod affinity
      requiredDuringSchedulingIgnoredDuringExecution: #hard limit
      - namespaces:          #namespaces of the reference Pods
        topologyKey:         #scheduling scope (topology domain)
        labelSelector:       #label selector
          matchExpressions:  #selector requirements expressed as label expressions (recommended)
          - key:             #the key
            values:          #the values (for operators that take no value, only key and operator are set)
            operator: In,NotIn,Exists,DoesNotExist #operator
          matchLabels:       #map form equivalent to multiple matchExpressions
      preferredDuringSchedulingIgnoredDuringExecution: #soft limit (recommended)
      - podAffinityTerm:     #a pod affinity term, paired with a weight
          namespaces:
          topologyKey:
          labelSelector:
            matchExpressions:
            - key:
              values:
              operator:
            matchLabels:
        weight:              #weight, 1-100
    podAntiAffinity:         #pod anti-affinity

topologyKey defines the scope (topology domain) used during scheduling, for example:

  • kubernetes.io/hostname: the domain is an individual node;
  • beta.kubernetes.io/os: the domain is the node's operating system type;
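Any node label can serve as the topology key. As a sketch: on clusters whose nodes carry the well-known topology.kubernetes.io/zone label (cloud providers usually set it; the nodes in this tutorial do not have it, and the app label is illustrative), the fragment below co-locates Pods per availability zone rather than per node:

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: cache                           #illustrative label of the reference Pods
      topologyKey: topology.kubernetes.io/zone #same zone, not necessarily same node
```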

3.2.1 Test environment setup

The reference Pod below exists only to support the podAffinity and podAntiAffinity experiments; it has no other purpose;

#Create a reference Pod labeled podenv=pro
[root@master ~]# cat pod-pod-affinity-target.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-pod-affinity-target
  namespace: dev
  labels:
    podenv: pro #set the label
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  nodeName: node1.k8s #pin the Pod to node1
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-pod-affinity-target.yaml
pod/pod-pod-affinity-target created

#Check the Pod's node and labels
[root@master ~]# kubectl get pod -n dev --show-labels -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
pod-pod-affinity-target 1/1 Running 0 65s 10.244.112.23 node1.k8s <none> <none> podenv=pro

3.2.2 Required (hard limit)

#Create the YAML file
[root@master ~]# cat pod-pod-affinity-required.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-pod-affinity-required
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: #hard limit
      - labelSelector: #label selector
          matchExpressions: #select reference Pods by label
          - key: podenv #match Pods whose podenv label is xxx or yyy
            operator: In
            values: ["xxx","yyy"]
        topologyKey: kubernetes.io/hostname #the domain is an individual node
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-pod-affinity-required.yaml
pod/pod-pod-affinity-required created

#Check the Pod's status
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-pod-affinity-required 0/1 Pending 0 65s <none> <none> <none> <none>

#Check the warning events
[root@master ~]# kubectl describe pod pod-pod-affinity-required -n dev
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 88s default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't match pod affinity rules.

#Delete the Pod
[root@master ~]# kubectl delete -f pod-pod-affinity-required.yaml
pod "pod-pod-affinity-required" deleted
#Change values: ["xxx","yyy"] in the YAML file to values: ["pro","yyy"]
[root@master ~]# cat $_ | grep -w values
values: ["pro","yyy"]
[root@master ~]#

#Apply the YAML file again
[root@master ~]# kubectl apply -f pod-pod-affinity-required.yaml
pod/pod-pod-affinity-required created

#Check the result
[root@master ~]# kubectl get pod -n dev --show-labels -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
pod-pod-affinity-required 1/1 Running 0 14s 10.244.112.24 node1.k8s <none> <none> <none>
pod-pod-affinity-target 1/1 Running 0 2m30s 10.244.112.23 node1.k8s <none> <none> podenv=pro

#Delete the hard-limit Pod
[root@master ~]# kubectl delete -f pod-pod-affinity-required.yaml
pod "pod-pod-affinity-required" deleted

3.2.3 Preferred (soft limit)

#Create the YAML file
[root@master ~]# cat pod-pod-affinity-preferred.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-pod-affinity-preferred
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution: #soft limit
      - podAffinityTerm:
          topologyKey: kubernetes.io/hostname #the domain is an individual node
          labelSelector:
            matchExpressions: #select reference Pods by label
            - key: podenv #prefer nodes running Pods whose podenv label is pod or yyy
              operator: In
              values: ["pod","yyy"]
        weight: 1
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-pod-affinity-preferred.yaml
pod/pod-pod-affinity-preferred created

#Check the Pod's status
[root@master ~]# kubectl get pod -n dev -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-pod-affinity-preferred 1/1 Running 0 95s 10.244.112.25 node1.k8s <none> <none>

3.3 PodAntiAffinity

PodAntiAffinity takes the same configuration as PodAffinity, so the parameters are not repeated here;
(with anti-affinity, the new Pod must be placed on a different node than the matched Pods)

3.3.1 Required 硬限制

#创建YAML文件
[root@master ~]# cat pod-podantiaffinity-required.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-podantiaffinity-required
namespace: dev
spec:
containers:
- name: nginx
image: nginx:1.17.1
affinity:
podAntiAffinity: #反亲和性
requiredDuringSchedulingIgnoredDuringExecution: #硬限制
- labelSelector:
matchExpressions: #匹配key为podenv,值为pro的pod
- key: podenv
operator: In
values: ["pro"]
topologyKey: kubernetes.io/hostname
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-podantiaffinity-required.yaml
pod/pod-podantiaffinity-required created

#Verify the Pod was scheduled to node2
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-pod-affinity-target 1/1 Running 0 6m6s 10.244.112.26 node1.k8s <none> <none>
pod-podantiaffinity-required 1/1 Running 0 85s 10.244.166.146 node2.k8s <none> <none>

3.3.2 Preferred (soft limit)

#Create the YAML file
[root@master ~]# cat pod-podantiaffinity-preferred.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podantiaffinity-preferred
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  affinity:
    podAntiAffinity: #anti-affinity
      preferredDuringSchedulingIgnoredDuringExecution: #soft limit
      - podAffinityTerm:
          topologyKey: kubernetes.io/hostname #the domain is an individual node
          labelSelector:
            matchExpressions: #avoid nodes running the matched Pods
            - key: podenv #match Pods whose podenv label is pro or yyy
              operator: In
              values: ["pro","yyy"]
        weight: 1
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-podantiaffinity-preferred.yaml
pod/pod-podantiaffinity-preferred created

#Verify the Pod was scheduled to node2
[root@master ~]# kubectl get pod -n dev -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-pod-affinity-target 1/1 Running 0 12m 10.244.112.26 node1.k8s <none> <none>
pod-podantiaffinity-preferred 1/1 Running 0 45s 10.244.166.147 node2.k8s <none> <none>

4. Taints and Tolerations


The warning messages earlier mentioned a taint; this chapter explains what taints and tolerations do in Kubernetes;

Message:0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't match pod affinity rules.

By default, Kubernetes places a NoSchedule taint on the master node, so once the cluster is created the master no longer accepts Pod scheduling;

[root@master ~]# kubectl describe node master | grep -w Taints
Taints: node-role.kubernetes.io/master:NoSchedule

4.1 Taints

The scheduling methods so far all take the Pod's point of view: attributes added to the Pod decide whether it may be scheduled to a given node. We can also take the node's point of view: by adding taint attributes to a node, the node decides whether to accept Pods;
Once a node is tainted, a mutually repelling relationship exists between it and Pods: the node refuses to schedule Pods, and can even evict Pods that already exist on it. A taint is written as
key=value:effect

key and value are the taint's label; effect describes what the taint does, with three options:

  • PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a node with this taint, unless no other node can take them;
  • NoSchedule: Kubernetes will not schedule Pods onto a node with this taint, but Pods already on the node are unaffected;
  • NoExecute: Kubernetes will not schedule Pods onto a node with this taint, and also evicts Pods already running on it;

Taint command syntax:

#Set a taint
[root@master ~]# kubectl taint nodes node1 key=value:effect
#Remove a taint with a specific effect
[root@master ~]# kubectl taint nodes node1 key:effect-
#Remove all taints with a given key
[root@master ~]# kubectl taint nodes node1 key-

4.1.1 Taint demonstration

Prepare node1 (for a clearer effect, node2 is shut down);
Taint node1 with tag=test:PreferNoSchedule and create Pod1; change the taint to tag=test:NoSchedule and create Pod2; change it to tag=test:NoExecute and create Pod3;

#Prepare the environment (node2 shut down)
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master.k8s Ready control-plane,master 23d v1.23.0
node1.k8s Ready <none> 23d v1.23.0
node2.k8s NotReady <none> 23d v1.23.0

#Taint node1 with PreferNoSchedule
[root@master ~]# kubectl taint nodes node1.k8s tag=test:PreferNoSchedule
node/node1.k8s tainted
[root@master ~]# kubectl describe node node1.k8s | grep Taint
Taints: tag=test:PreferNoSchedule

#Create Pod1 (still scheduled to node1: PreferNoSchedule only discourages it)
[root@master ~]# kubectl run taint1 --image=nginx:1.17.1 -n dev
pod/taint1 created
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
taint1 1/1 Running 0 10s 10.244.112.28 node1.k8s <none> <none>

#Change node1's taint to NoSchedule
[root@master ~]# kubectl taint nodes node1.k8s tag:PreferNoSchedule-
node/node1.k8s untainted
[root@master ~]# kubectl taint nodes node1.k8s tag=test:NoSchedule
node/node1.k8s tainted

#Create Pod2 (it stays Pending; Pod1 keeps running)
[root@master ~]# kubectl run taint2 --image=nginx:1.17.1 -n dev
pod/taint2 created
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
taint1 1/1 Running 0 2m54s 10.244.112.28 node1.k8s <none> <none>
taint2 0/1 Pending 0 9s <none> <none> <none> <none>

#Change node1's taint to NoExecute
[root@master ~]# kubectl taint nodes node1.k8s tag:NoSchedule-
node/node1.k8s untainted
[root@master ~]# kubectl taint nodes node1.k8s tag=test:NoExecute
node/node1.k8s tainted

#Create Pod3 (the existing Pods are evicted and nothing new can run on node1)
[root@master ~]# kubectl run taint3 --image=nginx:1.17.1 -n dev
pod/taint3 created
[root@master ~]# kubectl get pod -n dev -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
taint1 0/1 Pending 0 55s <none> <none> <none> <none>
taint2 0/1 Pending 0 55s <none> <none> <none> <none>
taint3 0/1 Pending 0 12s <none> <none> <none> <none>

4.2 Tolerations

Taints, as shown above, let a node refuse Pods. But what if a Pod must be scheduled onto a tainted node anyway? That is what tolerations are for;
Toleration fields:

[root@master ~]# kubectl explain pod.spec.tolerations
KIND: Pod
VERSION: v1
FIELDS:
effect <string> #the taint effect to tolerate
key <string> #the key of the taint to tolerate
operator <string> #key-value operator; supports Equal (the default) and Exists
tolerationSeconds <integer> #toleration period; only meaningful with effect NoExecute: how long the Pod may stay on the node
value <string> #the value of the taint to tolerate
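As a sketch of the Exists operator (the demonstration below uses Equal instead): with Exists, value must be omitted, and the toleration matches any taint carrying that key:

```yaml
tolerations:
- key: "tag"
  operator: "Exists" #matches a taint with key "tag" and any value
  effect: "NoExecute" #omitting effect entirely would match every effect for this key
```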

4.2.1 Toleration demonstration

Prepare a node tainted with NoExecute; the environment from the previous experiment can be reused as-is, leaving node1 tainted with NoExecute;

#Confirm node1's taint is NoExecute
[root@master ~]# kubectl describe nodes node1.k8s | grep Taint
Taints: tag=test:NoExecute

#Create the YAML file
[root@master ~]# cat pod-toleration.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-toleration
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  tolerations: #add a toleration
  - key: "tag" #taint key to match
    operator: "Equal" #operator: equals
    value: "test" #taint value to match
    effect: "NoExecute" #must match the taint's effect
    tolerationSeconds: 300 #toleration period, 300s
#Apply the YAML file
[root@master ~]# kubectl apply -f pod-toleration.yaml
pod/pod-toleration created

#Watch the Pod: it runs on the tainted node, then is evicted once the 300s toleration expires
[root@master ~]# kubectl get pods -n dev -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-toleration 1/1 Running 0 52s 10.244.112.31 node1.k8s <none> <none>
pod-toleration 1/1 Terminating 0 5m 10.244.112.31 node1.k8s <none> <none>