Deployment scheduling
Pods managed by a Deployment or an RC (ReplicationController) are scheduled automatically: which node each replica ultimately runs on is decided entirely by the scheduler on the master node through a series of algorithms, and the user cannot intervene in the process or the result, so it is not demonstrated here.
NodeSelector: directed scheduling
In a real production environment we sometimes need a pod to run on a specific node. In that case we use directed scheduling. To make a pod run only on the node2 node, follow these steps:
Step 1: put a label on the node2 node
[root@master ~]# kubectl label node node2 app=release
You can check the labels with kubectl get nodes --show-labels
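Once the label is set, a quick way to confirm which nodes match it is to filter by label selector (a sketch, assuming the same label as above):

```shell
# List only the nodes that carry the app=release label
kubectl get nodes -l app=release
# Show the full label set of node2
kubectl get node node2 --show-labels
```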
Step 2: define the deploy.yaml file, using nodeSelector so that all 3 pods run on nodes labeled app=release
[root@master ~]# vim deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: release
  template:
    metadata:
      name: mypod
      namespace: default
      labels:
        app: release
    spec:
      nodeSelector:
        app: release
      containers:
      - name: mycontainer
        image: liwang7314/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
Step 3: apply the file to create the pods and observe which node they run on; all 3 pods end up on node2
[root@master ~]# kubectl create -f deploy.yaml
deployment.apps/myapp created
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-95ff9459c-8g6tc 1/1 Running 0 4s 10.244.2.8 node2 <none> <none>
myapp-95ff9459c-ghxxx 1/1 Running 0 4s 10.244.2.7 node2 <none> <none>
myapp-95ff9459c-s5pt9 1/1 Running 0 4s 10.244.2.6 node2 <none> <none>
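Note that nodeSelector is a hard constraint: if no node carries the required label, replacement pods stay Pending instead of falling back to another node. A hypothetical way to observe this on the same cluster (the label is restored at the end so the following sections still work):

```shell
# Sketch only: demonstrate that nodeSelector is a hard requirement
kubectl label node node2 app-             # remove the label (the "-" suffix deletes it)
kubectl delete pod -l app=release         # let the Deployment recreate the pods
kubectl get pods -o wide                  # the new pods are stuck in Pending
kubectl label node node2 app=release      # restore the label; the pods get scheduled
```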
NodeAffinity: node affinity scheduling
NodeAffinity is a node-affinity scheduling policy, a newer mechanism intended to replace NodeSelector. There are currently two ways to express node affinity:
- requiredDuringSchedulingIgnoredDuringExecution: the specified rules must be satisfied for the Pod to be scheduled onto a Node (similar in function to NodeSelector, but with different syntax); this is a hard constraint
- preferredDuringSchedulingIgnoredDuringExecution: the scheduler tries to place the Pod on a Node that satisfies the rules but does not insist on it; this is a soft constraint. Multiple preference rules can carry weight values to define their relative priority
We define an affinity.yaml file with 3 pods. A requiredDuringSchedulingIgnoredDuringExecution rule requires the pods to be scheduled onto node2, while a preferredDuringSchedulingIgnoredDuringExecution rule prefers NOT to schedule them onto node2, as follows:
Step 1: define the yaml file
[root@master ~]# cat affinity.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: release
  template:
    metadata:
      name: mypod
      namespace: default
      labels:
        app: release
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: In
                values:
                - release
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: app
                operator: NotIn
                values:
                - release
      containers:
      - name: mycontainer
        image: liwang7314/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
Step 2: create the pods, then observe where they land; all of them are scheduled onto node2, because the hard requirement takes precedence over the soft preference
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-6fcfb98879-566z5 1/1 Running 0 3m24s 10.244.2.10 node2 <none> <none>
myapp-6fcfb98879-5r6cm 1/1 Running 0 3m24s 10.244.2.11 node2 <none> <none>
myapp-6fcfb98879-7kwwq 1/1 Running 0 3m24s 10.244.2.9 node2 <none> <none>
If we remove the requiredDuringSchedulingIgnoredDuringExecution rule and check again, the pods are now scheduled onto node1, although node2 still gets one pod: preferredDuringSchedulingIgnoredDuringExecution is best-effort, not mandatory, so the result looks like this
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-55679db465-h5ftb 1/1 Running 0 11s 10.244.1.76 node1 <none> <none>
myapp-55679db465-kj58w 1/1 Running 0 11s 10.244.1.75 node1 <none> <none>
myapp-55679db465-mwfjq 1/1 Running 0 11s 10.244.2.12 node2 <none> <none>
Notes on setting NodeAffinity rules:
- If both nodeSelector and nodeAffinity are defined, both conditions must be satisfied for the Pod to run on the target Node
- If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is enough for scheduling to succeed
- If a nodeSelectorTerms entry contains multiple matchExpressions, a node must satisfy all of them for the Pod to run there
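The OR/AND semantics in the last two notes can be sketched in a single rule. The disktype=ssd label here is hypothetical, added only for illustration:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Term 1: a node must match BOTH expressions (AND within one term)
      - matchExpressions:
        - key: app
          operator: In
          values: ["release"]
        - key: disktype                # hypothetical label, for illustration
          operator: In
          values: ["ssd"]
      # Term 2: OR'ed with term 1 -- matching either term is enough
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: ["node2"]
```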
Taints and Tolerations
The NodeAffinity described above is an attribute defined on the Pod that lets the Pod be scheduled onto certain Nodes (as a preference or as a hard requirement). A Taint works the other way around: it lets a Node repel Pods.
Taints are used together with Tolerations to keep Pods away from unsuitable Nodes. Once one or more Taints are set on a Node, no Pod can run there unless it explicitly declares that it tolerates those taints. A Toleration is a Pod attribute that allows (note: allows, not forces) the Pod to run on a Node that carries a matching Taint.
A taint can be set as follows so that ordinary pods are no longer scheduled onto the node:
[root@master ~]# kubectl taint node node2 app=release:NoSchedule
node/node2 tainted
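Before declaring a toleration, it can be helpful to confirm which taints a node actually carries; a quick sketch:

```shell
# Print the taint summary from the node description
kubectl describe node node2 | grep -i taints
# Or query the taints field of the Node object directly
kubectl get node node2 -o jsonpath='{.spec.taints}'
```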
Then declare a Toleration on the Pod
[root@master ~]# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
  namespace: default
spec:
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "release"
    effect: "NoSchedule"
  containers:
  - name: mycontainer
    image: liwang7314/myapp:v1
    imagePullPolicy: IfNotPresent
Checking where the pod runs, we see it can still be scheduled onto node2, because it tolerates the taint
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mypod 1/1 Running 0 72s 10.244.2.14 node2 <none> <none>
In addition, we can put a NoExecute taint on a node. Pods whose tolerations do not cover the taint are then evicted, and new Pods will not be scheduled onto the node either. First, look at the pod currently on that node
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mypod 0/1 ContainerCreating 0 2s <none> node1 <none> <none>
Then add a NoExecute taint to that node and watch again: the pod has been evicted
[root@master ~]# kubectl taint node node1 app=release:NoExecute
node/node1 tainted
[root@master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mypod 1/1 Terminating 0 106s 10.244.1.82 node1 <none> <none>
[root@master ~]# kubectl get pods -o wide
No resources found in default namespace.
To delete a taint, just append a "-" to the key
[root@master ~]# kubectl taint node node1 app-
node/node1 untainted
Notes:
- operator can be Exists (no value needs to be specified)
- operator can be Equal, in which case the taint's value must match
- if operator is not specified, it defaults to Equal
- effect can be NoSchedule, PreferNoSchedule (a soft version that tries to avoid the node but does not guarantee it), or NoExecute
- multiple taints and tolerations can be defined
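Tying the notes above together, here is a sketch of a tolerations block that uses both operators (tolerationSeconds is only meaningful with the NoExecute effect):

```yaml
tolerations:
# Exists: tolerates any taint with key "app" and effect NoSchedule,
# whatever its value; no "value" field is needed
- key: "app"
  operator: "Exists"
  effect: "NoSchedule"
# Equal (the default operator): key, value, and effect must all match
- key: "app"
  operator: "Equal"
  value: "release"
  effect: "NoExecute"
  tolerationSeconds: 3600   # evicted 1 hour after the taint is added
```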