K8S in Practice: Cluster Deployment on CentOS 7
Cluster architecture
A Kubernetes cluster is split into two roles:
Master node: etcd, api-server, scheduler, controller-manager
Node (worker): kubelet, kube-proxy
etcd: the cluster datastore, holding all cluster state
api-server: the core service; every other component talks to the cluster through it
controller-manager: runs the controllers, e.g. the ReplicationController (rc)
scheduler: picks a suitable node for each newly created Pod
kubelet: drives Docker to create containers on its node
kube-proxy: exposes services to external users and provides load balancing to Pods inside the cluster
## 1. Environment planning

| Node   | IP address    |
| ------ | ------------- |
| master | 10.200.18.100 |
| Node1  | 10.200.18.101 |
| Node2  | 10.200.18.102 |

Operating system version:
[root@kmaster ~]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
Add the following entries to /etc/hosts on every node:
10.200.18.100 kmaster1
10.200.18.101 knode1
10.200.18.102 knode2
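The three entries above can be appended with a short idempotent loop, so running it twice never duplicates lines. This is a sketch: it writes to a local demo file by default; on the real nodes you would set HOSTS_FILE=/etc/hosts.

```shell
# Sketch: append each host entry only if it is not already present.
# HOSTS_FILE points at a local demo file here; use /etc/hosts on real nodes.
HOSTS_FILE="${HOSTS_FILE:-./hosts.demo}"
while read -r entry; do
  grep -qxF "$entry" "$HOSTS_FILE" 2>/dev/null || echo "$entry" >> "$HOSTS_FILE"
done <<'EOF'
10.200.18.100 kmaster1
10.200.18.101 knode1
10.200.18.102 knode2
EOF
```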
## 2. Initialization
Run these steps on the master and on every node.
# Disable the firewall
systemctl disable firewalld
systemctl stop firewalld
# Disable SELinux for the current boot
setenforce 0
# Disable it permanently
sed -i "s/SELINUX=enforcing/SELINUX=disabled/" /etc/sysconfig/selinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
# Disable swap
swapoff -a
# Disable it permanently (comment out the swap entries in /etc/fstab)
sed -i "s/.*swap.*/#&/" /etc/fstab
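The `sed` expression above comments out every fstab line that mentions swap. A safe way to preview its effect is on a copy of the file; the sample entries below are illustrative, not from the original:

```shell
# Preview the swap-disabling sed on a sample fstab copy before touching
# the real /etc/fstab. The '#&' replacement prefixes the whole matched
# line with '#', i.e. comments it out.
cat > ./fstab.demo <<'EOF'
/dev/mapper/centos-root /    xfs  defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
EOF
sed -i "s/.*swap.*/#&/" ./fstab.demo
cat ./fstab.demo
```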
# Set the kernel parameters Kubernetes networking requires
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Reload the sysctl configuration
sysctl --system
# Configure the Aliyun Kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Refresh the yum cache
yum clean all -y && yum makecache -y && yum repolist -y
## 3. Install Docker
See XXX.
Docker must be installed on the master and on every node.
## 4. Deploy the master node
On the master node, install kubectl, kubeadm, and kubelet (flannel, the network plugin, is deployed afterwards):
yum -y install kubectl kubeadm kubelet
To pin a specific version instead:
yum install kubeadm-1.16.0-0.x86_64 kubelet-1.16.0-0.x86_64
During initialization, Kubernetes pulls its images from k8s.gcr.io, which is unreachable from mainland China, so the pull fails. Copies of these images exist on other registries (such as hub.docker.com), so here we point kubeadm at the Aliyun mirror instead.
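The images can also be fetched ahead of time (the init log below likewise suggests `kubeadm config images pull`). A sketch for this setup, run on the master, assuming kubeadm 1.16 is already installed:

```shell
# Optional: pre-pull the control-plane images from the Aliyun mirror
# before kubeadm init, so a slow download cannot stall initialization.
kubeadm config images pull \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.16.0
```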
Initialize the Kubernetes control plane; run on the master:
kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0 --apiserver-advertise-address 10.200.18.100 --pod-network-cidr=10.244.0.0/16 --token-ttl 0
--kubernetes-version: the Kubernetes version to install.
--apiserver-advertise-address: which of the master's network interfaces the API server advertises; if omitted, kubeadm picks the interface that has the default gateway.
--pod-network-cidr: the Pod network range. The required value depends on the network plugin in use; this guide uses the classic flannel scheme, whose default manifest expects 10.244.0.0/16.
If the prerequisites are not met, kubeadm init fails with errors; fix the operating-system configuration as reported, then run the init again.
Initialization can take a while: this step pulls the images for all master components from the registry and starts each component as a container.
[root@kmaster ~]# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0 --apiserver-advertise-address 10.200.18.100 --pod-network-cidr=10.244.0.0/16 --token-ttl 0
[init] Using Kubernetes version: v1.16.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kmaster1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.200.18.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kmaster1 localhost] and IPs [10.200.18.100 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kmaster1 localhost] and IPs [10.200.18.100 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 50.514258 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kmaster1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kmaster1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: m0xups.fjnd2kwr1lset2zz
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz \
--discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae
Then, as the log output instructs, a few follow-up steps are needed.
Run the following on the master node:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install the network plugin.
Kubernetes supports many network plugins; here we install flannel.
Upload the kube-flannel.yml file to the master node, then run:
kubectl apply -f kube-flannel.yml
[root@kmaster ~]# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
Worker nodes join the cluster with the command below, so record it from the init output:
kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz \
--discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae
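If the join command was not recorded, running `kubeadm token create --print-join-command` on the master prints a fresh one. As a small sketch, the token and CA hash can also be extracted from a saved copy; the command string below is the one printed by kubeadm init above:

```shell
# Extract the token and discovery hash from a recorded join command.
join_cmd='kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae'
token=$(echo "$join_cmd" | sed -n 's/.*--token \([^ ]*\).*/\1/p')
ca_hash=$(echo "$join_cmd" | sed -n 's/.*--discovery-token-ca-cert-hash \([^ ]*\).*/\1/p')
echo "token=$token"
echo "ca_hash=$ca_hash"
```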
## 5. Deploy the worker nodes
Run the same initialization steps as on the master (section 2).
Install the components:
yum install kubelet-1.16.0-0.x86_64
yum install kubeadm-1.16.0-0.x86_64
Enable and start the kubelet service:
[root@knode1 yum.repos.d]# systemctl enable kubelet && systemctl start kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
Finally, join the node to the cluster:
kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz \
    --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae
[root@knode1 yum.repos.d]# kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Install the remaining nodes the same way as node1.
## 6. Check cluster status
Enable kubectl command auto-completion:
yum install bash-completion -y
[root@kmaster ~]# source /usr/share/bash-completion/bash_completion
[root@kmaster ~]# source <(kubectl completion bash)
[root@kmaster ~]# echo "source <(kubectl completion bash)" >> ~/.bashrc
Tab completion for kubectl commands now works.
[root@kmaster ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster1 Ready master 18h v1.16.0
knode1 Ready <none> 17h v1.16.0
knode2 Ready <none> 20m v1.16.0
All nodes report the Ready status.
If a node is NotReady, check the Pod status to see whether a component failed to start:
[root@kmaster ~]# kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-58cc8c89f4-5mv5p 1/1 Running 0 18h 10.244.0.3 kmaster1 <none> <none>
kube-system coredns-58cc8c89f4-f5s96 1/1 Running 0 18h 10.244.0.2 kmaster1 <none> <none>
kube-system etcd-kmaster1 1/1 Running 0 18h 10.200.18.100 kmaster1 <none> <none>
kube-system kube-apiserver-kmaster1 1/1 Running 0 18h 10.200.18.100 kmaster1 <none> <none>
kube-system kube-controller-manager-kmaster1 1/1 Running 0 18h 10.200.18.100 kmaster1 <none> <none>
kube-system kube-flannel-ds-amd64-g6jgz 1/1 Running 0 16h 10.200.18.100 kmaster1 <none> <none>
kube-system kube-flannel-ds-amd64-qvj2m 1/1 Running 1 18m 10.200.18.102 knode2 <none> <none>
kube-system kube-flannel-ds-amd64-sm5ng 1/1 Running 1 16h 10.200.18.101 knode1 <none> <none>
kube-system kube-proxy-k629s 1/1 Running 0 18h 10.200.18.100 kmaster1 <none> <none>
kube-system kube-proxy-p2h26 1/1 Running 2 16h 10.200.18.101 knode1 <none> <none>
kube-system kube-proxy-xrp6x 1/1 Running 0 18m 10.200.18.102 knode2 <none> <none>
kube-system kube-scheduler-kmaster1 1/1 Running 0 18h 10.200.18.100 kmaster1 <none> <none>
All Pods in the Running state means they were created successfully.
## 7. Troubleshooting
Case 1: error [ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
/proc/sys/net/ipv4/ip_forward controls whether the kernel forwards IP packets:
0: forwarding disabled
1: forwarding enabled
Typical uses: VPNs and routing products.
For security, Linux disables packet forwarding by default. Forwarding means that on a host with more than one NIC, a packet received on one NIC is sent out through another according to its destination IP address and the routing table; this is exactly what a router does.
To enable forwarding, first make sure the hardware is connected, then turn the feature on.
Check the current value with cat /proc/sys/net/ipv4/ip_forward: 0 means forwarding is disabled, 1 means it is allowed. Change it to 1.
You can set it with echo "1" > /proc/sys/net/ipv4/ip_forward, but the change is lost after a network-service restart or a reboot. To make it automatic, put that echo command into the /etc/rc.d/rc.local script, or add FORWARD_IPV4="YES" to /etc/sysconfig/network.
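A persistent fix consistent with how the bridge-nf parameters were handled in section 2 is a sysctl drop-in file applied by `sysctl --system`. A sketch, with the target file parameterized so it can be tried on a demo file first:

```shell
# Persist ip_forward via a drop-in under /etc/sysctl.d, the same
# mechanism used for the bridge-nf settings earlier. The helper takes
# the file path as an argument for easy testing, and only appends the
# setting if it is not already present.
enable_ip_forward() {
  local conf="$1"
  grep -q '^net.ipv4.ip_forward' "$conf" 2>/dev/null \
    || echo 'net.ipv4.ip_forward = 1' >> "$conf"
}
# On a real host:
#   enable_ip_forward /etc/sysctl.d/k8s.conf
#   sysctl --system
enable_ip_forward ./sysctl-demo.conf
cat ./sysctl-demo.conf
```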
Error 2: [WARNING Hostname]: hostname "kmaster" could not be reached
The hostname does not match the entries configured in /etc/hosts.
Error 3: [ERROR Swap]: running with swap on is not supported. Please disable swap
Disable the swap partition with swapoff -a.
Error 4: some images cannot be downloaded from Google's registry.
An ImagePullBackOff status means the image download failed.
Pull the image manually from another registry first, then retag it to the name the manifest expects; the worker nodes need this image as well:
docker pull easzlab/flannel:v0.12.0-amd64
docker tag easzlab/flannel:v0.12.0-amd64 quay-mirror.qiniu.com/coreos/flannel:v0.12.0-amd64
Error 5: the kubelet service on a node fails to start.
kubectl get nodes shows the knode1 node as NotReady, and its reported version is wrong as well.
Inspect the node for details:
[root@kmaster ~]# kubectl describe nodes knode1
Name: knode1
Roles: <none>
Labels: kubernetes.io/arch=amd64
kubernetes.io/hostname=knode1
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 17 Nov 2020 17:09:12 +0800
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: knode1
AcquireTime: <unset>
RenewTime: Tue, 17 Nov 2020 17:13:55 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure Unknown Tue, 17 Nov 2020 17:13:36 +0800 Tue, 17 Nov 2020 17:14:36 +0800 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Tue, 17 Nov 2020 17:13:36 +0800 Tue, 17 Nov 2020 17:14:36 +0800 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Tue, 17 Nov 2020 17:13:36 +0800 Tue, 17 Nov 2020 17:14:36 +0800 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Tue, 17 Nov 2020 17:13:36 +0800 Tue, 17 Nov 2020 17:14:36 +0800 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 10.200.18.101
The events show that the kubelet on knode1 stopped, so the node status could no longer be updated.