K8s in Practice: Deploying a Cluster on CentOS 7

Cluster Architecture


Cluster architecture:
Master node: etcd, api-server, scheduler, controller-manager
Worker node: kubelet, kube-proxy

etcd: the cluster database, storing all cluster state
api-server: the core service that every other component talks to
controller-manager: runs the controllers (e.g. rc, the ReplicationController)
scheduler: picks a suitable node for each newly created pod

kubelet: calls Docker to create containers on the node
kube-proxy: exposes services to external users and acts as a load balancer inside the cluster
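
Once the cluster is up, most of these components run as pods in the kube-system namespace, so their status can be checked directly (shown here just as a quick reference):

kubectl get pods -n kube-system
kubectl get componentstatuses    # health summary of scheduler / controller-manager / etcd (still available in v1.16)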

## 1. Environment Planning


| Node   | IP address    |
| ------ | ------------- |
| master | 10.200.18.100 |
| Node1  | 10.200.18.101 |
| Node2  | 10.200.18.102 |

OS version:

[root@kmaster ~]# cat /etc/redhat-release

CentOS Linux release 7.4.1708 (Core)

[root@kmaster ~]# uname -r

Add the following entries to /etc/hosts on every node:

10.200.18.100 kmaster1
10.200.18.101 knode1
10.200.18.102 knode2
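
The hostnames in these entries also need to be set on the machines themselves, otherwise kubeadm warns that the hostname cannot be resolved (see problem 2 in the troubleshooting section). On each machine:

hostnamectl set-hostname kmaster1   # on the master
hostnamectl set-hostname knode1     # on node1
hostnamectl set-hostname knode2     # on node2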

## 2. Initialization
# Disable the firewall
systemctl disable firewalld
systemctl stop firewalld

# Disable SELinux for the current session
setenforce 0
# Disable it permanently

sed -i "s/SELINUX=enforcing/SELINUX=disabled/" /etc/sysconfig/selinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

# Disable swap for the current session
swapoff -a
# Disable it permanently

sed -i "s/.*swap.*/#&/" /etc/fstab

# Adjust kernel parameters

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

# Reload the sysctl configuration
sysctl --system
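
If sysctl --system reports that the bridge-nf-call keys do not exist, the br_netfilter kernel module is probably not loaded yet; loading it (and making that persistent) is a common extra step, shown here as an assumption rather than something the original procedure required:

modprobe br_netfilter
echo "br_netfilter" > /etc/modules-load.d/br_netfilter.conf   # load the module again on boot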

# Configure the Aliyun Kubernetes yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Refresh the yum cache

yum clean all -y && yum makecache -y && yum repolist -y

## 3. Install Docker
See XXX for the detailed steps.
Docker must be installed on both the master and the worker nodes.
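
For reference, a minimal sketch of installing Docker CE from the Aliyun mirror on CentOS 7 (an assumption about the procedure in the referenced article; adjust it to your own setup):

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl enable docker && systemctl start docker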

## 4. Deploy the Master Node
On the master node, install kubectl, kubeadm, and kubelet (flannel will be installed later as the network plugin):
yum -y install kubectl kubeadm kubelet
To install a specific version instead:
yum install kubeadm-1.16.0-0.x86_64 kubelet-1.16.0-0.x86_64


During initialization, kubeadm pulls images from k8s.gcr.io, which usually fails from within China due to network restrictions. Copies of these images are available elsewhere (e.g. on hub.docker.com), so here we tell kubeadm to pull from the Aliyun mirror instead.

Initialize the Kubernetes control plane; run the following on the master:
kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0 --apiserver-advertise-address 10.200.18.100 --pod-network-cidr=10.244.0.0/16 --token-ttl 0

--kubernetes-version: specifies the Kubernetes version.
--apiserver-advertise-address: specifies which network interface on the master is used for communication; if omitted, kubeadm picks the interface that has the default gateway.
--pod-network-cidr: specifies the pod network range. The value depends on the network plugin in use; this article uses the classic flannel scheme, which defaults to 10.244.0.0/16.
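
The required images can also be pre-pulled before running init (the preflight output below suggests the same), so that network problems surface earlier and init itself runs faster:

kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0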

If the prerequisites above are not met, kubeadm init will report errors; fix the operating-system configuration as instructed and run init again. Initialization can take a while: this step pulls the images required by the control-plane components from the registry and starts each of them as a container.

[root@kmaster ~]# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0 --apiserver-advertise-address  10.200.18.100  --pod-network-cidr=10.244.0.0/16 --token-ttl 0
[init] Using Kubernetes version: v1.16.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kmaster1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.200.18.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kmaster1 localhost] and IPs [10.200.18.100 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kmaster1 localhost] and IPs [10.200.18.100 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 50.514258 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kmaster1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kmaster1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: m0xups.fjnd2kwr1lset2zz
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz \
    --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae

Following the hints in the output above, a few follow-up steps are still needed.
Run the following on the master node:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
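
At this point kubectl can already reach the API server; a quick sanity check (the master will show NotReady until the network plugin is installed in the next step):

kubectl cluster-info
kubectl get nodes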

Install the network plugin
Kubernetes supports many network plugins; here we install flannel.
Upload the kube-flannel.yml file to the master node, then run:

kubectl apply -f kube-flannel.yml 
[root@kmaster ~]# kubectl apply -f kube-flannel.yml 
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
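
If the master has direct Internet access, the manifest can also be fetched in place instead of being uploaded manually; the URL below is the commonly used upstream location at the time of writing and may have moved since:

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml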

Record the join command printed by kubeadm init; each worker node will run it later to join the cluster:

kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz \
    --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae
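
If this command is lost or the token expires, a fresh join command can be printed on the master at any time:

kubeadm token create --print-join-command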

## 5. Deploy the Worker Nodes
Perform the same initialization steps as on the master (section 2), then install the components:
yum install kubelet-1.16.0-0.x86_64
yum install kubeadm-1.16.0-0.x86_64
Enable and start the kubelet service:

[root@knode1 yum.repos.d]# systemctl enable kubelet && systemctl start kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.

Finally, join the node to the cluster using the join command recorded earlier:

[root@knode1 yum.repos.d]# kubeadm join 10.200.18.100:6443 --token m0xups.fjnd2kwr1lset2zz     --discovery-token-ca-cert-hash sha256:9d9bc1c620fcb77df61f0874f38a882504dbce9d7778e60f70519ac8671402ae 
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.


Install the remaining worker nodes following the same steps as node1.

## 6. Check the Cluster Status
Set up kubectl command auto-completion:
yum install bash-completion -y

[root@kmaster ~]# source /usr/share/bash-completion/bash_completion
[root@kmaster ~]# source <(kubectl completion bash)
[root@kmaster ~]# echo "source <(kubectl completion bash)" >> ~/.bashrc

kubectl tab completion now works. Check the node status:

[root@kmaster ~]# kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
kmaster1   Ready    master   18h   v1.16.0
knode1     Ready    <none>   17h   v1.16.0
knode2     Ready    <none>   20m   v1.16.0

All nodes are in the Ready state.
If a node shows NotReady, check the pod status to see whether any service has failed to start:

[root@kmaster ~]# kubectl get pods -o wide --all-namespaces 
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE       NOMINATED NODE   READINESS GATES
kube-system   coredns-58cc8c89f4-5mv5p           1/1     Running   0          18h   10.244.0.3      kmaster1   <none>           <none>
kube-system   coredns-58cc8c89f4-f5s96           1/1     Running   0          18h   10.244.0.2      kmaster1   <none>           <none>
kube-system   etcd-kmaster1                      1/1     Running   0          18h   10.200.18.100   kmaster1   <none>           <none>
kube-system   kube-apiserver-kmaster1            1/1     Running   0          18h   10.200.18.100   kmaster1   <none>           <none>
kube-system   kube-controller-manager-kmaster1   1/1     Running   0          18h   10.200.18.100   kmaster1   <none>           <none>
kube-system   kube-flannel-ds-amd64-g6jgz        1/1     Running   0          16h   10.200.18.100   kmaster1   <none>           <none>
kube-system   kube-flannel-ds-amd64-qvj2m        1/1     Running   1          18m   10.200.18.102   knode2     <none>           <none>
kube-system   kube-flannel-ds-amd64-sm5ng        1/1     Running   1          16h   10.200.18.101   knode1     <none>           <none>
kube-system   kube-proxy-k629s                   1/1     Running   0          18h   10.200.18.100   kmaster1   <none>           <none>
kube-system   kube-proxy-p2h26                   1/1     Running   2          16h   10.200.18.101   knode1     <none>           <none>
kube-system   kube-proxy-xrp6x                   1/1     Running   0          18m   10.200.18.102   knode2     <none>           <none>
kube-system   kube-scheduler-kmaster1            1/1     Running   0          18h   10.200.18.100   kmaster1   <none>           <none>

All pods showing Running means they were created successfully.

## 7. Troubleshooting

Problem 1: [ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1

/proc/sys/net/ipv4/ip_forward controls whether IP forwarding is enabled:
0 - forwarding disabled
1 - forwarding enabled
Typical uses: VPNs, routers, and similar products.

For security reasons, Linux disables packet forwarding by default. Forwarding means that when a host has more than one network interface and receives a packet on one of them, it sends the packet out through another interface according to the packet's destination IP address and the routing table; this is essentially what a router does.

To enable IP forwarding, first make sure the hardware is connected, then turn on the kernel's forwarding switch.

less /proc/sys/net/ipv4/ip_forward shows the current value: 0 means forwarding is disabled, 1 means it is enabled; change it to 1.

The value can be changed with echo "1" > /proc/sys/net/ipv4/ip_forward, but the change is lost after a network-service restart or a reboot. To make it take effect automatically, put that echo command into /etc/rc.d/rc.local, or add FORWARD_IPV4="YES" to /etc/sysconfig/network.
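
A cleaner way to persist the setting is to add it to the sysctl drop-in file created in section 2 and reload; this is a suggested alternative to the rc.local approach above:

echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/k8s.conf
sysctl --system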

Problem 2: [WARNING Hostname]: hostname "kmaster" could not be reached
The machine's hostname does not match the entries configured in /etc/hosts; set the hostname so that it matches (e.g. with hostnamectl set-hostname, see section 1).

Problem 3: [ERROR Swap]: running with swap on is not supported. Please disable swap
Swap must be turned off: run swapoff -a (and comment the swap line out of /etc/fstab so it stays off across reboots, see section 2).

Problem 4: some images cannot be pulled from Google's registry

An ImagePullBackOff status means the image pull failed.


Pull the image manually from another registry first, then re-tag it with the name the pod expects; the worker nodes need this image as well:
docker pull easzlab/flannel:v0.12.0-amd64
docker tag easzlab/flannel:v0.12.0-amd64 quay-mirror.qiniu.com/coreos/flannel:v0.12.0-amd64

Problem 5: the kubelet service on a worker node fails to start

kubectl get nodes shows the node knode1 in the NotReady state, and the reported version is wrong as well.



Inspect the node details:

[root@kmaster ~]# kubectl describe nodes knode1
Name:               knode1
Roles:              <none>
Labels:             kubernetes.io/arch=amd64
                    kubernetes.io/hostname=knode1
                    kubernetes.io/os=linux
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 17 Nov 2020 17:09:12 +0800
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  knode1
  AcquireTime:     <unset>
  RenewTime:       Tue, 17 Nov 2020 17:13:55 +0800
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Tue, 17 Nov 2020 17:13:36 +0800   Tue, 17 Nov 2020 17:14:36 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Tue, 17 Nov 2020 17:13:36 +0800   Tue, 17 Nov 2020 17:14:36 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Tue, 17 Nov 2020 17:13:36 +0800   Tue, 17 Nov 2020 17:14:36 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Tue, 17 Nov 2020 17:13:36 +0800   Tue, 17 Nov 2020 17:14:36 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
  InternalIP:  10.200.18.101

The output shows that the kubelet on knode1 has stopped posting status updates, which is why the node status cannot be refreshed.
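
On the affected node, check and restart the kubelet service; journalctl usually shows why it failed (common causes include a kubelet/cluster version mismatch, a cgroup-driver mismatch with Docker, or swap having been re-enabled):

systemctl status kubelet
journalctl -u kubelet --no-pager | tail -n 50
systemctl restart kubelet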