I. K8s High-Availability Architecture Overview

(Architecture diagram: highly available Kubernetes cluster installed with kubeadm)

II. Kubeadm Basic Environment Configuration

1. Notes:

  • This install uses five virtual machines running on VMware Workstation.
  • Hostname and IP address plan:

k8s-master01:192.168.142.3

k8s-master02:192.168.142.4

k8s-master03:192.168.142.5

k8s-node01:192.168.142.6

k8s-node02:192.168.142.7

VIP (virtual IP): 192.168.142.236

  • OS version: CentOS Linux release 7.9.2009 (Core)
  • VM specs: 4 CPU cores, 4 GB RAM, 30 GB disk

2. Basic configuration

  • IP address configuration: omitted (a sample static-IP sketch follows the hosts file below)
  • Hostname configuration: hostnamectl set-hostname k8s-master01 (the other nodes use the same command: hostnamectl set-hostname <hostname>)
  • Configure /etc/hosts on all nodes ("nodes" here and below means these VMs)
[root@master01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# the following 6 entries were added
192.168.142.3 k8s-master01
192.168.142.4 k8s-master02
192.168.142.5 k8s-master03
192.168.142.6 k8s-node01
192.168.142.7 k8s-node02
192.168.142.236 k8s-master-lb
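
For reference, a minimal static-IP sketch for the ens32 interface used throughout this guide (shown for k8s-master01; use each node's own IPADDR). The GATEWAY and DNS1 values are assumptions to adapt to your network:

cat > /etc/sysconfig/network-scripts/ifcfg-ens32 <<EOF
TYPE=Ethernet
BOOTPROTO=static
NAME=ens32
DEVICE=ens32
ONBOOT=yes
IPADDR=192.168.142.3
PREFIX=24
GATEWAY=192.168.142.2   # assumed VMware NAT gateway; adjust to your network
DNS1=223.5.5.5          # assumed DNS server; adjust to your network
EOF
systemctl restart network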

3. Configure yum repositories (all nodes)

# CentOS 7 base repo
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup # back up the original config
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum makecache

# Docker repo
yum install -y yum-utils device-mapper-persistent-data lvm2 # install the required system utilities
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo # add the repository definition
sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
yum makecache fast # refresh the cache
## Note: at the time of writing, the Aliyun docker-ce repo only seemed to carry versions up to 18; version 18 worked for me, but switch to another mirror if you need something newer

# Kubernetes repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Note: because the upstream does not expose a sync mechanism, the GPG index check may fail; in that case install with: yum install -y --nogpgcheck kubelet kubeadm kubectl

4. Disable the firewall, SELinux, dnsmasq, and swap (all nodes)

# disable the firewall
systemctl disable --now firewalld
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager

# disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

# disable the swap partition
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
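
Verify the result; the Swap row of free should show all zeros:

free -m   # the Swap row should read 0 total / 0 used / 0 free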

5. Install ntpdate so the five nodes stay time-synchronized (all nodes):

rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm
yum install ntpdate -y
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
# check the current time
date
# add to crontab
crontab -e
*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com

6. Configure ulimit (all nodes)

ulimit -SHn 65535

# to make the limits permanent, append to the end of the file
vim /etc/security/limits.conf
* soft nofile 65536
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
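
After logging out and back in, confirm the new limits are in effect:

ulimit -Sn   # soft open-files limit
ulimit -Hn   # hard open-files limit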

7. Download the installation sources (all nodes; you can clone once and copy to the other nodes)

cd /root/ ; git clone https://github.com/dotbalo/k8s-ha-install.git

8. Update and reboot the system (all nodes)

yum update -y --exclude=kernel* && reboot

III. Kubeadm System and Kernel Upgrade

1. Download and install the kernel packages (download on one node, then copy to the others)

cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
yum localinstall -y kernel-ml*   # install the downloaded packages (on all nodes)

2. Change the default boot kernel

grub2-set-default  0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
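
To confirm that menu entry 0 is the newly installed 4.19.12 kernel, list the GRUB menu entries:

grep '^menuentry' /etc/grub2.cfg | cut -d "'" -f2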

3. Check which kernel is now the default

grubby --default-kernel

4. Reboot all nodes, then verify the running kernel version

uname -a

5. Install ipvsadm on all nodes

yum install ipvsadm ipset sysstat conntrack libseccomp -y

6. Configure the IPVS kernel modules on all nodes

vim /etc/modules-load.d/ipvs.conf
# add the following
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip

7. Load the modules

systemctl enable --now systemd-modules-load.service

8. Set kernel parameters (all nodes)

cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system # apply the settings

9. Reboot the servers

reboot

10. After rebooting, check that the modules are loaded

lsmod | grep --color=auto -e ip_vs -e nf_conntrack

IV. Kubeadm Basic Component Installation

1. Install Docker

yum install docker-ce-18.03.* -y   # docker-ce 18.03 bundles the CLI; the separate docker-ce-cli package only exists for 18.09 and later

2. Newer versions of kubelet expect the systemd cgroup driver, so set Docker's CgroupDriver to systemd

mkdir /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

3. Start Docker and enable it at boot (all nodes)

systemctl daemon-reload && systemctl enable --now docker

4. Check the Docker version

docker version

5. List the available Kubernetes versions

yum list kubeadm.x86_64 --showduplicates | sort -r

6. Because Docker was installed via yum, its cgroup driver defaults to cgroupfs, while kubelet expects systemd, so the Docker cgroup driver must be changed (this may be specific to the version 18 I used; I plan to test other versions later). Double-check that /etc/docker/daemon.json contains the setting from step 2:

vim /etc/docker/daemon.json

{
"exec-opts": ["native.cgroupdriver=systemd"]
}

7. Restart Docker

systemctl restart docker
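
Confirm that Docker now reports the systemd cgroup driver:

docker info | grep -i cgroup   # expect: Cgroup Driver: systemd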

8. Install kubeadm, kubelet, and kubectl (all nodes)

yum install kubeadm-1.20* kubelet-1.20* kubectl-1.20* -y

9. Configure the Aliyun pause image

cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
EOF

10. Enable kubelet at boot

systemctl daemon-reload
systemctl enable --now kubelet

V. Kubeadm High-Availability Component Installation

1. Install HAProxy and Keepalived (all nodes)

yum install keepalived haproxy -y

2. Configure HAProxy (all master nodes)

vim /etc/haproxy/haproxy.cfg
# clear the file, then paste in the following
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend monitor-in
  bind *:33305
  mode http
  option httplog
  monitor-uri /monitor

frontend k8s-master
  bind 0.0.0.0:16443
  bind 127.0.0.1:16443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server k8s-master01 192.168.142.3:6443 check
  server k8s-master02 192.168.142.4:6443 check
  server k8s-master03 192.168.142.5:6443 check
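
Before starting the service, you can validate the configuration syntax:

haproxy -c -f /etc/haproxy/haproxy.cfg   # prints "Configuration file is valid" on success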

3. Configure Keepalived on the master nodes (the configuration differs on each of the three nodes)

k8s-master01:

vim /etc/keepalived/keepalived.conf
# clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens32
    mcast_src_ip 192.168.142.3
    virtual_router_id 51
    priority 101
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}

k8s-master02:

vim /etc/keepalived/keepalived.conf
# clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    mcast_src_ip 192.168.142.4
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}

k8s-master03:

vim /etc/keepalived/keepalived.conf
# clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    mcast_src_ip 192.168.142.5
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}

4. Create the Keepalived health-check script (all master nodes)

vim /etc/keepalived/check_apiserver.sh
#!/bin/bash

# stop keepalived (so the VIP fails over) if haproxy stays down for three consecutive checks
err=0
for k in $(seq 1 3)
do
    check_code=$(pgrep haproxy)
    if [[ $check_code == "" ]]; then
        err=$(expr $err + 1)
        sleep 1
        continue
    else
        err=0
        break
    fi
done

if [[ $err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi

5. Make the script executable

chmod +x /etc/keepalived/check_apiserver.sh
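
You can also run the check once by hand: while HAProxy is up it should exit 0; if HAProxy is down it will stop keepalived and exit 1.

bash /etc/keepalived/check_apiserver.sh; echo $?   # expect 0 while haproxy is running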

6. Start HAProxy (all master nodes)

systemctl daemon-reload
systemctl enable --now haproxy

7. Check the listening ports

netstat -lntp # haproxy should be listening on 16443 and 33305

8. Start Keepalived

systemctl enable --now keepalived

9. Watch the system logs

tail -f /var/log/messages
grep -5 'ens32' /var/log/messages

10. Check the IP addresses on k8s-master01

[root@master01 ~]# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:f0:9d:a0 brd ff:ff:ff:ff:ff:ff
inet 192.168.142.3/24 brd 192.168.142.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.142.236/32 scope global ens32 # the VIP is bound to this interface
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fef0:9da0/64 scope link
valid_lft forever preferred_lft forever

11. Test the VIP

ping 192.168.142.236 -c 4
telnet 192.168.142.236 16443

If both commands succeed, the virtual IP is working correctly.

VI. Kubeadm Cluster Initialization

1. Create kubeadm-config.yaml on k8s-master01

vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: 7t2weq.bjbawausm0jaxury
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.142.3
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  certSANs:
  - 192.168.142.236
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.142.236:16443
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.0
networking:
  dnsDomain: cluster.local
  podSubnet: 172.168.0.0/12   # note: not an RFC 1918 private range; 172.16.0.0/12 is the usual choice
  serviceSubnet: 10.96.0.0/12
scheduler: {}

2. Migrate the kubeadm config file to the current schema

kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml

3. Check the kubeadm version, and change kubernetesVersion: v1.20.0 in the config file to the version you see

kubeadm version

4. Distribute new.yaml to the other master nodes (a sample scp loop follows below), then have all master nodes pre-pull the images

kubeadm config images pull --config /root/new.yaml 
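
To distribute new.yaml before running the pull on each node, a minimal loop (assuming root SSH access between the nodes):

for host in k8s-master02 k8s-master03; do
    scp /root/new.yaml root@$host:/root/
done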

5. Initialize on k8s-master01. Initialization generates the certificates, config files, and token values that other nodes use to join the cluster

kubeadm init --config /root/new.yaml  --upload-certs

6. If initialization fails, reset and initialize again

kubeadm reset -f ; ipvsadm --clear  ; rm -rf ~/.kube

7. Configure the environment variable on k8s-master01 for accessing the Kubernetes cluster

cat <<EOF >> /root/.bashrc
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
source /root/.bashrc

8. Check the cluster status

kubectl get nodes
kubectl get svc
kubectl get pods -n kube-system -o wide

VII. Highly Available Masters and Token Expiry Handling

1. Regenerate the token after it expires

kubeadm token create --print-join-command

2. To join as a master, also generate a --certificate-key

kubeadm init phase upload-certs  --upload-certs

3. Join a node as a master

kubeadm join 192.168.142.236:16443 --token 7t2weq.bjbawausm0jaxury     --discovery-token-ca-cert-hash sha256:1823bba54f8204a28f8f1282929028f3e2c1e766ca28c9e15c9cdced62d553a8     --control-plane --certificate-key 32fcb75acf00413d25716e293dc3dfdca3d2a79972325f34ac585b61daa1b63d

4. Join a node as a worker

kubeadm join 192.168.142.236:16443 --token 7t2weq.bjbawausm0jaxury --discovery-token-ca-cert-hash sha256:1823bba54f8204a28f8f1282929028f3e2c1e766ca28c9e15c9cdced62d553a8 # --certificate-key is only needed when joining with --control-plane

5. Check the nodes

kubectl get node

6. Regenerate the token (the two commands together)

kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs

7. View the generated token secrets

kubectl get secret -n kube-system

8. Inspect a token's contents

kubectl get secret -n kube-system bootstrap-token-rff9me -oyaml

9. The expiration time is visible (base64-encoded)

expiration: MjAyMS0wNy0wOFQxNzo0MjoyMiswODowMA==
echo "MjAyMS0wNy0wOFQxNzo0MjoyMiswODowMA==" | base64 -d   # prints 2021-07-08T17:42:22+08:00

VIII. Kubeadm Node and Calico Configuration

The installation hit a problem: two pods will not start, and the investigation is ongoing (some starting-point commands follow the listing below).

[root@master01 ~]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5f6d4b864b-jt9zx 1/1 Running 0 5h43m
calico-node-4mdf8 1/1 Running 3 5h43m
calico-node-drwvv 0/1 Pending 0 5h43m # problem pod, under investigation
calico-node-pqbks 1/1 Running 0 5h43m
calico-node-qfqs9 0/1 Pending 0 5h43m # problem pod, under investigation
calico-node-w5sw6 1/1 Running 0 5h43m
coredns-54d67798b7-gcswg 1/1 Running 0 8h
coredns-54d67798b7-vpx4k 1/1 Running 0 8h
etcd-k8s-master01 1/1 Running 5 8h
etcd-master02 1/1 Running 1 8h
etcd-master03 1/1 Running 1 8h
kube-apiserver-k8s-master01 1/1 Running 5 8h
kube-apiserver-master02 1/1 Running 1 8h
kube-apiserver-master03 1/1 Running 1 8h
kube-controller-manager-k8s-master01 1/1 Running 6 8h
kube-controller-manager-master02 1/1 Running 1 8h
kube-controller-manager-master03 1/1 Running 1 8h
kube-proxy-fhvfk 1/1 Running 0 5h5m
kube-proxy-j59mt 1/1 Running 1 8h
kube-proxy-qvjj5 1/1 Terminating 1 8h
kube-proxy-vrsjf 1/1 Running 1 8h
kube-proxy-zp9vl 1/1 Running 1 8h
kube-scheduler-k8s-master01 1/1 Running 6 8h
kube-scheduler-master02 1/1 Running 1 8h
kube-scheduler-master03 1/1 Running 1 8h
metrics-server-545b8b99c6-7l5rz 1/1 Running 0 5h20m
[root@master01 ~]#
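
A reasonable starting point for the investigation is to check the scheduler events for the Pending pods and the state of the nodes they should land on:

kubectl -n kube-system describe pod calico-node-drwvv   # check the Events section for scheduling failures
kubectl -n kube-system describe pod calico-node-qfqs9
kubectl get nodes -o wide   # are all five nodes Ready?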

IX. Dashboard & Metrics Server Installation