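I. k8s High-Availability Architecture Overview
II. Kubeadm Basic Environment Configuration
1. Notes:
- I installed this on 5 virtual machines running under VMware Workstation
- Hostname and IP address plan: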
k8s-master01:192.168.142.3
k8s-master02:192.168.142.4
k8s-master03:192.168.142.5
k8s-node01:192.168.142.6
k8s-node02:192.168.142.7
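VIP (virtual IP): 192.168.142.236
- OS version: CentOS Linux release 7.9.2009 (Core)
- VM specs: CPU = 4 cores; memory = 4 GB; disk = 30 GB
2. Basic configuration
- IP address configuration: omitted
- Hostname configuration: hostnamectl set-hostname k8s-master01 (the command is the same on the other nodes: hostnamectl set-hostname <hostname>)
- Configure hosts on all nodes ("node" means a virtual machine, here and below)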
[root@master01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# The 6 entries below were added
192.168.142.3 k8s-master01
192.168.142.4 k8s-master02
192.168.142.5 k8s-master03
192.168.142.6 k8s-node01
192.168.142.7 k8s-node02
192.168.142.236 k8s-master-lb
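Because the same six entries go onto every node, it can help to append them with a script that is safe to run more than once. The following is a minimal sketch, not part of the original procedure: `add_host` and the `HOSTS_FILE` variable are hypothetical names, and the file defaults to a scratch file so nothing is modified until you point it at /etc/hosts as root.

```shell
#!/usr/bin/env bash
# Append the cluster entries to a hosts file, skipping names already present.
# HOSTS_FILE (hypothetical variable) defaults to a scratch file for a dry run.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host() {
    local ip="$1" name="$2"
    # Skip the entry if any hostname field (column 2 onward) already equals $name.
    if ! awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

while read -r ip name; do
    add_host "$ip" "$name"
done <<'EOF'
192.168.142.3 k8s-master01
192.168.142.4 k8s-master02
192.168.142.5 k8s-master03
192.168.142.6 k8s-node01
192.168.142.7 k8s-node02
192.168.142.236 k8s-master-lb
EOF

cat "$HOSTS_FILE"
```

On a real node this would be run as `HOSTS_FILE=/etc/hosts bash add-hosts.sh` (the script name is also just an example).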
3. Configure the yum repositories (all nodes)
# CentOS 7 yum repository
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup # back up the original configuration
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum makecache
# Configure the Docker repository
yum install -y yum-utils device-mapper-persistent-data lvm2 # install the required system tools
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo # add the repository
sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
yum makecache fast # refresh the cache
## Note: the Aliyun docker-ce mirror seemed to only go up to version 18 when I checked. Version 18 works for me; switch to another source if you need something else
# Configure the Kubernetes repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
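# Note: upstream does not offer an official way to sync this repository, so the GPG check of the repository index may fail. If it does, install with: yum install -y --nogpgcheck kubelet kubeadm kubectl
4. Disable the firewall, SELinux, dnsmasq, NetworkManager and swap (all nodes)
# Disable the firewall, dnsmasq and NetworkManager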
systemctl disable --now firewalld
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager
# Disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
# Disable the swap partition
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
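To see what the fstab edit does before touching the real file, the same sed expression can be applied to a scratch copy. This is only an illustrative sketch: the function name `comment_swap` and the device names in the sample are made up.

```shell
#!/usr/bin/env bash
# Apply the same sed expression to a given file and print the result,
# without editing the file in place.
comment_swap() {
    sed -r '/^[^#]*swap/s@^@#@' "$1"
}

# Hypothetical fstab content: one active swap line, one already commented out.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
#/dev/sdb1              swap  swap defaults 0 0
EOF

comment_swap "$sample"
```

Only the active swap line gains a leading `#`; the line that is already commented is left alone, so the edit can be repeated safely. After `swapoff -a`, `cat /proc/swaps` should list no active swap devices.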
5. Install ntpdate so that the clocks of all five nodes stay in sync (all nodes):
rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm
yum install ntpdate -y
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
# Check the current time
date
# Add the sync job to crontab
crontab -e
*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com
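Editing the crontab interactively on five machines is tedious, so the entry can also be installed from a script. The sketch below is one possible approach, assuming the job should appear exactly once; `with_ntp_job` is a hypothetical helper that only transforms text, so it can be previewed without changing anything.

```shell
#!/usr/bin/env bash
CRON_LINE='*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com'

# Read an existing crontab on stdin and print it with CRON_LINE appended
# only when that exact line is not already there.
with_ntp_job() {
    local current
    current=$(cat)
    [ -n "$current" ] && printf '%s\n' "$current"
    if ! printf '%s\n' "$current" | grep -qxF -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}

# Preview against an empty crontab. On a real node the pipeline would be:
#   crontab -l 2>/dev/null | with_ntp_job | crontab -
printf '' | with_ntp_job
```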
6. Configure ulimit (all nodes)
ulimit -SHn 65535
# To make it permanent, append the following to the end of the file
vim /etc/security/limits.conf
* soft nofile 65536
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
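A soft limit cannot exceed its hard limit, so it is worth checking each soft/hard pair after editing the file. The following is a small sketch of such a check; `check_limits` is a hypothetical helper and the sample values are invented to trigger one warning.

```shell
#!/usr/bin/env bash
# Print every (domain, item) pair whose soft value is greater than its hard value.
# "unlimited" on either side is skipped to keep the sketch short.
check_limits() {
    awk '
        $1 ~ /^#/ || NF < 4 { next }
        $2 == "soft" { soft[$1 " " $3] = $4 }
        $2 == "hard" { hard[$1 " " $3] = $4 }
        END {
            for (k in soft)
                if ((k in hard) && soft[k] != "unlimited" && hard[k] != "unlimited" && soft[k] + 0 > hard[k] + 0)
                    print k ": soft " soft[k] " > hard " hard[k]
        }
    ' "$1"
}

# Invented sample: the nproc pair is deliberately inconsistent.
sample=$(mktemp)
cat > "$sample" <<'EOF'
* soft nofile 65536
* hard nofile 131072
* soft nproc 655350
* hard nproc 4096
EOF

check_limits "$sample"
```

On a node, `check_limits /etc/security/limits.conf` printing nothing means every pair is consistent.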
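7. Download the installation sources (all nodes; you can clone once and copy the result to the other nodes)
cd /root/ ; git clone https://github.com/dotbalo/k8s-ha-install.git
8. Update and reboot the system (all nodes)
yum update -y --exclude='kernel*' && reboot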
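III. Kubeadm System and Kernel Upgrade
1. Download and install the kernel packages (you can download on one node and copy them to the others)
cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
yum localinstall -y kernel-ml* # install the downloaded kernel packages on every node
2. Change the kernel boot order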
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
3. Check which kernel is the default
grubby --default-kernel
4. Reboot all nodes, then check the kernel version
uname -a
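Reading `uname -a` by eye works for one machine; for five, a scripted comparison is less error-prone. A minimal sketch follows, assuming 4.19 (the version of the packages downloaded above) is the minimum you want to see. `version_ge` is a hypothetical helper built on `sort -V` from GNU coreutils.

```shell
#!/usr/bin/env bash
# Return success when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (GNU coreutils) for natural version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the release suffix, e.g. 4.19.12-1.el7.elrepo.x86_64 -> 4.19.12
running=$(uname -r | cut -d- -f1)
if version_ge "$running" 4.19; then
    echo "kernel $running is at least 4.19"
else
    echo "kernel $running is older than 4.19, check the default boot entry"
fi
```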
5. Install ipvsadm on all nodes
yum install ipvsadm ipset sysstat conntrack libseccomp -y
6. Configure the ipvs modules on all nodes
vim /etc/modules-load.d/ipvs.conf
# Add the following content
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
7. Load the modules
systemctl enable --now systemd-modules-load.service
8. Set the kernel parameters (all nodes)
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system # apply the settings
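In a parameter list this long it is easy to assign the same key twice, and when that happens the later assignment is the one that takes effect. The sketch below lists repeated keys; `duplicate_keys` is a hypothetical helper, and the sample file is a shortened stand-in for /etc/sysctl.d/k8s.conf.

```shell
#!/usr/bin/env bash
# List keys that are assigned more than once in a sysctl-style file.
duplicate_keys() {
    awk -F= '
        /^[ \t]*(#|;|$)/ { next }          # skip comments and blank lines
        {
            key = $1
            gsub(/[ \t]/, "", key)         # "a.b = 1" and "a.b=1" are the same key
            if (++seen[key] == 2) print key
        }
    ' "$1"
}

# Shortened stand-in for the real file, with one key repeated on purpose.
sample=$(mktemp)
cat > "$sample" <<'EOF'
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.core.somaxconn=16384
net.ipv4.tcp_max_syn_backlog = 16384
EOF

duplicate_keys "$sample"
```

On a node, run `duplicate_keys /etc/sysctl.d/k8s.conf`, then spot-check an applied value with `sysctl -n net.ipv4.ip_forward`.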
9. Reboot the servers
10. Check that the modules are loaded
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
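The grep above shows what is loaded, but not what is missing. As a rough sketch, the module list file can be compared with `lsmod` output to print the names that did not load. `missing_modules` is a hypothetical helper, and the sizes in the sample are invented. Modules compiled into the kernel never appear in `lsmod`, so a name reported here is a hint to investigate rather than proof of a failure.

```shell
#!/usr/bin/env bash
# Print the modules named in a modules-load.d style file ($1) that do not
# appear in lsmod-style output ($2, first column = module name).
missing_modules() {
    local wanted_file="$1" lsmod_file="$2"
    grep -Ev '^[[:space:]]*(#|$)' "$wanted_file" |
        grep -vxFf <(awk 'NR > 1 { print $1 }' "$lsmod_file")
}

# Invented sample data: ipip is requested but not loaded.
wanted=$(mktemp)
loaded=$(mktemp)
printf '%s\n' ip_vs ip_vs_rr nf_conntrack ipip > "$wanted"
cat > "$loaded" <<'EOF'
Module                  Size  Used by
ip_vs_rr               16384  0
ip_vs                 151552  2 ip_vs_rr
nf_conntrack          143360  1 ip_vs
EOF

missing_modules "$wanted" "$loaded"
```

On a node the call would be `missing_modules /etc/modules-load.d/ipvs.conf <(lsmod)`.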
IV. Kubeadm Basic Component Installation
1. Install Docker
yum install docker-ce-18.03.* docker-cli-18.03.* -y
2. Newer kubelet versions recommend systemd, so switch Docker's cgroup driver to systemd
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
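3. Start Docker and enable it at boot (all nodes)
systemctl daemon-reload && systemctl enable --now docker
4. Check the Docker version
docker version
5. List the available k8s versions
yum list kubeadm.x86_64 --showduplicates | sort -r
6. When Docker is installed with yum, its cgroup driver defaults to cgroupfs, while Kubernetes is set up to use systemd in this guide, so Docker's cgroup driver has to be changed to match (this may be specific to the version 18 I used; I will test later whether other versions behave the same way)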
vim /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
7. Restart Docker
systemctl restart docker
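After the restart it is worth confirming that the configured driver and the running driver agree. The sketch below reads the setting back from a daemon.json style file with a plain text match (it is not a JSON parser); `configured_cgroup_driver` is a hypothetical helper and the scratch file stands in for /etc/docker/daemon.json.

```shell
#!/usr/bin/env bash
# Extract the native.cgroupdriver value from a daemon.json style file.
# Prints nothing when the option is absent.
configured_cgroup_driver() {
    sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "$1"
}

# Scratch copy of the configuration written in the steps above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

configured_cgroup_driver "$sample"

# On a node with Docker running, the effective driver can be read with:
#   docker info --format '{{.CgroupDriver}}'
```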
8. Install kubeadm (all nodes)
yum install kubeadm-1.20* kubelet-1.20* kubectl-1.20* -y
9. Point the kubelet at the Aliyun pause image
cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
EOF
10. Enable kubelet at boot
systemctl daemon-reload
systemctl enable --now kubelet
V. Kubeadm High-Availability Component Installation
1. Install HAProxy and Keepalived on all nodes
yum install keepalived haproxy -y
2. Configure HAProxy on all master nodes
vim /etc/haproxy/haproxy.cfg
# Clear the file, then paste in the following
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend monitor-in
  bind *:33305
  mode http
  option httplog
  monitor-uri /monitor

frontend k8s-master
  bind 0.0.0.0:16443
  bind 127.0.0.1:16443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server k8s-master01 192.168.142.3:6443 check
  server k8s-master02 192.168.142.4:6443 check
  server k8s-master03 192.168.142.5:6443 check
2. Configure Keepalived on the master nodes (the configuration differs on each of the three nodes)
On k8s-master01:
vim /etc/keepalived/keepalived.conf
# Clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens32
    mcast_src_ip 192.168.142.3
    virtual_router_id 51
    priority 101
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}
On k8s-master02:
vim /etc/keepalived/keepalived.conf
# Clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    mcast_src_ip 192.168.142.4
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}
On k8s-master03:
vim /etc/keepalived/keepalived.conf
# Clear the file, then paste in the following
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    mcast_src_ip 192.168.142.5
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}
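The three files above differ only in `state`, `mcast_src_ip` and `priority`, so they can be generated from one template instead of being edited by hand. This is a sketch of that idea, not the author's method; `render_keepalived` is a hypothetical function, and the interface name ens32 is taken from the configs above and may differ on your machines.

```shell
#!/usr/bin/env bash
# Render keepalived.conf for one master.
# Arguments: state (MASTER|BACKUP), this node's IP, VRRP priority.
render_keepalived() {
    local state="$1" src_ip="$2" priority="$3"
    cat <<EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state ${state}
    interface ens32
    mcast_src_ip ${src_ip}
    virtual_router_id 51
    priority ${priority}
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.142.236
    }
    track_script {
        chk_apiserver
    }
}
EOF
}

# Preview the file for k8s-master01; redirect to
# /etc/keepalived/keepalived.conf on the node itself to install it.
render_keepalived MASTER 192.168.142.3 101
```

The other two masters would use `render_keepalived BACKUP 192.168.142.4 100` and `render_keepalived BACKUP 192.168.142.5 100`.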
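3. Create the Keepalived health check script /etc/keepalived/check_apiserver.sh (all master nodes)
#!/bin/bash
# Look for a running haproxy process up to 3 times, one second apart.
err=0
for k in $(seq 1 3)
do
    check_code=$(pgrep haproxy)
    if [[ $check_code == "" ]]; then
        err=$((err + 1))
        sleep 1
        continue
    else
        err=0
        break
    fi
done

# All attempts failed: stop keepalived so the VIP can move to another master.
if [[ $err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi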
4. Make the script executable
chmod +x /etc/keepalived/check_apiserver.sh
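The health check is a retry loop: try a probe a few times and only report failure when every attempt fails. The same pattern can be written as a small reusable function, sketched here. `retry_check` is a hypothetical helper. Note that the probe only tests whether an haproxy process exists, not whether the apiserver behind it is answering.

```shell
#!/usr/bin/env bash
# Run a check command up to $1 times, pausing $2 seconds after each failure.
# Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
retry_check() {
    local attempts="$1" pause="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        sleep "$pause"
    done
    return 1
}

# The same probe the health check script performs: is any haproxy process alive?
if retry_check 3 1 pgrep haproxy; then
    echo "haproxy is running"
else
    echo "haproxy is down; a keepalived check script would exit 1 here"
fi
```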
5. Start HAProxy (all master nodes)
systemctl daemon-reload
systemctl enable --now haproxy
6. Check the listening ports
netstat -lntp
7. Start Keepalived
systemctl enable --now keepalived
8. Check the system log
tail -f /var/log/messages
grep -C 5 'ens32' /var/log/messages
9. Check the IP addresses on k8s-master01
[root@master01 ~]# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:f0:9d:a0 brd ff:ff:ff:ff:ff:ff
inet 192.168.142.3/24 brd 192.168.142.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.142.236/32 scope global ens32 # the virtual IP is now bound to this interface
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fef0:9da0/64 scope link
valid_lft forever preferred_lft forever
10. Test the VIP
ping 192.168.142.236 -c 4
telnet 192.168.142.236 16443
If both succeed, the virtual IP is working
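Where telnet is not installed, bash can open the TCP connection itself through its built-in /dev/tcp redirection. A minimal sketch, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host:port can be opened within 3 seconds.
port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

if port_open 192.168.142.236 16443; then
    echo "VIP answers on 16443"
else
    echo "VIP does not answer on 16443"
fi
```

Running the same probe from a worker node checks the path the kubelet will later use to reach the apiserver through HAProxy.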
VI. Kubeadm Cluster Initialization
1. Create kubeadm-config.yaml on k8s-master01
vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: 7t2weq.bjbawausm0jaxury
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.142.3
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  certSANs:
  - 192.168.142.236
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.142.236:16443
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.0
networking:
  dnsDomain: cluster.local
  podSubnet: 172.168.0.0/12
  serviceSubnet: 10.96.0.0/12
scheduler: {}
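The pod subnet, the service subnet and the node network must not overlap. A /12 prefix starting from 172.168.0.0 actually spans 172.160.0.0 through 172.175.255.255, so it is worth checking the ranges rather than trusting the written address. The following sketch does the arithmetic in bash; the function names are hypothetical, and the node network 192.168.142.0/24 is taken from the interface output earlier in this guide.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "first last" (as integers) for the address range covered by a CIDR.
cidr_range() {
    local ip="${1%/*}" bits="${1#*/}"
    local addr mask
    addr=$(ip_to_int "$ip")
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    echo "$(( addr & mask )) $(( (addr & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Succeed if the two CIDRs share at least one address.
cidr_overlap() {
    local s1 e1 s2 e2
    read -r s1 e1 <<< "$(cidr_range "$1")"
    read -r s2 e2 <<< "$(cidr_range "$2")"
    (( s1 <= e2 && s2 <= e1 ))
}

pod=172.168.0.0/12
svc=10.96.0.0/12
nodes=192.168.142.0/24
for pair in "$pod $svc" "$pod $nodes" "$svc $nodes"; do
    set -- $pair
    if cidr_overlap "$1" "$2"; then
        echo "OVERLAP: $1 and $2"
    else
        echo "ok: $1 and $2"
    fi
done
```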
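2. Migrate the kubeadm config file to the current format
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
3. Check the kubeadm version and change kubernetesVersion: v1.20.0 in the config file to the version reported
kubeadm version
4. Copy new.yaml to the other master nodes, then pre-pull the images on all master nodes
kubeadm config images pull --config /root/new.yaml
5. Initialize on k8s-master01. This generates the certificates, the config files and the token that other nodes use to join the cluster
kubeadm init --config /root/new.yaml --upload-certs
6. If initialization fails, reset and initialize again
kubeadm reset -f ; ipvsadm --clear ; rm -rf ~/.kube
7. On k8s-master01, set the environment variable used to access the Kubernetes cluster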
cat <<EOF >> /root/.bashrc
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
source /root/.bashrc
8. Check node and component status
kubectl get nodes
kubectl get svc
kubectl get pods -n kube-system -o wide
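VII. High-Availability Masters and Token Expiry Handling
1. After the token expires, generate a new one
kubeadm token create --print-join-command
2. For joining a master, also generate a new --certificate-key
kubeadm init phase upload-certs --upload-certs
3. Join a node as a master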
kubeadm join 192.168.142.236:16443 --token 7t2weq.bjbawausm0jaxury --discovery-token-ca-cert-hash sha256:1823bba54f8204a28f8f1282929028f3e2c1e766ca28c9e15c9cdced62d553a8 --control-plane --certificate-key 32fcb75acf00413d25716e293dc3dfdca3d2a79972325f34ac585b61daa1b63d
4. Join a node as a worker node
kubeadm join 192.168.142.236:16443 --token 7t2weq.bjbawausm0jaxury --discovery-token-ca-cert-hash sha256:1823bba54f8204a28f8f1282929028f3e2c1e766ca28c9e15c9cdced62d553a8
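The join commands are long and easy to mangle when copied between terminals. As a rough sketch, the pieces can be validated and assembled by a script: a bootstrap token has the form of six characters, a dot, then sixteen characters (lowercase letters and digits), and the CA hash is `sha256:` followed by 64 hex digits. The function names are hypothetical. A worker join needs only the endpoint, token and hash, while the certificate key belongs to the control-plane form.

```shell
#!/usr/bin/env bash
valid_token() {
    local re='^[a-z0-9]{6}\.[a-z0-9]{16}$'
    [[ "$1" =~ $re ]]
}

valid_ca_hash() {
    local re='^sha256:[0-9a-f]{64}$'
    [[ "$1" =~ $re ]]
}

# Print a join command. Passing a certificate key as the 4th argument
# produces the control-plane form; omitting it produces the worker form.
join_command() {
    local endpoint="$1" token="$2" hash="$3" cert_key="${4:-}"
    valid_token "$token" && valid_ca_hash "$hash" || return 1
    local cmd="kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash $hash"
    [ -n "$cert_key" ] && cmd+=" --control-plane --certificate-key $cert_key"
    echo "$cmd"
}

hash=sha256:1823bba54f8204a28f8f1282929028f3e2c1e766ca28c9e15c9cdced62d553a8
join_command 192.168.142.236:16443 7t2weq.bjbawausm0jaxury "$hash"
```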
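5. List the nodes
kubectl get node
6. Regenerate the token and the certificate key
kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs
7. List the generated tokens
kubectl get secret -n kube-system
8. View the contents of a token
kubectl get secret -n kube-system bootstrap-token-rff9me -oyaml
9. The output includes the expiration time, base64-encoded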
expiration: MjAyMS0wNy0wOFQxNzo0MjoyMiswODowMA==
echo "MjAyMS0wNy0wOFQxNzo0MjoyMiswODowMA==" | base64 -d
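The decoded value is an ISO 8601 timestamp, so a script can compare it with the current time instead of leaving that to the reader. A minimal sketch, assuming GNU `date` (the CentOS 7 default) for the `-d` option; `token_expired` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Decode a base64 "expiration" field, print it, and succeed if it lies in the past.
token_expired() {
    local decoded expires now
    decoded=$(printf '%s' "$1" | base64 -d)
    expires=$(date -d "$decoded" +%s)
    now=$(date +%s)
    echo "expires: $decoded"
    (( expires <= now ))
}

if token_expired "MjAyMS0wNy0wOFQxNzo0MjoyMiswODowMA=="; then
    echo "token has expired, create a new one"
else
    echo "token is still valid"
fi
```

For the value shown above, the decoded timestamp is 2021-07-08T17:42:22+08:00.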
VIII. Kubeadm Node and Calico Configuration
The installation hit an error: two pods will not start, and I am still investigating
[root@master01 ~]# kubectl -n kube-system get pods
NAME                                       READY   STATUS        RESTARTS   AGE
calico-kube-controllers-5f6d4b864b-jt9zx   1/1     Running       0          5h43m
calico-node-4mdf8                          1/1     Running       3          5h43m
calico-node-drwvv                          0/1     Pending       0          5h43m   # this one has a problem, still investigating
calico-node-pqbks                          1/1     Running       0          5h43m
calico-node-qfqs9                          0/1     Pending       0          5h43m   # this one has a problem, still investigating
calico-node-w5sw6                          1/1     Running       0          5h43m
coredns-54d67798b7-gcswg                   1/1     Running       0          8h
coredns-54d67798b7-vpx4k                   1/1     Running       0          8h
etcd-k8s-master01                          1/1     Running       5          8h
etcd-master02                              1/1     Running       1          8h
etcd-master03                              1/1     Running       1          8h
kube-apiserver-k8s-master01                1/1     Running       5          8h
kube-apiserver-master02                    1/1     Running       1          8h
kube-apiserver-master03                    1/1     Running       1          8h
kube-controller-manager-k8s-master01       1/1     Running       6          8h
kube-controller-manager-master02           1/1     Running       1          8h
kube-controller-manager-master03           1/1     Running       1          8h
kube-proxy-fhvfk                           1/1     Running       0          5h5m
kube-proxy-j59mt                           1/1     Running       1          8h
kube-proxy-qvjj5                           1/1     Terminating   1          8h
kube-proxy-vrsjf                           1/1     Running       1          8h
kube-proxy-zp9vl                           1/1     Running       1          8h
kube-scheduler-k8s-master01                1/1     Running       6          8h
kube-scheduler-master02                    1/1     Running       1          8h
kube-scheduler-master03                    1/1     Running       1          8h
metrics-server-545b8b99c6-7l5rz            1/1     Running       0          5h20m
[root@master01 ~]#
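With this many pods, filtering the listing down to the ones that are not healthy makes the problem cases easier to spot. The sketch below does that with awk on the plain-text output; `not_running` is a hypothetical helper, and the sample rows are copied from the listing above.

```shell
#!/usr/bin/env bash
# From "kubectl get pods" output on stdin, print the name and status of every
# pod whose STATUS column is neither Running nor Completed.
not_running() {
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

not_running <<'EOF'
NAME                                       READY   STATUS        RESTARTS   AGE
calico-node-4mdf8                          1/1     Running       3          5h43m
calico-node-drwvv                          0/1     Pending       0          5h43m
kube-proxy-qvjj5                           1/1     Terminating   1          8h
kube-proxy-vrsjf                           1/1     Running       1          8h
EOF
```

On the cluster this would be `kubectl -n kube-system get pods | not_running`, followed by `kubectl -n kube-system describe pod <name>` for each pod listed, to read the events that explain why it is stuck.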
IX. Dashboard & Metrics Server Installation