Installing a multi-master k8s cluster with kubeasz

Reference: the official documentation, kubeasz release page

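Environment preparation

Test server configuration: four CentOS 7.6 machines (2 vCPU, 4 GB RAM each). They only need working network connectivity between them, and all of them must share the same username and password.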

IP               Hostname   Role
192.168.122.11   master1    etcd, master1, ansible management node
192.168.122.12   master2    etcd, master2
192.168.122.13   node1      etcd, node1
192.168.122.14   node2      node2

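Versions

There are two separate version concepts:

Installer version: the version of the installation tool written by the kubeasz author, set in the script by the variable export release=3.1.0

k8s cluster version: the officially released Kubernetes version, set in the script by the variable export k8s_ver=v1.21.0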
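
The install script applies the cluster version by rewriting the K8S_BIN_VER= line of the downloaded ezdown file with sed. A minimal sketch of that substitution on a throwaway file follows; the function name pin_k8s_version and the stand-in file contents are assumptions made for illustration, not part of kubeasz.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the K8S_BIN_VER= line of an ezdown-style file.
# It mirrors the sed call used in the install script; the name is an assumption.
pin_k8s_version() {
    local file="$1" version="$2"
    sed -ri "s+^(K8S_BIN_VER=).*$+\1${version}+g" "${file}"
}

# Demonstration on a temporary stand-in file, not the real ezdown.
demo_file="$(mktemp)"
printf 'OTHER_SETTING=unchanged\nK8S_BIN_VER=v1.20.2\n' > "${demo_file}"
pin_k8s_version "${demo_file}" "v1.21.0"
grep '^K8S_BIN_VER=' "${demo_file}"
```

Only the line that starts with K8S_BIN_VER= is touched, so the other settings in the file keep the defaults that ship with the chosen installer release.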

Installer version

[screenshot]

Cluster version

[screenshot]

Choosing a version

[screenshot]

Script

Recommendation: run the script on master1.

[root@master ~]# vim install

#!/bin/bash
# author: long   Notes: Original author "Boge", later modified by "dragon".
# description:  this shell script uses ansible to deploy K8S from binaries in a simple way

# Check the arguments
[ $# -ne 6 ] && echo -e "Usage: $0 rootpasswd netnum nethosts cri cni k8s-cluster-name\nExample: bash $0 bogedevops 10.0.1 201\ 202\ 203\ 204 [containerd|docker] [calico|flannel] test\n" && exit 11

# Variable definitions
export release=3.1.0    # 2.2.4, 2.2.3, 3.0.0, 3.2.0               # kubeasz installer version
export k8s_ver=v1.21.0  # v1.20.2, v1.19.7, v1.18.15, v1.17.17     # k8s cluster version installed by that installer
rootpasswd=$1
netnum=$2
nethosts=$3
cri=$4
cni=$5
clustername=$6
if ls -1v ./kubeasz*.tar.gz &>/dev/null;then software_packet="$(ls -1v ./kubeasz*.tar.gz )";else software_packet="";fi
pwd="/etc/kubeasz"

# Install the required packages on the deploy machine
if cat /etc/redhat-release &>/dev/null;then
    yum install git python3-pip sshpass -y && pip3 install --upgrade pip && pip install ipython && pip3 install ansible==2.6.18 netaddr==0.7.19 -i https://mirrors.aliyun.com/pypi/simple/
else
    apt install git python3-pip sshpass -y && pip3 install --upgrade pip && pip install ipython && pip3 install ansible==2.6.18 netaddr==0.7.19 -i https://mirrors.aliyun.com/pypi/simple/
fi
# Stop early if ansible did not install correctly
ansible --version >/dev/null || { echo "ansible is not available, aborting."; exit 1; }

# Set up passwordless ssh from the deploy machine to the other nodes
for host in `echo "${nethosts}"`
do
    echo "============ ${netnum}.${host} ===========";

    if [[ ${USER} == 'root' ]];then
        [ ! -f /${USER}/.ssh/id_rsa ] &&\
        ssh-keygen -t rsa -P '' -f /${USER}/.ssh/id_rsa
    else
        [ ! -f /home/${USER}/.ssh/id_rsa ] &&\
        ssh-keygen -t rsa -P '' -f /home/${USER}/.ssh/id_rsa
    fi
    sshpass -p ${rootpasswd} ssh-copy-id -o StrictHostKeyChecking=no ${USER}@${netnum}.${host}

    if cat /etc/redhat-release &>/dev/null;then
        ssh -o StrictHostKeyChecking=no ${USER}@${netnum}.${host} "yum update -y"
    else
        ssh -o StrictHostKeyChecking=no ${USER}@${netnum}.${host} "apt-get update && apt install python -y"
    fi
done


# Download the k8s binary installation scripts on the deploy machine

if [[ ${software_packet} == '' ]];then
    curl -C- -fLO --retry 3 https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
    sed -ri "s+^(K8S_BIN_VER=).*$+\1${k8s_ver}+g" ezdown
    chmod +x ./ezdown
    # Download with the helper script
    ./ezdown -D && ./ezdown -P
else
    tar xvf ${software_packet} -C /etc/
    chmod +x ${pwd}/{ezctl,ezdown}
fi

# Initialize a k8s cluster configuration named after the k8s-cluster-name argument

CLUSTER_NAME="$clustername"
${pwd}/ezctl new ${CLUSTER_NAME}
if [[ $? -ne 0 ]];then
    echo "cluster name [${CLUSTER_NAME}] already exists in ${pwd}/clusters/${CLUSTER_NAME}."
    exit 1
fi

if [[ ${software_packet} != '' ]];then
    # Set the parameter that enables offline installation
    sed -i 's/^INSTALL_SOURCE.*$/INSTALL_SOURCE: "offline"/g' ${pwd}/clusters/${CLUSTER_NAME}/config.yml
fi


# to check ansible service
ansible all -m ping

#---------------------------------------------------------------------------------------------------




# Modify the binary installation configuration: config.yml

sed -ri "s+^(CLUSTER_NAME:).*$+\1 \"${CLUSTER_NAME}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml

## Steps to keep k8s logs and container data on a separate disk (modeled on Alibaba Cloud's layout)

[ ! -d /var/lib/container ] && mkdir -p /var/lib/container/{kubelet,docker}

## cat /etc/fstab
# UUID=105fa8ff-bacd-491f-a6d0-f99865afc3d6 /                       ext4    defaults        1 1
# /dev/vdb /var/lib/container/ ext4 defaults 0 0
# /var/lib/container/kubelet /var/lib/kubelet none defaults,bind 0 0
# /var/lib/container/docker /var/lib/docker none defaults,bind 0 0

## tree -L 1 /var/lib/container
# /var/lib/container
# ├── docker
# ├── kubelet
# └── lost+found

# docker data dir
DOCKER_STORAGE_DIR="/var/lib/container/docker"
sed -ri "s+^(DOCKER_STORAGE_DIR:).*$+DOCKER_STORAGE_DIR: \"${DOCKER_STORAGE_DIR}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
# containerd data dir
CONTAINERD_STORAGE_DIR="/var/lib/container/containerd"
sed -ri "s+^(CONTAINERD_STORAGE_DIR:).*$+CONTAINERD_STORAGE_DIR: \"${CONTAINERD_STORAGE_DIR}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
# kubelet logs dir
KUBELET_ROOT_DIR="/var/lib/container/kubelet"
sed -ri "s+^(KUBELET_ROOT_DIR:).*$+KUBELET_ROOT_DIR: \"${KUBELET_ROOT_DIR}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
if [[ $clustername != 'aws' ]]; then
    # docker aliyun repo
    REG_MIRRORS="https://pqbap4ya.mirror.aliyuncs.com"
    sed -ri "s+^REG_MIRRORS:.*$+REG_MIRRORS: \'[\"${REG_MIRRORS}\"]\'+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
fi
# [docker] trusted HTTP registries
sed -ri "s+127.0.0.1/8+${netnum}.0/24+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
## disable dashboard auto install
#sed -ri "s+^(dashboard_install:).*$+\1 \"no\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml

# Prepare the combined configuration: add the cluster domain to MASTER_CERT_HOSTS
CLUSTER_WEBSITE="${CLUSTER_NAME}k8s.gtapp.xyz"
lb_num=$(grep -wn '^MASTER_CERT_HOSTS:' ${pwd}/clusters/${CLUSTER_NAME}/config.yml |awk -F: '{print $1}')
lb_num1=$(expr ${lb_num} + 1)
lb_num2=$(expr ${lb_num} + 2)
sed -ri "${lb_num1}s+.*$+  - \"${CLUSTER_WEBSITE}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml
sed -ri "${lb_num2}s+(.*)$+#\1+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml

# Maximum number of pods per node
MAX_PODS="120"
sed -ri "s+^(MAX_PODS:).*$+\1 ${MAX_PODS}+g" ${pwd}/clusters/${CLUSTER_NAME}/config.yml



# Modify the binary installation configuration: hosts
# clean old ip
sed -ri '/192.168.1.1/d' ${pwd}/clusters/${CLUSTER_NAME}/hosts
sed -ri '/192.168.1.2/d' ${pwd}/clusters/${CLUSTER_NAME}/hosts
sed -ri '/192.168.1.3/d' ${pwd}/clusters/${CLUSTER_NAME}/hosts
sed -ri '/192.168.1.4/d' ${pwd}/clusters/${CLUSTER_NAME}/hosts

# Enter the host parts of the machines that will form the etcd cluster
echo "enter etcd hosts here (example: 203 202 201) ↓"
read -p "" ipnums
for ipnum in `echo ${ipnums}`
do
    echo $netnum.$ipnum
    sed -i "/\[etcd/a $netnum.$ipnum"  ${pwd}/clusters/${CLUSTER_NAME}/hosts
done

# Enter the host parts of the machines that will form the kube-master group
echo "enter kube-master hosts here (example: 202 201) ↓"
read -p "" ipnums
for ipnum in `echo ${ipnums}`
do
    echo $netnum.$ipnum
    sed -i "/\[kube_master/a $netnum.$ipnum"  ${pwd}/clusters/${CLUSTER_NAME}/hosts
done

# Enter the host parts of the machines that will form the kube-node group
echo "enter kube-node hosts here (example: 204 203) ↓"
read -p "" ipnums
for ipnum in `echo ${ipnums}`
do
    echo $netnum.$ipnum
    sed -i "/\[kube_node/a $netnum.$ipnum"  ${pwd}/clusters/${CLUSTER_NAME}/hosts
done

# Configure the network plugin (CNI)
case ${cni} in
    flannel)
    sed -ri "s+^CLUSTER_NETWORK=.*$+CLUSTER_NETWORK=\"${cni}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/hosts
    ;;
    calico)
    sed -ri "s+^CLUSTER_NETWORK=.*$+CLUSTER_NETWORK=\"${cni}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/hosts
    ;;
    *)
    echo "cni need be flannel or calico."
    exit 11
esac

# Configure a cron job that backs up the k8s etcd data
if cat /etc/redhat-release &>/dev/null;then
    if ! grep -w '94.backup.yml' /var/spool/cron/root &>/dev/null;then echo "00 00 * * * `which ansible-playbook` ${pwd}/playbooks/94.backup.yml &> /dev/null" >> /var/spool/cron/root;else echo exists ;fi
    chown root.crontab /var/spool/cron/root
    chmod 600 /var/spool/cron/root
else
    if ! grep -w '94.backup.yml' /var/spool/cron/crontabs/root &>/dev/null;then echo "00 00 * * * `which ansible-playbook` ${pwd}/playbooks/94.backup.yml &> /dev/null" >> /var/spool/cron/crontabs/root;else echo exists ;fi
    chown root.crontab /var/spool/cron/crontabs/root
    chmod 600 /var/spool/cron/crontabs/root
fi
rm -f /var/run/cron.reboot
service crond restart || service cron restart   # the service is named cron on Debian/Ubuntu




#---------------------------------------------------------------------------------------------------
# Ready to start the installation
rm -rf ${pwd}/{dockerfiles,docs,.gitignore,pics} &&\
find ${pwd}/ -name '*.md'|xargs rm -f
read -p "Enter to continue deploy k8s to all nodes >>>" YesNobbb

# now start deploy k8s cluster
cd ${pwd}/

# to prepare CA/certs & kubeconfig & other system settings
${pwd}/ezctl setup ${CLUSTER_NAME} 01
sleep 1
# to setup the etcd cluster
${pwd}/ezctl setup ${CLUSTER_NAME} 02
sleep 1
# to setup the container runtime(docker or containerd)
case ${cri} in
    containerd)
    sed -ri "s+^CONTAINER_RUNTIME=.*$+CONTAINER_RUNTIME=\"${cri}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/hosts
    ${pwd}/ezctl setup ${CLUSTER_NAME} 03
    ;;
    docker)
    sed -ri "s+^CONTAINER_RUNTIME=.*$+CONTAINER_RUNTIME=\"${cri}\"+g" ${pwd}/clusters/${CLUSTER_NAME}/hosts
    ${pwd}/ezctl setup ${CLUSTER_NAME} 03
    ;;
    *)
    echo "cri need be containerd or docker."
    exit 11
esac
sleep 1
# to setup the master nodes
${pwd}/ezctl setup ${CLUSTER_NAME} 04
sleep 1
# to setup the worker nodes
${pwd}/ezctl setup ${CLUSTER_NAME} 05
sleep 1
# to setup the network plugin(flannel、calico...)
${pwd}/ezctl setup ${CLUSTER_NAME} 06
sleep 1
# to setup other useful plugins(metrics-server、coredns...)
${pwd}/ezctl setup ${CLUSTER_NAME} 07
sleep 1
# [Optional] OS-level security hardening for all cluster nodes  https://github.com/dev-sec/ansible-os-hardening
#ansible-playbook roles/os-harden/os-harden.yml
#sleep 1
cd `dirname ${software_packet:-/tmp}`


k8s_bin_path='/opt/kube/bin'


echo "-------------------------  k8s version list  ---------------------------"
${k8s_bin_path}/kubectl version
echo
echo "-------------------------  All Healthy status check  -------------------"
${k8s_bin_path}/kubectl get componentstatus
echo
echo "-------------------------  k8s cluster info list  ----------------------"
${k8s_bin_path}/kubectl cluster-info
echo
echo "-------------------------  k8s all nodes list  -------------------------"
${k8s_bin_path}/kubectl get node -o wide
echo
echo "-------------------------  k8s all-namespaces's pods list   ------------"
${k8s_bin_path}/kubectl get pod --all-namespaces
echo
echo "-------------------------  k8s all-namespaces's service network   ------"
${k8s_bin_path}/kubectl get svc --all-namespaces
echo
echo "-------------------------  k8s welcome for you   -----------------------"
echo

# use k as a short alias for kubectl
echo "alias k=kubectl && complete -F __start_kubectl k" >> ~/.bashrc

# get dashboard url
${k8s_bin_path}/kubectl cluster-info|grep dashboard|awk '{print $NF}'|tee -a /root/k8s_results

# get login token
${k8s_bin_path}/kubectl -n kube-system describe secret $(${k8s_bin_path}/kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')|grep 'token:'|awk '{print $NF}'|tee -a /root/k8s_results
echo
echo "you can review the dashboard URL and token again in  >>> /root/k8s_results <<<"
#echo ">>>>>>>>>>>>>>>>> You can execute command [ source ~/.bashrc ] <<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>> You need to execute command [ reboot ] to restart all nodes <<<<<<<<<<<<<<<<<<<<"
rm -f $0
if [ -f "${software_packet}" ];then rm -f "${software_packet}";fi
#rm -f ${pwd}/roles/deploy/templates/${USER_NAME}-csr.json.j2
#sed -ri "s+${USER_NAME}+admin+g" ${pwd}/roles/prepare/tasks/main.yml
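
The script above validates cni only after the hosts have been contacted, and cri only after the etcd step has already run. A minimal sketch of checking both values up front is shown here; the function name validate_choices is hypothetical, and the error messages and exit code 11 are copied from the script.

```shell
#!/bin/bash
# Hypothetical pre-flight check: validate cri and cni before touching any host.
# Returns 0 when both values are supported, 11 otherwise (same code the script uses).
validate_choices() {
    local cri="$1" cni="$2"
    case "${cri}" in
        containerd|docker) ;;
        *) echo "cri need be containerd or docker." >&2; return 11 ;;
    esac
    case "${cni}" in
        calico|flannel) ;;
        *) echo "cni need be flannel or calico." >&2; return 11 ;;
    esac
}
```

One possible placement is right after the argument-count check, as validate_choices "$4" "$5" || exit 11, so a typo in either value stops the run before any package is installed.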

Run Script

Invocation syntax (script name + server password + IP network part + IP host parts + container runtime + CNI + cluster name):

[root@master ~]# bash install 123456 10.1.1 11\ 12\ 13\ 14\ 15 docker calico long

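    Script name: any name you like
    Server password: the password shared by all servers
    IP network part: as in the example above (the first three octets)
    IP host parts: as in the example above (each host number followed by "\ ", and so on)
    Container runtime: [containerd|docker]
    CNI: [calico|flannel]
    Cluster name: named after the business or project (any name)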
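
The backslash-escaped spaces make the whole host list arrive in the script as a single argument, which is then split and joined with the network part. A small sketch of that expansion, assuming a hypothetical helper named expand_hosts:

```shell
#!/bin/bash
# Expand a network part and a space-separated list of host parts into full IPs,
# the same way the install script builds ${netnum}.${host} in its loops.
expand_hosts() {
    local netnum="$1" nethosts="$2" host
    for host in ${nethosts}; do      # unquoted on purpose: split on spaces
        echo "${netnum}.${host}"
    done
}

# Same values as the example invocation above.
expand_hosts "10.1.1" "11 12 13 14 15"
```

Quoting the list ("11 12 13 14 15") on the command line has the same effect as escaping each space.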
    
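# The script is almost fully automated. At the prompts below, paste the requested values and press Enter.

# Host parts for the etcd cluster: paste your values (e.g. 203 202 201) and press Enter
echo "enter etcd hosts here (example: 203 202 201) ↓"

# Host parts for the kube-master group: paste your values (e.g. 202 201) and press Enter
echo "enter kube-master hosts here (example: 202 201) ↓"

# Host parts for the kube-node group: paste your values (e.g. 204 203) and press Enter
echo "enter kube-node hosts here (example: 204 203) ↓"

# The script then asks whether to continue. If everything looks right, just press Enter
Enter to continue deploy k8s to all nodes >>>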
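
The prompt examples list the hosts in descending order. That appears to be because the script adds each host with sed's "a" command, which inserts the new line directly below the section header, so the host entered last ends up first. A self-contained sketch on a temporary stand-in for the hosts file (the file contents here are illustrative only):

```shell
#!/bin/bash
# Reproduce the script's append loop on a throwaway file to show the resulting order.
hosts_file="$(mktemp)"            # stand-in for clusters/<name>/hosts
printf '[etcd]\n\n[kube_master]\n' > "${hosts_file}"

netnum="10.0.1"
for ipnum in 203 202 201; do
    # Each call inserts the new IP on the line right after "[etcd]".
    sed -i "/\[etcd/a ${netnum}.${ipnum}" "${hosts_file}"
done

cat "${hosts_file}"
```

After the loop, the [etcd] section lists 10.0.1.201, 10.0.1.202 and 10.0.1.203 in ascending order.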

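# After installation, reload the shell environment to enable kubectl command completion
[root@master ~]# . ~/.bashrc 

# Reboot after installation
[root@master ~]# reboot         # reboot every node in the kubernetes cluster, not only the master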

Verifying the cluster

[root@master ~]# kubectl get nodes
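
When checking many nodes, it can help to count the ones that are not Ready instead of reading the list by eye. A minimal sketch, assuming a hypothetical helper count_not_ready that reads kubectl get nodes output on stdin; the sample input below is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read `kubectl get nodes` output on stdin and print how many
# nodes have a STATUS that does not start with "Ready" (the header line is skipped).
count_not_ready() {
    awk 'NR > 1 && $2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Illustrative sample input; on a live cluster use:
#   kubectl get nodes | count_not_ready
count_not_ready <<'EOF'
NAME             STATUS                     ROLES    AGE   VERSION
192.168.122.11   Ready,SchedulingDisabled   master   10m   v1.21.0
192.168.122.13   NotReady                   node     10m   v1.21.0
EOF
```

A master shown as Ready,SchedulingDisabled counts as healthy here, since that status only means workload pods are kept off it.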

Logging in to the dashboard

Look up the NodePort (24949 in my case)

[root@master1 ansible]# kubectl get svc -n kube-system|grep kubernetes-dashboard
kubernetes-dashboard        NodePort    10.68.242.33   <none>        443:24949/TCP            38m

Open the dashboard in a browser (use the master's IP, the port found above, and the https protocol)
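
The URL can be assembled from the service line shown above. A minimal sketch, assuming a hypothetical helper dashboard_url and using master1's address from the environment table:

```shell
#!/bin/bash
# Hypothetical helper: pull the NodePort out of one `kubectl get svc` line
# (the 5th column looks like 443:24949/TCP) and print the dashboard URL.
dashboard_url() {
    local master_ip="$1" svc_line="$2" port
    port="$(echo "${svc_line}" | awk '{ split($5, p, "[:/]"); print p[2] }')"
    echo "https://${master_ip}:${port}"
}

dashboard_url "192.168.122.11" \
    "kubernetes-dashboard NodePort 10.68.242.33 <none> 443:24949/TCP 38m"
```

On a live cluster the port can also be read directly with kubectl -n kube-system get svc kubernetes-dashboard -o jsonpath='{.spec.ports[0].nodePort}'.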


Get the access token

[root@master1 ~]# kubectl -n kube-system describe secret  admin-user
Name:         admin-user-token-2dfjt
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin-user
              kubernetes.io/service-account.uid: 245ac66a-eac5-4f2d-9f22-aeec8ddb84ac

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1350 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6Ik1OX2NRbHNQZnZJUTBHYW9faThHck1qdkM3M0VITHBuV0NCZ3ZtZXZ1NU0ifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLTJkZmp0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIyNDVhYzY2YS1lYWM1LTRmMmQtOWYyMi1hZWVjOGRkYjg0YWMiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.LNKODuXzfcgvYGT5lRvs43kLE28fboTqtLv-0SBMR0GAJbw2Fr1BL85kSe2LsCgwuYJO73dMJSunv7lgrme2iUSIqnly81NJS5STO_TLI-JSdgqEeeb4peQiipmUS57cpk2x8tlD3SOGlGp0ccf13wDT4GqKTpg3GoO_NzfajTTD6vUW-pcPJdGRZli8OgXh5Zg2ubG5OpAXbHWXs0RB1chaIroNCcLj6tofTgD7G-PX44HL9zENDQp5Z4l-ZQWBY-qxpZX9DDR162mqUnes_DtOIPDzfmfnp5BSwFGymLrYKH7VIr0d2C2bjqKwTLurFONW6KnXejvL7-vPIbexPg

Login succeeded

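Adding master and node nodes

Master, then Node

The rough procedure for adding a kube-master node (tools/03.addmaster.yml):
[Optional] Install chrony time synchronization on the new node
Pre-process the new node (prepare)
Install the docker service on the new node
Install the kube-master services on the new node
Install the kube-node services on the new node
Install the network plugin components on the new node
Prevent workload pods from being scheduled onto the new master node
Update the haproxy load balancer on the node machines and restart it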

Adding a master node

Run the following on the master node that has ansible installed:
# ssh-copy-id <hostname>
# vim /etc/ansible/hosts             # add the host entry under the "[kube-master]" section
# easzctl add-master 192.168.2.15    # add a master; del-master removes one

Check the cluster:
[root@master1 ansible]# kubectl get nodes | grep 192.168.2.15 
NAME           STATUS                     ROLES    AGE   VERSION
192.168.2.15   Ready,SchedulingDisabled   master   76m   v1.20.2
The node has joined the cluster.
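
The easzctl command belongs to older kubeasz releases. With the 3.x installer used by the script earlier, the tool is ezctl, and as far as I know it takes the cluster name before the IP. The sketch below wraps that call; the install path /etc/kubeasz comes from the script, while the function name add_master, the DRY_RUN switch and the cluster name "long" are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: add a master with the kubeasz 3.x CLI, assuming the syntax
# `ezctl add-master <cluster> <ip>` and an installation under /etc/kubeasz.
# DRY_RUN=1 only prints the command instead of running it.
EZCTL="${EZCTL:-/etc/kubeasz/ezctl}"

add_master() {
    local cluster="$1" ip="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "${EZCTL} add-master ${cluster} ${ip}"
    else
        "${EZCTL}" add-master "${cluster}" "${ip}"
    fi
}

# Print the command that would be run for the node used in this section.
DRY_RUN=1 add_master long 192.168.2.15
```

Running ezctl without arguments prints its usage, which is the safest way to confirm the syntax for the installed release.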

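Removing a master node

The rough procedure for removing a kube_master node (see the del-master function in ezctl and playbooks/33.delmaster.yml):

Outline:

Check whether the node can be removed
Migrate the node's pods
Remove the master-related services and files
Remove the node-related services and files
Delete the node from the cluster
Remove the node from the ansible hosts file
Update kubeconfig on the ansible control machine
Update the haproxy configuration on the node machines

Steps:
# ezctl del-master  192.168.1.15  # assuming the node to remove is 192.168.1.15
Verify (for example with kubectl get nodes)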

Adding a node

Run the following on the master node that has ansible installed:
# ssh-copy-id <hostname>
# vim /etc/ansible/hosts             # add the host entry under the "[kube-node]" section
# easzctl add-node 192.168.2.16     # add a node; del-node removes one

Check the cluster:
[root@master1 ansible]# kubectl get nodes | grep 192.168.2.16 
NAME           STATUS                     ROLES    AGE   VERSION
192.168.2.16   Ready                      node     76m   v1.20.2
The node has joined the cluster.

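Removing a node

The procedure for removing a node (see the del-node function in ezctl and playbooks/32.delnode.yml):

Outline:

Check whether the node can be removed
Migrate the pods running on the node
Remove the node-related services and files
Delete the node from the cluster

Steps:
# easzctl del-node 192.168.1.16   # assuming the node to remove is 192.168.1.16
Verify (for example with kubectl get nodes)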

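Upgrading the cluster

Upstream guide: https://github.com/easzlab/kubeasz/blob/master/docs/op/upgrade.md
Upgrade steps (not tested by the author):
1) Back up etcd
ETCDCTL_API=3 etcdctl snapshot save backup.db
Inspect the backup file
ETCDCTL_API=3 etcdctl --write-out=table snapshot status backup.db
2) Go to the kubeasz project root
 cd  /dir/to/kubeasz
Pull the latest code
git pull origin master
3) Download the kubernetes binaries for the target version and replace the binaries under /etc/ansible/bin/
4) Upgrade docker (omitted). Unless there is a specific need, frequent docker upgrades are not recommended.
5) If a service interruption is acceptable, run:
ansible-playbook -t upgrade_k8s,restart_dockerd 22.upgrade.yml
6) If even a brief interruption is not acceptable, do this instead:
  a) ansible-playbook -t upgrade_k8s 22.upgrade.yml 
  b) On every node, one at a time:
kubectl cordon and kubectl drain    # move workload pods off the node
systemctl restart docker
kubectl uncordon                    # make the node schedulable again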
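
Step 6b can be sketched as a loop that handles one node at a time. This is an untested outline under stated assumptions: passwordless ssh to every node, node names equal to their IPs (as in the kubectl get nodes output earlier), and a DRY_RUN switch added here for illustration. The helper names run and rolling_restart are hypothetical.

```shell
#!/bin/bash
# Sketch of step 6b: restart docker on one node at a time.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rolling_restart() {
    local node
    for node in "$@"; do
        # Stop new pods from being scheduled onto the node.
        run kubectl cordon "${node}" || return 1
        # Evict workload pods; daemonset pods stay in place.
        run kubectl drain "${node}" --ignore-daemonsets || return 1
        run ssh "${node}" systemctl restart docker || return 1
        # Allow scheduling on the node again.
        run kubectl uncordon "${node}" || return 1
    done
}

# Print the plan for the two worker nodes from the environment table.
DRY_RUN=1 rolling_restart 192.168.122.13 192.168.122.14
```

The loop stops at the first failing command, so a node that cannot be drained is left cordoned for manual inspection rather than having docker restarted underneath its pods.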

Backup and restore

Reference link