CentOS 7 k8s host setup: system settings, Docker, and cri-dockerd

Reposted

mob64ca140b0bc8 2024-10-30 10:22:54


1. Basic Linux server configuration

1.1 Set the hostname (optional)

hostnamectl set-hostname k8s-master   # any name you like

1.2 Configure the Aliyun mirror repo (optional)

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

1.3 Install some tools (optional)

yum -y update
yum -y install vim tree wget net-tools yum-utils git

1.4 Add a hosts entry for name resolution (optional)

cat >> /etc/hosts <<EOF
124.223.189.139 k8s.minikube.com k8s-minikube
EOF
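
Because `cat >>` appends, running step 1.4 twice leaves two identical lines in /etc/hosts. Below is a minimal sketch of an append-if-missing helper. The name `add_host_entry` is made up for illustration, and the file path is a parameter so the helper can be tried on a scratch file first.

```shell
# add_host_entry FILE LINE
# Appends LINE to FILE only when FILE does not already contain that exact line.
add_host_entry() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example on a scratch file rather than the real /etc/hosts
scratch=$(mktemp)
add_host_entry "$scratch" "124.223.189.139 k8s.minikube.com k8s-minikube"
add_host_entry "$scratch" "124.223.189.139 k8s.minikube.com k8s-minikube"
cat "$scratch"    # the entry appears once
rm -f "$scratch"
```

On the host the call would be `add_host_entry /etc/hosts "124.223.189.139 k8s.minikube.com k8s-minikube"`.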

1.5 Adjust some system settings

1. Disable the firewall
systemctl disable --now firewalld

2. Disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

3. Disable swap
sed -ri 's/.*swap.*/#&/' /etc/fstab
swapoff -a
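
The `sed` in step 3 prefixes `#` to every line that mentions swap, including lines that are already commented out, so repeated runs stack up `##`. The sketch below is one way to make the edit idempotent. `comment_swap` is a hypothetical helper name, and it assumes the usual whitespace-separated fstab layout with `swap` as its own field and GNU sed (as shipped on CentOS).

```shell
# comment_swap FILE: comment out active swap entries in an fstab-style file.
comment_swap() {
    # Skip lines that already start with '#'; on the rest, prefix '#'
    # only when "swap" appears as a whitespace-delimited field.
    sed -ri '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# On the host (not run here):
#   cp /etc/fstab /etc/fstab.bak
#   comment_swap /etc/fstab && swapoff -a
```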

1.6 Upgrade the system kernel (CentOS 7)

1. Import the ELRepo repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
2. List the installable kernel packages (optional)
yum --enablerepo="elrepo-kernel" list --showduplicates | sort -r | grep kernel-ml.x86_64
3. Install either the ML (mainline) or the LT (long-term) kernel
# Install the ML version
yum --enablerepo=elrepo-kernel install kernel-ml-devel kernel-ml -y
# Install the LT version; use this one on every K8s node
yum --enablerepo=elrepo-kernel install kernel-lt-devel kernel-lt -y
4. Show the current kernel boot order
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
5. Change the default boot entry
Note: xxx is the index of the entry in the boot list, counting from 0.
grub2-set-default xxx
6. Example: boot the 4.4 kernel by default
# List the boot entries
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.179-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-6d4c599606814867814f1a8eec7bfd1e) 7 (Core)
# Set the default entry (index 0 is the first one, the 4.4 kernel)
grub2-set-default 0
7. Reboot
reboot
8. Check the kernel version
uname -r
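
Counting menu entries by eye is error-prone once several kernels are installed. As a sketch, the same awk match can print the index next to each title, and a small wrapper can look an index up by a version substring. `list_grub_entries` and `grub_index` are illustrative names, and the config path is a parameter so both can be tried on a copy.

```shell
# list_grub_entries FILE: print "index : title" for every menuentry.
list_grub_entries() {
    awk -F"'" '$1 == "menuentry " { printf "%d : %s\n", n++, $2 }' "$1"
}

# grub_index PATTERN FILE: print the index of the first entry
# whose title contains PATTERN (plain substring, not a regex).
grub_index() {
    awk -F"'" -v pat="$1" '
        $1 == "menuentry " {
            if (index($2, pat)) { print n + 0; exit }
            n++
        }' "$2"
}

# On the host (not run here):
#   list_grub_entries /etc/grub2.cfg
#   grub2-set-default "$(grub_index 4.4.179 /etc/grub2.cfg)"
```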

1.7 Install the ipvs tools and tune the kernel (for k8s clusters)

yum -y install ipvsadm ipset sysstat conntrack libseccomp

1. From kernel 4.19 on, nf_conntrack_ipv4 has been replaced by nf_conntrack; on 4.18 and earlier, use nf_conntrack_ipv4.
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
# On kernels older than 4.19, change the next line to nf_conntrack_ipv4
nf_conntrack
# br_netfilter provides the net.bridge.* sysctls set in the next step
br_netfilter
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF

Then run systemctl enable --now systemd-modules-load.service to load the modules.
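
The choice between nf_conntrack and nf_conntrack_ipv4 depends only on the running kernel version, so it can be scripted rather than edited by hand. A minimal sketch using the 4.19 cutoff from step 1; `conntrack_module` is a hypothetical helper name.

```shell
# conntrack_module KERNEL_RELEASE
# Prints nf_conntrack for kernels 4.19 and newer, nf_conntrack_ipv4 otherwise.
conntrack_module() {
    release=$1
    major=${release%%.*}      # "5" from "5.4.217-1.el7.elrepo.x86_64"
    rest=${release#*.}        # "4.217-1.el7.elrepo.x86_64"
    minor=${rest%%[!0-9]*}    # leading digits only: "4"
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 19 ]; }; then
        echo nf_conntrack
    else
        echo nf_conntrack_ipv4
    fi
}

# Module name for the kernel that is currently running
conntrack_module "$(uname -r)"
```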

2. Enable the kernel parameters a k8s cluster needs; apply this on both master and worker nodes
cat > /etc/sysctl.d/k8s.conf <<EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720

net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF

Run: sysctl --system
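
When a sysctl file assigns the same key twice, the entries are applied in order and the last one silently wins, which is easy to miss in a long list. A quick sketch for spotting repeated keys before applying the file; `sysctl_dupes` is a hypothetical helper name.

```shell
# sysctl_dupes FILE: print every key that is assigned more than once.
sysctl_dupes() {
    awk -F= '
        /^[[:space:]]*(#|;|$)/ { next }       # skip comments and blank lines
        {
            key = $1
            gsub(/[[:space:]]/, "", key)      # "a.b = 1" and "a.b=1" are the same key
            if (++seen[key] == 2) print key   # report each repeated key once
        }' "$1"
}

# On the host (not run here):
#   sysctl_dupes /etc/sysctl.d/k8s.conf
```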

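Common Kubernetes kernel tuning parameters explained

net.ipv4.ip_forward = 1 # 0 disables IP forwarding; 1 enables it.
net.bridge.bridge-nf-call-iptables = 1 # Packets forwarded by a layer-2 bridge are also passed through the iptables FORWARD rules, which means L3 iptables rules can end up filtering L2 frames.
net.bridge.bridge-nf-call-ip6tables = 1 # Whether bridged IPv6 packets are passed through the ip6tables chains.
fs.may_detach_mounts = 1 # Should be set to 1 when containers run on the system.

vm.overcommit_memory=1
#0: the kernel checks whether enough memory is available for the request; if so the allocation succeeds, otherwise it fails and an error is returned to the process.
#1: the kernel always grants the allocation, regardless of the current memory state.
#2: the kernel does not overcommit: allocations beyond swap plus a configurable share of physical RAM (vm.overcommit_ratio) are refused.

vm.panic_on_oom=0
#OOM stands for out of memory: memory is exhausted and an allocation cannot be satisfied. This parameter decides how the kernel reacts.
#0: run the OOM killer when memory runs out.
#1: usually trigger a kernel panic, but run the OOM killer instead when the shortage is confined to a cpuset or memory policy.
#2: always force a kernel panic (system crash).

fs.inotify.max_user_watches=89100 # Maximum number of inotify watches one user can hold at the same time (watches are usually placed on directories, so this caps how many directories one user can monitor).

fs.file-max=52706963 # System-wide maximum number of open file handles
fs.nr_open=52706963 # Maximum number of file descriptors a single process can allocate
net.netfilter.nf_conntrack_max=2310720 # Size of the connection-tracking table. Derive it from memory: CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (x / 32), where x is the pointer width in bits (32 or 64), and keep nf_conntrack_max = 4 * nf_conntrack_buckets. The default is 262144.

net.ipv4.tcp_keepalive_time = 600 # Idle time before keepalive probing starts, i.e. the normal heartbeat period. Default 7200s (2 hours).
net.ipv4.tcp_keepalive_probes = 3 # Number of further probes sent when no acknowledgement arrives after tcp_keepalive_time. Default 9.
net.ipv4.tcp_keepalive_intvl = 15 # Interval between keepalive probes. Default 75s.
net.ipv4.tcp_max_tw_buckets = 36000 # Intermediate proxies such as Nginx should watch this value because it protects the system: once every port is in use, the service fails. tcp_max_tw_buckets lowers the chance of that happening and buys time to react.
net.ipv4.tcp_tw_reuse = 1 # Only affects the client side (outgoing connections); when enabled, the client can reuse TIME_WAIT sockets after 1s. It depends on TCP timestamps.
net.ipv4.tcp_max_orphans = 327680 # Maximum number of sockets the system handles that do not belong to any process. Worth watching when many connections are opened in a short time.

net.ipv4.tcp_orphan_retries = 3
# Relevant when many connections pile up in FIN-WAIT-1.
# A FIN can be lost after it is sent, so how often is it retransmitted? FIN retransmission uses backoff; in the 2.6.358 kernel it is exponential (2s, 4s, ...), and the final number of retries is capped by tcp_orphan_retries.

net.ipv4.tcp_syncookies = 1 # Switch for the SYN cookie feature, which protects against some SYN flood attacks. tcp_synack_retries and tcp_syn_retries set the number of SYN retries.
net.ipv4.tcp_max_syn_backlog = 16384 # Maximum length of the queue of incoming SYN requests, i.e. the upper limit on half-open connections. Default 1024; raising it helps heavily loaded servers.
net.ipv4.ip_conntrack_max = 65536 # Limits the maximum number of tracked TCP connections. Default 65536.
net.ipv4.tcp_timestamps = 0 # With iptables doing NAT, machines on the internal network could ping a domain but curl to it failed; the cause was net.ipv4.tcp_timestamps being set to 1 (timestamps enabled).
net.core.somaxconn = 16384 # Kernel parameter that caps the backlog of a listening socket. The backlog is the socket's listen queue: a request that has not yet been handled or established waits there, and the server takes requests off the queue as it processes them. When the server is slow and the queue fills up, new requests are rejected.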
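
The sizing rule quoted for net.netfilter.nf_conntrack_max is easy to evaluate in the shell. A small sketch, assuming `x` in the formula is the pointer width in bits (32 or 64); `conntrack_max` is an illustrative name and both inputs are passed as arguments.

```shell
# conntrack_max RAM_BYTES ARCH_BITS
# CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (ARCH_BITS / 32)
conntrack_max() {
    echo $(( $1 / 16384 / ($2 / 32) ))
}

# 8 GiB of RAM on a 64-bit host
conntrack_max $((8 * 1024 * 1024 * 1024)) 64    # prints 262144
```

For 8 GiB on a 64-bit machine the result equals the 262144 default mentioned in the parameter notes; on the host, the RAM figure could be taken from the MemTotal line of /proc/meminfo (in kB, so multiply by 1024).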

Reboot, then check that the kernel modules are loaded

reboot
lsmod | grep --color=auto -e ip_vs -e nf_conntrack

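2. Install Docker

Note: install it only if Docker is the container runtime you chose.

2.1 Remove old packages

Skip this step on a fresh machine that has never had Docker installed.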

sudo yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine

2.2 Install Docker Engine

sudo yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo

# Option A: install the latest version
sudo yum install docker-ce docker-ce-cli containerd.io docker-compose-plugin
# Option B: install a specific version
# List the available versions (there are many)
yum list docker-ce --showduplicates | sort -r
# Replace <VERSION_STRING> with the version number, for example 20.10.9
sudo yum install docker-ce-<VERSION_STRING> docker-ce-cli-<VERSION_STRING> containerd.io docker-compose-plugin
sudo yum install docker-ce-20.10.9 docker-ce-cli-20.10.9 containerd.io docker-compose-plugin
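
The `<VERSION_STRING>` value comes from the second column of the `yum list` output: the part after the colon and before the first hyphen. A sketch of extracting it, assuming the three-column layout that `yum list docker-ce --showduplicates` normally prints. The sample line is illustrative and `docker_versions` is a made-up helper name.

```shell
# docker_versions: read `yum list docker-ce --showduplicates` output on stdin
# and print one installable version string per line.
docker_versions() {
    awk '$1 ~ /^docker-ce\./ {
        v = $2
        sub(/^[0-9]+:/, "", v)   # drop the epoch prefix, e.g. "3:"
        sub(/-.*/, "", v)        # drop the release suffix, e.g. "-3.el7"
        print v
    }'
}

# Illustrative sample line in the shape yum prints
printf '%s\n' 'docker-ce.x86_64    3:20.10.9-3.el7    docker-ce-stable' | docker_versions

# On the host (not run here):
#   yum list docker-ce --showduplicates | docker_versions | sort -rV | head
```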

2.3 Start Docker

sudo systemctl start docker
# Start automatically at boot
sudo systemctl enable docker

2.4 Check the service status

sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-10-13 14:03:49 CST; 10s ago
     Docs: https://docs.docker.com
 Main PID: 8755 (dockerd)
    Tasks: 9
   Memory: 31.0M
   CGroup: /system.slice/docker.service
           └─8755 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

Success indicator: active (running)

2.5 Edit the Docker configuration file

cat > /etc/docker/daemon.json <<-EOF
{
    "registry-mirrors": [
        "https://registry.docker-cn.com",
        "http://hub-mirror.c.163.com",
        "https://docker.mirrors.ustc.edu.cn"
    ],
    "exec-opts": ["native.cgroupdriver=systemd"],
    "max-concurrent-downloads": 10,
    "max-concurrent-uploads": 5,
    "log-opts": {
        "max-size": "300m",
        "max-file": "2"  
    },
    "live-restore": true
}
EOF

# Restart Docker to apply the configuration
systemctl daemon-reload
systemctl restart docker
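
A stray comma in daemon.json keeps dockerd from starting, so it can be worth validating the file before the restart. A sketch, assuming a Python interpreter is on the PATH (CentOS 7 ships one, since yum depends on it); `json_ok` is a hypothetical helper name.

```shell
# json_ok FILE: succeed only if FILE parses as JSON.
json_ok() {
    # Return 2 when no Python interpreter is available
    py=$(command -v python3 || command -v python) || return 2
    "$py" -m json.tool "$1" > /dev/null 2>&1
}

# On the host (not run here):
#   json_ok /etc/docker/daemon.json && systemctl restart docker
```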

2.6 Check the Docker details and confirm the cgroup driver is configured

Run: docker info
Client:                                                                                                     
 Context:    default                                                                                        
 Debug Mode: false                                                                                          
 Plugins:                                                                                                   
  app: Docker App (Docker Inc., v0.9.1-beta3)                                                               
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)                                                        
  compose: Docker Compose (Docker Inc., v2.10.2)                                                            
  scan: Docker Scan (Docker Inc., v0.17.0)                                                                  
                                                                                                            
Server:                                                                                                     
 Containers: 0                                                                                              
  Running: 0                                                                                                
  Paused: 0                                                                                                 
  Stopped: 0                                                                                                
 Images: 0                                                                                                  
 Server Version: 20.10.18                                                                                   
 Storage Driver: overlay2                                                                                   
  Backing Filesystem: extfs                                                                                 
  Supports d_type: true                                                                                     
  Native Overlay Diff: true                                                                                 
  userxattr: false                                                                                          
 Logging Driver: json-file                                                                                  
 Cgroup Driver: systemd   (the configuration took effect)
 Cgroup Version: 1                                                                                          
 Plugins:                                                                                                   
  Volume: local                                                                                             
  Network: bridge host ipvlan macvlan null overlay                                                          
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog                       
 Swarm: inactive                                                                                            
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc                                        
 Default Runtime: runc                                                                                      
 Init Binary: docker-init                                                                                   
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6                                               
 runc version: v1.1.4-0-g5fd4c4d                                                                            
 init version: de40ad0                                                                                      
 Security Options:                                                                                          
  seccomp                                                                                                   
   Profile: default                                                                                         
 Kernel Version: 5.4.217-1.el7.elrepo.x86_64                                                                
 Operating System: CentOS Linux 7 (Core)                                                                    
 OSType: linux                                                                                              
 Architecture: x86_64                                                                                       
 CPUs: 4                                                                                                    
 Total Memory: 7.527GiB                                                                                     
 Name: VM-16-4-centos                                                                                       
 ID: MSPA:EEHZ:ZUAY:FJS5:KM6T:FS25:BEBX:I5VL:VKTR:UIKZ:CO2I:KNRX                                            
 Docker Root Dir: /var/lib/docker                                                                           
 Debug Mode: false                                                                                          
 Registry: https://index.docker.io/v1/                                                                      
 Labels:                                                                                                    
 Experimental: false                                                                                        
 Insecure Registries:                                                                                       
  127.0.0.0/8                                                                                               
 Registry Mirrors:                                                                                          
  https://registry.docker-cn.com/                                                                           
  http://hub-mirror.c.163.com/                                                                              
  https://docker.mirrors.ustc.edu.cn/                                                                       
 Live Restore Enabled: true
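
In a script, the one line that matters in this output is Cgroup Driver. A short sketch that pulls it out of the text form; `cgroup_driver` is an illustrative name and it reads the `docker info` text from standard input, so it can be tried on saved output.

```shell
# cgroup_driver: read `docker info` text on stdin and print the Cgroup Driver value.
cgroup_driver() {
    awk -F': *' '/^[[:space:]]*Cgroup Driver:/ { print $2; exit }'
}

# Sample taken from the output above
printf '%s\n' ' Logging Driver: json-file' ' Cgroup Driver: systemd' ' Cgroup Version: 1' | cgroup_driver

# On the host (not run here):
#   docker info 2>/dev/null | cgroup_driver
#   docker info --format '{{.CgroupDriver}}'    # same value via Docker's template output
```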

2.7 Add a Docker user

# -f: do not fail if the docker group already exists
sudo groupadd -f docker
sudo adduser docker-user
sudo passwd docker-user
# Add the user to the docker group
sudo usermod -aG docker docker-user

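3. Install CRI-Docker

Note: this is required when the container runtime chosen for K8s is Docker and the K8s version is 1.24 or later, because the Docker integration is no longer built into K8s; Docker has to provide the connection to K8s itself.
Background:
Kubernetes removed dockershim in v1.24, and Docker Engine does not implement the CRI specification natively, so the two can no longer be integrated directly. Mirantis and Docker therefore created the cri-dockerd project, a shim that gives Docker Engine a CRI-compliant interface so that Kubernetes can drive Docker through CRI.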

3.1 Download and unpack

# The download can be slow; use a proxy if needed
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.2.5/cri-dockerd-0.2.5.amd64.tgz
tar xf cri-dockerd-0.2.5.amd64.tgz
mv cri-dockerd/* /usr/bin/
# Check that the binary is in place
ls -l /usr/bin/cri-dockerd

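3.2 Create the service files

# 1. Create cri-docker.service (the quoted 'EOF' keeps $MAINPID literal instead of letting the shell expand it)
cat > /usr/lib/systemd/system/cri-docker.service <<'EOF'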
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

# 2. Create cri-docker.socket
cat > /usr/lib/systemd/system/cri-docker.socket <<EOF
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service

[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
EOF
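
A detail worth checking in heredocs that write unit files: with an unquoted delimiter (<<EOF) the shell expands $VARIABLES before the text is written, so a systemd placeholder such as $MAINPID ends up as an empty string unless the delimiter is quoted (<<'EOF'). A self-contained demonstration that captures both forms in shell variables instead of touching real unit files:

```shell
# MAINPID is normally unset in an interactive shell; set it empty to make the demo explicit
MAINPID=""

# Unquoted delimiter: the shell substitutes $MAINPID (here: nothing)
unquoted=$(cat <<EOF
ExecReload=/bin/kill -s HUP $MAINPID
EOF
)

# Quoted delimiter: the text is written exactly as typed
quoted=$(cat <<'EOF'
ExecReload=/bin/kill -s HUP $MAINPID
EOF
)

printf '%s\n' "unquoted: [$unquoted]" "quoted:   [$quoted]"
```

After writing the unit file, `grep MAINPID /usr/lib/systemd/system/cri-docker.service` shows which of the two forms actually landed on disk.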

# 3. Append the pause image option to the ExecStart line of cri-docker.service
sed -ri '/ExecStart.*/s@(ExecStart.*)@\1 --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.8@g' /usr/lib/systemd/system/cri-docker.service
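
Running that `sed` a second time appends the option again. As a sketch, the variant below adds it only when the ExecStart line does not already carry `--pod-infra-container-image`. `add_pause_image` is a hypothetical helper; without a file argument it filters standard input, which makes it easy to preview the result.

```shell
# add_pause_image [FILE...]: print the input with the pause image option
# appended to ExecStart, unless the option is already present.
add_pause_image() {
    sed -r \
        -e '/^ExecStart=/{' \
        -e '/--pod-infra-container-image/!s@$@ --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.8@' \
        -e '}' "$@"
}

# Preview on the ExecStart line used in the unit file
printf '%s\n' 'ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://' | add_pause_image
```

To edit the unit in place with the same expressions, pass `-i` to `sed` and the unit file path instead of piping.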

3.3 Start CRI-Docker

systemctl daemon-reload && systemctl enable --now cri-docker

Other commands:
systemctl start cri-docker
systemctl enable cri-docker
systemctl status cri-docker

3.4 Check the service status

systemctl status cri-docker                                            
● cri-docker.service - CRI Interface for Docker Application Container Engine                                
   Loaded: loaded (/usr/lib/systemd/system/cri-docker.service; enabled; vendor preset: disabled)            
   Active: active (running) since Thu 2022-10-13 14:22:07 CST; 6s ago                                       
     Docs: https://docs.mirantis.com                                                                        
 Main PID: 11713 (cri-dockerd)                                                                              
    Tasks: 9                                                                                                
   Memory: 15.6M                                                                                            
   CGroup: /system.slice/cri-docker.service                                                                 
           └─11713 /usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image re...
 
Success indicator: active (running)
