Node 1: node1  192.168.88.21
Node 2: node2  192.168.88.22
Node 3: node3  192.168.88.23

Docker: version 20.10.9   (do not use a release newer than the 20.x series)
kubectl: v1.23.0
kubesphere  v3.3.0
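Kubernetes and Docker version numbers do not have to match; what matters is that the Docker release works with the chosen Kubernetes release. These notes pair Kubernetes v1.23.0 with Docker 20.10.9.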

####################################################################### Kernel upgrade
Upgrade the kernel from 3.10.0-1160.108.1.el7.x86_64 to at least 4.0

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml-devel kernel-ml-headers kernel-ml -y
grub2-set-default 0
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
uname -sr

# After the upgrade, check the kernel again; it now meets the requirement
[root@k8s-a-node10 ~]# uname -sr
Linux 6.2.9-1.el7.elrepo.x86_64
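
The "at least 4.0" check can also be scripted instead of read off by eye. A minimal sketch, assuming the release string looks like the uname -r output above; kernel_at_least is a hypothetical helper name, not part of any tool:

```shell
# kernel_at_least RELEASE MAJOR MINOR   (hypothetical helper)
# Succeeds when the kernel release string is at least MAJOR.MINOR.
kernel_at_least() {
  release="$1"; want_major="$2"; want_minor="$3"
  major="${release%%.*}"     # text before the first dot, e.g. 6
  rest="${release#*.}"       # text after the first dot, e.g. 2.9-1.el7...
  minor="${rest%%.*}"        # text before the next dot, e.g. 2
  [ "$major" -gt "$want_major" ] && return 0
  [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

if kernel_at_least "$(uname -r)" 4 0; then
  echo "kernel is new enough"
else
  echo "kernel is older than 4.0, upgrade first"
fi
```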

# Check that k8s is still healthy; in my case everything went smoothly.
kubectl get nodes
kubectl get pod -n kube-system

/usr/sbin/modprobe rbd 
echo "/usr/sbin/modprobe rbd " >> /etc/rc.local
chmod -R 755 /etc/rc.d/rc.local

#######################################################################

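Reference:  https://zhuanlan.zhihu.com/p/627310856  

Docker depends on the systemd version. On CentOS 7 systemd appears to be version 219, and the environment where I hit the problem was also on 219.
Restarting the services can help. Do not restart kubelet first: restarting kubelet first leaves all pods stuck in the Pending state! Restart docker, then kubelet: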
 systemctl restart docker
 systemctl restart kubelet

#############

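Set the hostname (node1 as the example):

Set the DNS servers (aliyun) in /etc/resolv.conf:
nameserver 223.5.5.5  
nameserver 223.6.6.6

hostnamectl set-hostname  node1  # node1 is a name of your choosing
Or edit /etc/hostname and write node1 into it (same procedure on the other nodes):

vim /etc/hostname
After the edit, /etc/hostname contains:

node1
Synchronize the time on all nodes:

# Start the chronyd time synchronization service
systemctl start chronyd
systemctl enable chronyd
date
Disable SELinux and firewalld on all nodes: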

systemctl stop firewalld
systemctl disable firewalld

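sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config # takes effect after a reboot (setenforce 0 switches to permissive immediately)
Disable the swap partition on all nodes: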

# Temporarily disable swap
swapoff -a

# Permanently disable swap
vi /etc/fstab 
# Comment out the following entry
# /dev/mapper/centos-swap swap
# A reboot is then needed for this to take effect
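
The manual fstab edit can also be done non-interactively. A rough sketch, assuming swap entries are ordinary fstab lines with "swap" as a whitespace-separated field; disable_swap_entries is a made-up helper name, and a .bak copy of the file is kept next to it:

```shell
# disable_swap_entries FILE   (hypothetical helper)
# Comments out every non-comment line that has "swap" as a
# whitespace-delimited field, keeping a backup in FILE.bak.
disable_swap_entries() {
  sed -i.bak -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/# \1/' "$1"
}

# Usage on a real host (as root):
#   disable_swap_entries /etc/fstab
```

Running it against a copy of /etc/fstab first is a cheap way to confirm that only the swap line changes.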
Enable bridge filtering and IP forwarding on all nodes:

cat > /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Then apply the settings
sysctl --system
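
The net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so if sysctl --system complains about them, loading the module first usually helps. The sketch below verifies that all three parameters are set to 1; check_k8s_sysctls is a hypothetical helper, and the optional root argument exists only so the check can be tried against a scratch directory:

```shell
# If the bridge keys are missing, load the module and make it persistent:
#   modprobe br_netfilter
#   echo br_netfilter > /etc/modules-load.d/k8s.conf

# check_k8s_sysctls [ROOT]   (hypothetical helper)
# Reads the three parameters under ROOT (default /proc/sys) and
# fails if any of them is not 1.
check_k8s_sysctls() {
  root="${1:-/proc/sys}"
  rc=0
  for key in net/bridge/bridge-nf-call-ip6tables \
             net/bridge/bridge-nf-call-iptables \
             net/ipv4/ip_forward; do
    if [ "$(cat "$root/$key" 2>/dev/null)" = "1" ]; then
      echo "ok      $key"
    else
      echo "MISSING $key"
      rc=1
    fi
  done
  return $rc
}

# Usage: check_k8s_sysctls
```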

##########################################################################################

Install docker-ce online

yum install -y yum-utils \
           device-mapper-persistent-data \
           lvm2 
       ###--skip-broken

# Set up the docker package repository
yum-config-manager \
    --add-repo \
    https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo     
sed -i 's/download.docker.com/mirrors.aliyun.com\/docker-ce/g' /etc/yum.repos.d/docker-ce.repo

yum makecache fast
# Mind the docker / k8s version pairing
#kubelet-1.23.0   # docker version must not be newer than 20.x
#docker-ce-20.10.9  
#yum list docker-ce --showduplicates | sort -r

rpm -e docker-buildx-plugin-0:0.12.1-1.el7.x86_64

yum install -y bash-completion  nfs-utils
yum install -y docker-ce-20.10.9    docker-ce-cli-20.10.9   docker-compose-plugin-2.20.2  
 
###docker-ce    3:25.0.3-1.el7        docker-ce-stable   version too new, not recommended 

##########################################################################################

Note that docker's cgroup driver must be configured:

mkdir -p /etc/docker/
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "registry-mirrors": ["https://82m9ar63.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

systemctl daemon-reload
systemctl restart docker
systemctl enable docker.service
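
kubelet and Docker have to agree on the cgroup driver, so it is worth confirming what ended up in daemon.json. A small sketch, assuming the driver is set through exec-opts as above; docker_cgroup_driver is a hypothetical helper name:

```shell
# docker_cgroup_driver FILE   (hypothetical helper)
# Prints the cgroup driver configured via exec-opts in a daemon.json,
# or nothing when the option is absent.
docker_cgroup_driver() {
  grep -o 'native\.cgroupdriver=[a-z]*' "$1" | head -n 1 | cut -d= -f2
}

# Usage:
#   docker_cgroup_driver /etc/docker/daemon.json
# What the running daemon actually uses:
#   docker info --format '{{.CgroupDriver}}'
```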

##########################################################################################
Switch the kubernetes package repository to a domestic (China) mirror on all nodes:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
# whether this repository is enabled
enabled=1
# whether to verify package GPG signatures
gpgcheck=0
# whether to verify repository metadata GPG signatures
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

EOF
 

#######################################################
Install a pinned version of kubeadm, kubelet and kubectl on all nodes (I chose 1.23.0):

yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0

# Enable kubelet at boot (optional)
systemctl enable kubelet

 
##########################################################################################
1.2 *Change the kubelet root directory (optional; skip if not needed)
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
After the edit the configuration file looks like this:

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --root-dir=/mnt/sdb_new/kubelet/ --kubeconfig=/etc/kubernetes/kubelet.conf"
Apply the configuration:

systemctl daemon-reload
systemctl restart docker

systemctl restart kubelet

systemctl enable docker
systemctl enable kubelet

 
##########################################################################################
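Override the kubernetes image repository (the init command only needs to be run on the master node)

1. First override the image repository used by kubeadm: the default one is hosted abroad and unreachable from here, so it has to be replaced with a domestic mirror. These commands list the images the cluster needs during setup:

kubeadm config images list
kubeadm config images list  --image-repository registry.aliyuncs.com/google_containers
 
Initialize with the Aliyun image repository: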

kubeadm init \
  --apiserver-advertise-address=192.168.88.21 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all

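# --apiserver-advertise-address # address the API server advertises (the master's IP; the 10GbE network here)
# --image-repository # the default registry k8s.gcr.io is unreachable from inside China, so the Aliyun mirror is used
# --kubernetes-version # K8s version, same as the packages installed above
# --service-cidr # cluster-internal virtual network for Service IPs, the unified entry point to Pods; the value above can be used as-is
# --pod-network-cidr # Pod network; must match the CNI manifest deployed below; the value above can be used as-is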

kubeadm config images list

======================================== Your Kubernetes control-plane has initialized successfully! ========================================

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

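# Copy and save the following (the join command printed after a successful k8s init; Flannel must be configured before worker nodes can join).
# Worker nodes run it later to join the master: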

kubeadm join 192.168.88.21:6443 --token 5ftb6m.79xz124nx3n4u69v \
    --discovery-token-ca-cert-hash sha256:bab814a71242fec19f3f693038be05656698c3c6d4054657b52ca7d8e3b9138f 

### The root account needs the following setup:
Install this first:
yum install bash-completion

vi  /root/.bash_profile
Add the following:

# kubeconfig for the superuser
export KUBECONFIG=/etc/kubernetes/admin.conf
# Alias
alias k=kubectl
# Enable kubectl command completion
source <(kubectl completion bash)

[root@node1 home]# source /root/.bash_profile


#################################################  Set up the pod network (deploy on the master node)   #################################################   

Download kube-flannel.yml:

[root@node1 home]# wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

The file contents are as follows:

########################
 vi kube-flannel.yml
 ---
 kind: Namespace
 apiVersion: v1
 metadata:
   name: kube-flannel
   labels:
     k8s-app: flannel
     pod-security.kubernetes.io/enforce: privileged
 ---
 kind: ClusterRole
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   labels:
     k8s-app: flannel
   name: flannel
 rules:
 - apiGroups:
   - ""
   resources:
   - pods
   verbs:
   - get
 - apiGroups:
   - ""
   resources:
   - nodes
   verbs:
   - get
   - list
   - watch
 - apiGroups:
   - ""
   resources:
   - nodes/status
   verbs:
   - patch
 - apiGroups:
   - networking.k8s.io
   resources:
   - clustercidrs
   verbs:
   - list
   - watch
 ---
 kind: ClusterRoleBinding
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   labels:
     k8s-app: flannel
   name: flannel
 roleRef:
   apiGroup: rbac.authorization.k8s.io
   kind: ClusterRole
   name: flannel
 subjects:
 - kind: ServiceAccount
   name: flannel
   namespace: kube-flannel
 ---
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   labels:
     k8s-app: flannel
   name: flannel
   namespace: kube-flannel
 ---
 kind: ConfigMap
 apiVersion: v1
 metadata:
   name: kube-flannel-cfg
   namespace: kube-flannel
   labels:
     tier: node
     k8s-app: flannel
     app: flannel
 data:
   cni-conf.json: |
     {
       "name": "cbr0",
       "cniVersion": "0.3.1",
       "plugins": [
         {
           "type": "flannel",
           "delegate": {
             "hairpinMode": true,
             "isDefaultGateway": true
           }
         },
         {
           "type": "portmap",
           "capabilities": {
             "portMappings": true
           }
         }
       ]
     }
   net-conf.json: |
     {
       "Network": "10.244.0.0/16",
       "Backend": {
         "Type": "vxlan"
       }
     }
 ---
 apiVersion: apps/v1
 kind: DaemonSet
 metadata:
   name: kube-flannel-ds
   namespace: kube-flannel
   labels:
     tier: node
     app: flannel
     k8s-app: flannel
 spec:
   selector:
     matchLabels:
       app: flannel
   template:
     metadata:
       labels:
         tier: node
         app: flannel
     spec:
       affinity:
         nodeAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
             nodeSelectorTerms:
             - matchExpressions:
               - key: kubernetes.io/os
                 operator: In
                 values:
                 - linux
       hostNetwork: true
       priorityClassName: system-node-critical
       tolerations:
       - operator: Exists
         effect: NoSchedule
       serviceAccountName: flannel
       initContainers:
       - name: install-cni-plugin
         image: docker.io/flannel/flannel-cni-plugin:v1.4.0-flannel1
         command:
         - cp
         args:
         - -f
         - /flannel
         - /opt/cni/bin/flannel
         volumeMounts:
         - name: cni-plugin
           mountPath: /opt/cni/bin
       - name: install-cni
         image: docker.io/flannel/flannel:v0.24.2
         command:
         - cp
         args:
         - -f
         - /etc/kube-flannel/cni-conf.json
         - /etc/cni/net.d/10-flannel.conflist
         volumeMounts:
         - name: cni
           mountPath: /etc/cni/net.d
         - name: flannel-cfg
           mountPath: /etc/kube-flannel/
       containers:
       - name: kube-flannel
         image: docker.io/flannel/flannel:v0.24.2
         command:
         - /opt/bin/flanneld
         args:
         - --ip-masq
         - --kube-subnet-mgr
         resources:
           requests:
             cpu: "100m"
             memory: "50Mi"
         securityContext:
           privileged: false
           capabilities:
             add: ["NET_ADMIN", "NET_RAW"]
         env:
         - name: POD_NAME
           valueFrom:
             fieldRef:
               fieldPath: metadata.name
         - name: POD_NAMESPACE
           valueFrom:
             fieldRef:
               fieldPath: metadata.namespace
         - name: EVENT_QUEUE_DEPTH
           value: "5000"
         volumeMounts:
         - name: run
           mountPath: /run/flannel
         - name: flannel-cfg
           mountPath: /etc/kube-flannel/
         - name: xtables-lock
           mountPath: /run/xtables.lock
       volumes:
       - name: run
         hostPath:
           path: /run/flannel
       - name: cni-plugin
         hostPath:
           path: /opt/cni/bin
       - name: cni
         hostPath:
           path: /etc/cni/net.d
       - name: flannel-cfg
         configMap:
           name: kube-flannel-cfg
       - name: xtables-lock
         hostPath:
           path: /run/xtables.lock
           type: FileOrCreate
#############################
Then edit the file: find the section below and make sure Network matches the pod network CIDR passed to kubeadm init:
net-conf.json: |
   {
     "Network": "10.244.0.0/16",
     "Backend": {
       "Type": "vxlan"
     }
   }
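
A mismatch between the flannel Network value and the --pod-network-cidr given to kubeadm init is easy to miss, so a quick comparison before applying the manifest can save some debugging. A sketch, assuming the manifest contains a single "Network" entry as in the file above; both function names are hypothetical:

```shell
# flannel_network FILE   (hypothetical helper)
# Prints the first "Network" value found in a kube-flannel manifest.
flannel_network() {
  sed -n 's/.*"Network":[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# flannel_matches_pod_cidr FILE CIDR   (hypothetical helper)
# Succeeds when the manifest's Network equals the given pod CIDR.
flannel_matches_pod_cidr() {
  [ "$(flannel_network "$1")" = "$2" ]
}

# Usage:
#   flannel_matches_pod_cidr kube-flannel.yml 10.244.0.0/16 \
#     && echo "CIDR matches" || echo "CIDR mismatch, edit net-conf.json"
```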

After editing, install the component (if it hangs while pulling images, try pulling them manually with docker):

[root@node1 home]# kubectl apply -f kube-flannel.yml

Check the flannel pod status (it must be Running; if kube-flannel does not come up, run kubectl describe pod kube-flannel-ds-f5jn6 -n kube-flannel to see why,
then search online for a fix):

[root@node1 home]# # All containers must be Running
[root@node1 home]# kubectl get pod --all-namespaces
NAMESPACE      NAME                                 READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-f5jn6                1/1     Running   0          8m21s
kube-system    coredns-6d8c4cb4d-ctqw5              1/1     Running   0          42m
kube-system    coredns-6d8c4cb4d-n52fq              1/1     Running   0          42m
kube-system    etcd-k8s-master                      1/1     Running   0          42m
kube-system    kube-apiserver-k8s-master            1/1     Running   0          42m
kube-system    kube-controller-manager-k8s-master   1/1     Running   0          42m
kube-system    kube-proxy-swpkz                     1/1     Running   0          42m
kube-system    kube-scheduler-k8s-master            1/1     Running   0          42m
Check the status of the system pods:

[root@node1 home]# kubectl get pod -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-6d8c4cb4d-ctqw5              1/1     Running   0          52m
coredns-6d8c4cb4d-n52fq              1/1     Running   0          52m
etcd-k8s-master                      1/1     Running   0          53m
kube-apiserver-k8s-master            1/1     Running   0          53m
kube-controller-manager-k8s-master   1/1     Running   0          53m
kube-proxy-swpkz                     1/1     Running   0          52m
kube-scheduler-k8s-master            1/1     Running   0          53m
 
[root@node1 home]# # Get the status of the master components
[root@node1 home]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
[root@node1 home]# kubectl get node
NAME         STATUS   ROLES                  AGE   VERSION
node1        Ready    control-plane,master   52m   v1.23.0
Check the node status (only the master so far; no worker nodes added yet):

[root@node1 home]# kubectl get node
NAME         STATUS   ROLES                  AGE   VERSION
node1        Ready    control-plane,master   53m   v1.23.0
The K8s master server is now fully deployed!

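#################################################    1.3.4 Join worker nodes to the cluster (run on the worker nodes) #################################################   

The init step prints a join command; just run it on each worker node. The token below is only an example, use the real one from your own output: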

[root@node2 home]# kubeadm join 192.168.88.21:6443 --token 5ftb6m.79xz124nx3n4u69v \
    --discovery-token-ca-cert-hash sha256:bab814a71242fec19f3f693038be05656698c3c6d4054657b52ca7d8e3b9138f 

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
The default join token is valid for 24 hours; once it expires it can no longer be used and a new one must be created. New join tokens are created on the master node with:

[root@node1 home]# kubeadm token create --print-join-command
After joining, check the node status on the master (all nodes must be Ready):

[root@node1 home]# kubectl get nodes
NAME         STATUS     ROLES                  AGE     VERSION
node1        Ready      control-plane,master   63m     v1.23.0
node2        Ready      <none>                 3m57s   v1.23.0
node3        Ready      <none>                 29s     v1.23.0

If every node's STATUS is Ready, all worker nodes have joined successfully!
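
With more than a handful of nodes, scanning the STATUS column by eye gets tedious. A minimal sketch that prints only the nodes that are not Ready; not_ready_nodes is a hypothetical helper that reads kubectl get nodes --no-headers output on stdin, and it also flags cordoned nodes (STATUS "Ready,SchedulingDisabled"), which may or may not be what you want:

```shell
# not_ready_nodes   (hypothetical helper)
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on the master:
#   kubectl get nodes --no-headers | not_ready_nodes
# No output means every node is Ready.
```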

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## To run kubectl get nodes on a worker node:
First join the cluster with the join command:
kubeadm join 192.168.88.115:6443 --token gzay1h.1u0n8ugcs9adk1f0 \
    --discovery-token-ca-cert-hash sha256:94406ea0dba5d588f37c9ba9ffc8a3585f8526f37763ab0be002e129b9f9022b 

kubectl get nodes

On the master node, copy /etc/kubernetes/admin.conf to the other nodes
1. Without it, kubectl fails on any other node with:
[root@server-88-22 ~]# kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

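Run on the master node, once for each non-master node:

scp /etc/kubernetes/admin.conf user@host:/etc/kubernetes/admin.conf
 
user is the login user on the target host
host is the target host's IP
Then, on the non-master node, run: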

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
 
 Run kubectl get nodes on the worker node to check the node status:

[root@node1 ~]# kubectl get nodes
NAME    STATUS     ROLES                  AGE   VERSION
node1   NotReady   control-plane,master   27m   v1.23.0
node2   NotReady   <none>                 10m   v1.23.0

#################################################    Remove a worker node  #################################################   

1.3.5 Remove a worker node (run on the master node)
# kubectl drain <node name> --delete-emptydir-data --force --ignore-daemonsets
# <node name> is the node name shown by kubectl get nodes
# (--delete-emptydir-data replaces the older --delete-local-data flag, deprecated since kubectl v1.20)
# Example: remove worker node node3
[root@node1 home]# kubectl drain node3 --delete-emptydir-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3
Then reset k8s on the removed node (the reset deletes some configuration files); here on node3:

[root@node3 home]# # Reset k8s on the worker node
[root@node3 home]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0425 01:59:40.412616   15604 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
Then, on the removed node, manually delete the k8s config file, the flannel network configuration and the flannel interfaces:

[root@node3 home]# rm -rf /etc/cni/net.d/
[root@node3 home]# rm -rf /root/.kube/config
[root@node3 home]# # Delete the cni network interfaces
[root@node3 home]# ifconfig cni0 down
[root@node3 home]# ip link delete cni0
[root@node3 home]# ifconfig flannel.1 down
[root@node3 home]# ip link delete flannel.1

###############
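Command notes:
Common k8s commands:

# List all nodes in the cluster
kubectl get node
# Show detailed information about a node (rarely needed)
kubectl describe node node1

# List all pods
kubectl get pod --all-namespaces
# List all pods with extra columns (IP, node)
kubectl get pods -o wide --all-namespaces

# List services (current namespace)
kubectl get service

# List deployments (current namespace)
kubectl get deploy

# Restart a pod (this deletes the old pod and creates a new one)
# Restart with a yaml file
kubectl replace --force -f xxx.yaml
# Restart without a yaml file
kubectl get pod <POD_NAME> -n <NAMESPACE> -o yaml | kubectl replace --force -f -

# Show detailed information about a pod
kubectl describe pod nfs-client-provisioner-65c77c7bf9-54rdp -n default

# Create or update Kubernetes objects from a yaml file; apply also accepts optional flags such as --force, --validate and --record for finer control over the update
kubectl apply -f pod.yaml

# kubectl create -f  is suited to creating objects for the first time.
# It creates Kubernetes objects; if the resource already exists it returns an error, and the existing object must be deleted before creating again. If the resource does not exist, it is created.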
 

# Delete the Pod defined in pod.yaml
kubectl delete -f pod.yaml

# View container logs
kubectl logs <pod-name>
# Follow logs in real time
kubectl logs -f <pod-name>
# -c can be omitted when the pod has only one container
kubectl logs <pod-name> -c <container_name>
# Return the merged logs of all pods labeled app=frontend
kubectl logs -l app=frontend

# Get a TTY in a container of the pod through bash, like logging in to the container
# kubectl exec -it <pod-name> -c <container-name> -- bash
e.g.:
kubectl exec -it redis-master-cln81 -- bash

# List endpoints
kubectl get endpoints

# List existing tokens
kubeadm token list

#################################################      Install dynamic storage  #################################################   


kubesphere/ks-installer:v3.3.0
kubectl  v1.23.0
docker:  20.10.9

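Your Kubernetes version must be one of: v1.20.x, v1.21.x, * v1.22.x, * v1.23.x, * v1.24.x, * v1.25.x and * v1.26.x. With the starred versions some edge node features may be unavailable, so v1.21.x is recommended if you need edge nodes.
Make sure the machines meet the minimum hardware requirements: CPU > 1 core, memory > 2 GB.
A default storage class must be configured in the Kubernetes cluster before installing (these notes cover that setup).
 
I already have a Kubernetes cluster ready, as shown below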
[root@server-88-21 ~]# kubectl  get nodes -o wide
NAME           STATUS   ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                 CONTAINER-RUNTIME
server-88-21   Ready    control-plane,master   26h   v1.23.0   192.168.88.21   <none>        CentOS Linux 7 (Core)   3.10.0-1160.108.1.el7.x86_64   docker://20.10.9
server-88-22   Ready    <none>                 26h   v1.23.0   192.168.88.22   <none>        CentOS Linux 7 (Core)   3.10.0-1160.108.1.el7.x86_64   docker://20.10.9

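####NFS dynamic provisioning
First you need an NFS server. For convenience my master server k8s-master doubles as the NFS server this time.

##Set up NFS

Install the nfs-utils package on the NFS server (mine is the same machine as the master) and on every k8s node (both master and workers):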

yum install -y nfs-utils

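# Create the directory to export (it must match the path in /etc/exports and in deployment.yaml)
mkdir -p /data/k8s
# Append the directory to /etc/exports so that NFS exposes it to the LAN
cat >> /etc/exports << EOF
/data/k8s *(rw,sync,no_root_squash)
EOF
# Enable and start the NFS service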
systemctl enable  nfs
systemctl start  nfs

Check that the export is visible (test from the other nodes as well):
showmount -e {NFS server address}
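
When repeating this check on every node, it helps to have it succeed or fail instead of printing a list to read. A sketch, assuming the usual showmount -e layout (one header line, then one export per line with the path in the first column); export_listed is a hypothetical helper:

```shell
# export_listed PATH   (hypothetical helper)
# Reads `showmount -e <server>` output on stdin and succeeds when PATH
# appears in the export list (the header line is skipped).
export_listed() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on any node:
#   showmount -e 192.168.88.21 | export_listed /data/k8s \
#     && echo "export visible" || echo "export missing"
```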

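###Download the dynamic provisioner
Kubernetes does not ship an NFS dynamic provisioner, so a third-party one is needed. The official Kubernetes documentation recommends two third-party options.
I find the NFS subdir provisioner easier to use, so that is the one used here. Download the release archive from its GitHub repository and unpack it:
wget https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/archive/refs/tags/nfs-subdir-external-provisioner-4.0.18.tar.gz
tar -zxf nfs-subdir-external-provisioner-4.0.18.tar.gz

cd nfs-subdir-external-provisioner-nfs-subdir-external-provisioner-4.0.18/deploy/

The directory contains several yaml files; some need changes:
# This image is hosted on a Google registry and cannot be pulled from inside China
# image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
# Use this copy instead, which I pulled and pushed to Aliyun
image: registry.cn-shenzhen.aliyuncs.com/xiaohh-docker/nfs-subdir-external-provisioner:v4.0.2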

 
###deployment.yaml  Remember to change the image, the NFS server IP and the path
[root@server-88-21 deploy]# cat deployment.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: nfs-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: registry.cn-shenzhen.aliyuncs.com/xiaohh-docker/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              value: 192.168.88.21
            - name: NFS_PATH
              value: /data/k8s
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.88.21
            path: /data/k8s

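# If you only want a dynamically provisioned storage class, the default-storage setting can be skipped; if NFS is the primary storage, set it as the default
## To make nfs-client the default storage class, see this part of the config file below: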
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"

[root@server-88-21 deploy]# more class.yaml 
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
  archiveOnDelete: "false"
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
mountOptions:
  - hard
  - nointr
  - nosuid
  - rsize=512
  - wsize=512
  - timeo=600
  - retrans=3

The script below shows that many resources are still placed in the default namespace:

yamls=$(grep -rl 'namespace: default' ./)
for yaml in ${yamls}; do
  echo ${yaml}
  cat ${yaml} | grep 'namespace: default'
done
 

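A dedicated namespace for the provisioner makes later management easier, so I create one named nfs-provisioner. For convenience this is done directly with a command rather than a yaml file:

kubectl create namespace nfs-provisioner

After running it, the namespace is listed: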
[root@server-88-21 deploy]# kubectl  get namespace
NAME                              STATUS   AGE
default                           Active   27h
kube-flannel                      Active   26h
kube-node-lease                   Active   27h
kube-public                       Active   27h
kube-system                       Active   27h
kubesphere-controls-system        Active   21h
kubesphere-monitoring-federated   Active   21h
kubesphere-monitoring-system      Active   21h
kubesphere-system                 Active   110m
nfs-provisioner                   Active   21h

Quite a few files reference the namespace, so change them all with a one-liner:

sed -i 's/namespace: default/namespace: nfs-provisioner/g' `grep -rl 'namespace: default' ./`

#####Install the dynamic provisioner
With all yaml manifests updated, the install is a single command:

kubectl apply -k .

Check that the deployment is complete with the command below (STATUS should be Running):

kubectl get all -o wide -n nfs-provisioner
[root@server-88-21 deploy]# kubectl get all -o wide -n nfs-provisioner

NAME                                         READY   STATUS    RESTARTS      AGE    IP            NODE           NOMINATED NODE   READINESS GATES
pod/nfs-client-provisioner-94bcc8884-kstqp   1/1     Running   1 (95m ago)   124m   10.244.1.50   server-88-22   <none>           <none>

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE    CONTAINERS               IMAGES                                                                                   SELECTOR
deployment.apps/nfs-client-provisioner   1/1     1            1           124m   nfs-client-provisioner   registry.cn-shenzhen.aliyuncs.com/xiaohh-docker/nfs-subdir-external-provisioner:v4.0.2   app=nfs-client-provisioner

NAME                                               DESIRED   CURRENT   READY   AGE    CONTAINERS               IMAGES                                                                                   SELECTOR
replicaset.apps/nfs-client-provisioner-94bcc8884   1         1         1       124m   nfs-client-provisioner   registry.cn-shenzhen.aliyuncs.com/xiaohh-docker/nfs-subdir-external-provisioner:v4.0.2   app=nfs-client-provisioner,pod-template-hash=94bcc8884

##List the installed storage class with the command below (the NAME must be marked "(default)", otherwise KubeSphere cannot find this storage class)
[root@server-88-21 deploy]# kubectl get storageclass   # or: kubectl get sc

NAME                   PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
nfs-client (default)   k8s-sigs.io/nfs-subdir-external-provisioner   Retain          WaitForFirstConsumer   false                  125m

##Remember the storage class NAME:   nfs-client
NFS dynamic provisioning is now installed.
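
Before moving on, a throwaway claim is a cheap way to confirm that the storage class really provisions volumes. The sketch below only renders a PersistentVolumeClaim manifest; render_test_pvc and the claim name test-claim are made up for illustration. Because the class above uses WaitForFirstConsumer, the claim is expected to stay Pending until some pod mounts it.

```shell
# render_test_pvc [NAME] [STORAGE_CLASS]   (hypothetical helper)
# Prints a small PersistentVolumeClaim manifest for a smoke test.
render_test_pvc() {
  name="${1:-test-claim}"
  class="${2:-nfs-client}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  storageClassName: ${class}
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
}

# Usage:
#   render_test_pvc | kubectl apply -f -
#   kubectl get pvc test-claim
#   kubectl delete pvc test-claim
```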

 

### ### ### ### ### ### ### ### ###  Install KubeSphere  ### ### ### ### ### ### ### ### 
 
Download the KubeSphere yaml manifests
This installs KubeSphere v3.4.0, the latest at the time; download the two manifest files with the command below (in practice the image pulled was kubesphere/ks-installer:v3.3.0):
wget \
https://github.com/kubesphere/ks-installer/releases/download/v3.4.0/kubesphere-installer.yaml \
https://github.com/kubesphere/ks-installer/releases/download/v3.4.0/cluster-configuration.yaml

What the two files are for:

kubesphere-installer.yaml: the KubeSphere installer
cluster-configuration.yaml: the KubeSphere cluster configuration

#####################################################################################################
## Set storageClass in the cluster configuration file to: nfs-client
 vi  cluster-configuration.yaml
Change line 11 to:        storageClass: "nfs-client" 
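
The same change can be scripted, which avoids depending on the exact line number. A sketch, assuming the first storageClass key in the file is the one meant here (worth confirming by opening the file); set_storage_class is a hypothetical helper, and any trailing comment on that line is dropped:

```shell
# set_storage_class FILE CLASS   (hypothetical helper)
# Rewrites the first "storageClass:" line in FILE to use CLASS,
# keeping the original indentation.
set_storage_class() {
  tmp="$(mktemp)"
  awk -v class="$2" '
    !done && /^[ \t]*storageClass:/ {
      sub(/storageClass:.*/, "storageClass: \"" class "\"")
      done = 1
    }
    { print }
  ' "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Usage:
#   set_storage_class cluster-configuration.yaml nfs-client
#   grep -n 'storageClass' cluster-configuration.yaml
```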

To enable DevOps, set enabled to true at lines 78-79:
  78      devops:                  # (CPU: 0.47 Core, Memory: 8.6 G) Provide an out-of-the-box CI/CD system based on Jenkins, and automated workflow tools including Source-to-Image & Binary-to-Image.
  79        enabled: true             # Enable or disable the KubeSphere DevOps System.

####Install KubeSphere
First create the resources in kubesphere-installer.yaml:

kubectl apply -f kubesphere-installer.yaml   # this file is used as-is, no changes needed

Then check that they were created (before the cluster configuration is applied, only ks-installer-c9655d997-5f4h4 shows as Running):

## describe can be used to inspect a pod
kubectl describe pod  notification-manager-deployment-7dd45b5b7d-mqrl7   -n kubesphere-monitoring-system

## A pod can be deleted like this
kubectl  delete pod ks-controller-manager-6d6b54464d-jkdjm -n kubesphere-system

kubectl  delete  pod  notification-manager-deployment-7dd45b5b7d-mqrl7  -n  kubesphere-monitoring-system

# kubectl get pod -o wide -n kubesphere-system
NAME                                     READY   STATUS    RESTARTS       AGE    IP            NODE           NOMINATED NODE   READINESS GATES
ks-apiserver-66cd784f8f-c9lgk            1/1     Running   0              122m   10.244.0.14   server-88-21   <none>           <none>
ks-console-5c5676fb55-jfcdd              1/1     Running   0              122m   10.244.0.13   server-88-21   <none>           <none>
ks-controller-manager-6d6b54464d-mrb59   1/1     Running   0              122m   10.244.0.15   server-88-21   <none>           <none>
ks-installer-c9655d997-5f4h4             1/1     Running   1 (107m ago)   125m   10.244.1.44   server-88-22   <none>           <none>

##Next apply cluster-configuration.yaml:
#kubectl apply -f cluster-configuration.yaml

It is a single resource, but it triggers a lot of work.
Follow the KubeSphere installer log with:

kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
Wait a few minutes; on success the log ends like this:
 **************************************************
Waiting for all tasks to be completed ...
task network status is successful  (1/4)
task openpitrix status is successful  (2/4)
task multicluster status is successful  (3/4)
task monitoring status is successful  (4/4)
**************************************************
Collecting installation results ...
#####################################################
###              Welcome to KubeSphere!           ###
#####################################################

Console: http://192.168.88.21:30880
Account: admin
Password: P@88w0rd

NOTES:
  1. After you log into the console, please check the
     monitoring status of service components in
     "Cluster Management". If any service is not
     ready, please wait patiently until all components 
     are up and running.
  2. Please change the default password after login.

#####################################################
https://kubesphere.io             2024-02-21 11:51:46
#####################################################

##Check with this command: list all pods; every STATUS should be Running 

[root@server-88-21 ~]# kubectl  get pod -A  -o wide
NAMESPACE                      NAME                                               READY   STATUS    RESTARTS       AGE
++++++++++++++ k8s network NAMESPACE
kube-flannel                   kube-flannel-ds-nxgg7                              1/1     Running   3 (113m ago)   27h
kube-flannel                   kube-flannel-ds-rxmkj                              1/1     Running   2 (20h ago)    27h

++++++++++++++ k8s NAMESPACE
kube-system                    coredns-6d8c4cb4d-6kj9t                            1/1     Running   1 (20h ago)    21h
kube-system                    coredns-6d8c4cb4d-wtdkh                            1/1     Running   1 (20h ago)    21h
kube-system                    etcd-server-88-21                                  1/1     Running   2 (20h ago)    27h
kube-system                    kube-apiserver-server-88-21                        1/1     Running   2 (20h ago)    27h
kube-system                    kube-controller-manager-server-88-21               1/1     Running   2 (20h ago)    27h
kube-system                    kube-proxy-hwh2c                                   1/1     Running   2 (20h ago)    27h
kube-system                    kube-proxy-pm6sp                                   1/1     Running   3 (113m ago)   27h
kube-system                    kube-scheduler-server-88-21                        1/1     Running   2 (20h ago)    27h
kube-system                    snapshot-controller-0                              1/1     Running   1 (113m ago)   20h

++++++++++++++ KubeSphere namespaces
kubesphere-controls-system     default-http-backend-696d6bf54f-5hxf2              1/1     Running   1 (113m ago)   20h
kubesphere-controls-system     kubectl-admin-b49cf5585-n6hzd                      1/1     Running   1 (113m ago)   124m
kubesphere-monitoring-system   alertmanager-main-0                                2/2     Running   2 (113m ago)   20h
kubesphere-monitoring-system   kube-state-metrics-645c64569c-2tflp                3/3     Running   6 (113m ago)   20h
kubesphere-monitoring-system   node-exporter-cmlfk                                2/2     Running   4 (20h ago)    21h
kubesphere-monitoring-system   node-exporter-rzhts                                2/2     Running   5 (113m ago)   21h
kubesphere-monitoring-system   notification-manager-deployment-7dd45b5b7d-fdt28   2/2     Running   2 (113m ago)   20h
kubesphere-monitoring-system   notification-manager-operator-8598775b-8vnbw       2/2     Running   2 (113m ago)   20h
kubesphere-monitoring-system   prometheus-k8s-0                                   2/2     Running   2 (113m ago)   125m
kubesphere-monitoring-system   prometheus-operator-57c78bd7fb-68qnq               2/2     Running   2 (113m ago)   20h
kubesphere-system              ks-apiserver-66cd784f8f-c9lgk                      1/1     Running   0              128m
kubesphere-system              ks-console-5c5676fb55-jfcdd                        1/1     Running   0              128m
kubesphere-system              ks-controller-manager-6d6b54464d-mrb59             1/1     Running   0              128m
kubesphere-system              ks-installer-c9655d997-5f4h4                       1/1     Running   1 (113m ago)   131m

+++++++++++++++ NFS dynamic storage namespace
nfs-provisioner                nfs-client-provisioner-94bcc8884-kstqp             1/1     Running   1 (113m ago)   142m
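
On a long listing it is easy to miss a pod that is not Running. The helper below is a minimal sketch, not part of the original notes: `not_running` is a hypothetical function name, and it assumes the default column order of `kubectl get pod -A` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Print "namespace/name status" for every pod whose STATUS is neither
# Running nor Completed. Reads `kubectl get pod -A` output on stdin.
not_running() {
  awk '
    NR == 1 && $1 == "NAMESPACE" { next }   # skip the header row if present
    $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }
  '
}
```

Usage: kubectl get pod -A | not_running    (empty output means every pod is Running or Completed)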

## Check: list all storage classes
[root@server-88-21 ~]# kubectl  get sc -A
NAME                   PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
nfs-client (default)   k8s-sigs.io/nfs-subdir-external-provisioner   Retain          WaitForFirstConsumer   false                  143m

## List namespaces
[root@server-88-21 k8s]# kubectl  get namespace
NAME                              STATUS   AGE
default                           Active   28h
kube-flannel                      Active   28h
kube-node-lease                   Active   28h
kube-public                       Active   28h
kube-system                       Active   28h
kubesphere-controls-system        Active   22h
kubesphere-monitoring-federated   Active   22h
kubesphere-monitoring-system      Active   22h
kubesphere-system                 Active   3h10m
nfs-provisioner                   Active   23h
test-project                      Active   34s

 
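### Console URL: http://192.168.88.22:30880/dashboard   (logging in with Google Chrome may be unresponsive; the 360 browser can be used instead)
 The default username/password is admin/P@88w0rd; the password was changed to Whlxhc__2020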

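#################################################   Enable the DevOps feature   #################################################
##
In the KubeSphere console, go to Cluster Management --> CRDs --> ClusterConfiguration --> ks-installer --> Edit YAML, and change enabled: false to true under devops; ks-installer then installs the DevOps components automatically.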
  devops:
    enabled: true
    jenkinsJavaOpts_MaxRAM: 2g
    jenkinsJavaOpts_Xms: 1200m
    jenkinsJavaOpts_Xmx: 1600m
    jenkinsMemoryLim: 2Gi
    jenkinsMemoryReq: 1500Mi
    jenkinsVolumeSize: 8Gi
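
The same switch can probably be flipped from the command line instead of the console. This is a sketch under assumptions: `devops_patch` and `enable_devops` are hypothetical helper names, and the patch path assumes the `devops` block sits directly under `spec` of the ClusterConfiguration named ks-installer in the kubesphere-system namespace.

```shell
# Build the JSON merge patch that sets spec.devops.enabled (default: true).
devops_patch() {
  local enabled="${1:-true}"
  printf '{"spec":{"devops":{"enabled":%s}}}' "$enabled"
}

# Apply the patch to the ks-installer ClusterConfiguration.
# Assumption: the resource lives in kubesphere-system, as in these notes.
enable_devops() {
  kubectl -n kubesphere-system patch clusterconfiguration ks-installer \
    --type merge -p "$(devops_patch true)"
}
```

Usage: run enable_devops on a node with kubectl access, then follow the installer log as described next.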

### Run the following command to check the KubeSphere installer log:

##kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f

localhost                  : ok=26   changed=15   unreachable=0    failed=0    skipped=12   rescued=0    ignored=0   
Start installing monitoring
Start installing multicluster
Start installing openpitrix
Start installing network
Start installing devops    # DevOps installation starts here
**************************************************
Waiting for all tasks to be completed ...
task openpitrix status is successful  (1/5)
task multicluster status is successful  (2/5)
task network status is successful  (3/5)
task monitoring status is successful  (4/5)
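
To see how far the installer has got without reading the whole log, the counter at the end of the "status is successful" lines can be extracted. A minimal sketch: `task_progress` and `install_finished` are hypothetical helper names, and the log format is assumed to match the samples in these notes.

```shell
# Print the most recent "done/total" counter found in an installer log on stdin.
task_progress() {
  grep 'status is successful' \
    | grep -o '([0-9][0-9]*/[0-9][0-9]*)' \
    | tail -n 1 \
    | tr -d '()'
}

# Succeed once the log on stdin contains the final welcome banner.
install_finished() {
  grep -q 'Welcome to KubeSphere!'
}
```

Usage: pipe the output of the kubectl logs command above (without -f) into task_progress.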

###[root@server-88-22 ~]# kubectl  get pod -A    # check pod startup status in all namespaces

Helm version   Supported Kubernetes versions
3.8.x          1.23.x - 1.20.x
3.7.x          1.22.x - 1.19.x
3.6.x          1.21.x - 1.18.x
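
The compatibility table above can be encoded as a small check. This is only a sketch covering the three rows listed; `helm_supports` is a hypothetical helper name, and both arguments are expected as major.minor (for example 3.8 and 1.23).

```shell
# Return 0 if the Helm minor release supports the Kubernetes minor release
# according to the three rows above, 1 if not, 2 if the Helm version is not listed.
helm_supports() {
  local helm="$1" k8s_minor="${2#1.}" lo hi
  case "$helm" in
    3.8) lo=20; hi=23 ;;
    3.7) lo=19; hi=22 ;;
    3.6) lo=18; hi=21 ;;
    *)   return 2 ;;
  esac
  [ "$k8s_minor" -ge "$lo" ] && [ "$k8s_minor" -le "$hi" ]
}
```

Usage: helm_supports 3.8 1.23 && echo "compatible"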

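### Troubleshooting:

# 1. Mark the StorageClass as the default
 kubectl patch sc local -p '{"metadata": {"annotations": {"storageclass.beta.kubernetes.io/is-default-class": "true"}}}'
Here local is the name of my StorageClass.

Alternatively, add the annotation when creating the StorageClass: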
  metadata:
    annotations:
      storageclass.beta.kubernetes.io/is-default-class: "true"
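
After patching, it is worth confirming that a StorageClass actually shows up as the default. A minimal sketch: `default_sc` is a hypothetical helper name, and it relies on kubectl printing "(default)" as a separate token after the name, as in the `kubectl get sc` output earlier in these notes.

```shell
# Print the name of every StorageClass flagged "(default)".
# Reads `kubectl get sc` output (with header row) on stdin.
default_sc() {
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}
```

Usage: kubectl get sc | default_sc    (expect exactly one name; no output means no default is set)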
 

## Error
Error from server (InternalError): Internal error occurred: failed calling webhook "users.iam.kubes
 Seen on k8s 1.26 with KubeSphere 3.4.1 after repeated install/uninstall cycles:
failed: [localhost] (item={'ns': 'kubesphere-system', 'kind': 'users.iam.kubesphere.io', 'resource': 'admin', 'release': 'ks-core'}) => {"ansible_loop_var": "item", "changed": true, "cmd": "/usr/local/bin/kubectl -n kubesphere-system annotate --overwrite users.iam.kubesphere.io admin meta.helm.sh/release-name=ks-core && /usr/local/bin/kubectl -n kubesphere-system annotate --overwrite users.iam.kubesphere.io admin meta.helm.sh/release-namespace=kubesphere-system && /usr/local/bin/kubectl -n kubesphere-system label --overwrite users.iam.kubesphere.io admin app.kubernetes.io/managed-by=Helm\n", "delta": "0:00:00.440257", "end": "2023-12-21 13:46:30.328877", "failed_when_result": true, "item": {"kind": "users.iam.kubesphere.io", "ns": "kubesphere-system", "release": "ks-core", "resource": "admin"}, "msg": "non-zero return code", "rc": 1, "start": "2023-12-21 13:46:29.888620", "stderr": "Error from server (InternalError): Internal error occurred: failed calling webhook \"users.iam.kubesphere.io\": failed to call webhook: Post \"https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s\": service \"ks-controller-manager\" not found", "stderr_lines": ["Error from server (InternalError): Internal error occurred: failed calling webhook \"users.iam.kubesphere.io\": failed to call webhook: Post \"https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s\": service \"ks-controller-manager\" not found"], "stdout": "", "stdout_lines": []}

#### Delete the leftover webhook configurations named in the error
kubectl get validatingwebhookconfigurations

NAME                                          WEBHOOKS   AGE
cluster.kubesphere.io                         1          5m17s
network.kubesphere.io                         1          5m17s
resourcesquotas.quota.kubesphere.io           1          5m17s
rulegroups.alerting.kubesphere.io             3          5m17s
storageclass-accessor.storage.kubesphere.io   1          5m17s
users.iam.kubesphere.io                       1          5m17s

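###
# --no-headers skips the NAME header row; the awk filter keeps only the *.kubesphere.io entries
kubectl delete validatingwebhookconfigurations $(kubectl get validatingwebhookconfigurations --no-headers | awk '/kubesphere\.io/ {print $1}')
### Then re-run the KubeSphere 3 installation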
 
 
## ks-apiserver stays in ContainerCreating
kubesphere-system              ks-apiserver                   0/1     ContainerCreating

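 Docker has requirements on the systemd version. On CentOS 7, systemd appears to be version 219, and the environment where I hit this problem was also on 219.
Restarting the services can help. Do not restart kubelet first: doing so leaves all pods in Pending state!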
 systemctl restart docker
 systemctl restart kubelet

#### Fix for namespace rook-ceph that cannot be deleted because it stays in Terminating:
First start kubectl proxy.
Then open another SSH session and run:
 

 kubectl get ns rook-ceph -o json > rook-ceph.yaml

Edit rook-ceph.yaml and delete the following three lines (the finalizers entry under spec), so that no finalizer remains:

"finalizers": [
    "kubernetes"
]

curl -k -H "Content-Type: application/json" -X PUT --data-binary @rook-ceph.yaml http://127.0.0.1:8001/api/v1/namespaces/rook-ceph/finalize
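
The manual edit can also be scripted. The sketch below is an awk-only approximation, not the author's method: `strip_finalizers` is a hypothetical helper name, and it relies on the pretty-printed layout that `kubectl get ... -o json` emits (one array element per line). It empties the finalizers array rather than deleting the lines, which should have the same effect. If jq is installed, a real JSON tool would be more robust.

```shell
# Replace every pretty-printed "finalizers": [ ... ] array with an empty one.
# Assumes kubectl's -o json layout: opening "[" ends the line, "]" closes it.
strip_finalizers() {
  awk '
    /"finalizers": \[[ \t]*$/ {
      sub(/\[[ \t]*$/, "[]")        # turn the opening bracket into an empty array
      held = $0; skipping = 1; next
    }
    skipping && /^[ \t]*\],?[ \t]*$/ {
      suffix = ($0 ~ /,/) ? "," : ""   # keep a trailing comma if one followed the array
      print held suffix; skipping = 0; next
    }
    skipping { next }                  # drop the array elements
    { print }
  '
}
```

Usage: kubectl get ns rook-ceph -o json | strip_finalizers > rook-ceph.yaml, then run the curl command above while kubectl proxy is still running.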