Reference: k8s Notes 15 (Ceph), from the ATCtoK8s (air traffic control intelligent O&M) tech blog on 51CTO

The CRUSH algorithm computes data storage locations to determine how to store and retrieve data. CRUSH allows Ceph clients to communicate with OSDs directly rather than through a centralized server or broker. With this algorithmically determined method of storing and retrieving data, Ceph avoids a single point of failure, a performance bottleneck, and a physical limit to its scalability. (To remove an OSD from the CRUSH hierarchy: ceph osd crush rm osd.<id>)
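A minimal sketch of what that removal looks like in practice (osd.5 is only an example id, not one from this cluster):

ceph osd crush tree          # inspect the current CRUSH hierarchy
ceph osd crush rm osd.5      # remove osd.5 from the CRUSH hierarchy
ceph osd tree                # osd.5 should no longer appear under its host bucket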

When an OSD is down, its contents may fall behind the current state of the other replicas in its placement groups. When the OSD comes back up, the contents of the placement groups must be updated to reflect the current state; during this period the OSD may show a recovering state. If the mon_osd_down_out_interval option is set to zero, the cluster never automatically marks a failed OSD out (and therefore never starts re-replicating its data on its own); instead the OSD has to be marked out manually (e.g. ceph osd out <osd-id>) to trigger recovery.
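A sketch of checking and acting on this behaviour from the toolbox (osd.3 is only an example id; ceph config get assumes a reasonably recent Ceph release):

ceph config get mon mon_osd_down_out_interval   # 0 means failed OSDs are never marked out automatically
ceph osd out osd.3                              # manually mark the failed OSD out so recovery/backfill can start
ceph -s                                         # PGs should move through recovering/backfilling to active+clean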

An OSD is either in the cluster (in) or out of the cluster (out), and it is either up and running (up) or down and not running (down). If an OSD is up, it may be either in the cluster (data can be read and written) or out of the cluster.

A new Ceph cluster requires: a Ceph configuration file and a monitor keyring.

The cluster.yaml parameter removeOSDsIfOutAndSafeToRemove: if true, the operator will remove the OSDs that are down and whose data has been restored to other OSDs. Managing nodes individually like this only applies when useAllNodes is set to false; nodes can then be added and removed over time by updating the Cluster CRD, for example with kubectl -n rook-ceph edit cephcluster rook-ceph.
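A minimal cluster.yaml excerpt illustrating that combination (node and device names are only examples):

spec:
  removeOSDsIfOutAndSafeToRemove: true
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - devices:
      - name: "sdb"
      name: "k8s-node01"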

The upgrade starts from Rook v1.6.3.

  • common resources and CRDs: automatically updated if you are upgrading via the Helm chart

Issue:

rook-ceph-operator-76948f86f7-44mhx                    0/1     CrashLoopBackOff
# kubectl -n rook-ceph describe pod rook-ceph-operator-76948f86f7-44mhx
failed to run operator: failed to run the controller-runtime manager: no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"
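This error means this Rook build still requests PodDisruptionBudget via policy/v1beta1, which Kubernetes 1.25+ no longer serves. A quick way to confirm which policy API versions the cluster offers:

kubectl api-versions | grep '^policy/'
# on a 1.25+ cluster this only returns policy/v1, so the operator image has to be upgraded to a Rook release that uses policy/v1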
  • Rook can automatically deploy and manage Ceph to provide self-managing, self-scaling, and self-healing storage services. The Rook operator does this by building on Kubernetes resources to deploy, configure, provision, scale, upgrade, and monitor Ceph.
  • Placing the ceph-mgr daemons on the same nodes as the mons (the daemons that maintain the maps of the cluster state, including the monitor map, manager map, OSD map, and CRUSH map; a Ceph cluster must contain at least three running monitors in order to be both redundant and highly available) is not mandatory, but it is almost always sensible. The Ceph Dashboard is a built-in, web-based Ceph management and monitoring application through which the various resources in the cluster can be inspected and managed. It is implemented as a Ceph Manager Daemon module.
Service: changing the rook-ceph-mgr-dashboard Service to NodePort gets automatically reconciled back to ClusterIP by the operator, so create a separate Service, e.g. rook-ceph-mgr-dashboard-np, instead.
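A sketch of such a Service, assuming the dashboard is served over plain HTTP on port 7000 (ssl: false); with ssl enabled the dashboard listens on 8443 instead:

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-np
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
spec:
  type: NodePort
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  ports:
  - name: dashboard
    port: 7000
    targetPort: 7000
    protocol: TCP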

1、The Rook Upgrades guide focuses on updating the Rook version of the management layer, while the Ceph upgrade guide focuses on updating the data layer.

  • Example from the docs: upgrading a live Rook cluster running v1.13.7 to version v1.14.0.
  •  Rook common resources. This includes modified privileges (RBAC) needed by the Operator.

2、Upgrade 1.6 to 1.7 (for example, when Rook v1.7.11 is released, the process of updating from v1.7.0):

git clone --single-branch --depth=1 --branch v1.7.11 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f common.yaml -f crds.yaml
kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=rook/ceph:v1.7.11
  • After this upgrade, rook-ceph-operator was running normally again.

3、Upgrade 1.7 to 1.8

git clone --single-branch --depth=1 --branch v1.8.10 https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl apply -f common.yaml -f crds.yaml
 kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=registry.cn-beijing.aliyuncs.com/mizy/ceph:v1.8.10
  • Rook v1.8 no longer supports Ceph Nautilus (14.2.x). Nautilus users must upgrade Ceph to Octopus (15.2.x) or Pacific (16.2.x) before upgrading to Rook v1.8.
# ceph -v
  ceph version 16.2.2 (e8f22dde28889481f4dda2beb8a07788204821d3) pacific (stable)
# kubectl -n rook-ceph get deployments rook-ceph-osd-9 -oyaml |grep ceph-version
  ceph-version: 15.2.11-0
(ceph -v only reports the version of the local ceph client binary; the ceph-version label on the daemon deployments shows what the cluster daemons are actually running, here still Octopus 15.2.11, so Ceph itself still has to be upgraded.)

4、Upgrade 1.8 to 1.9

4.1、Rook Upgrades

Rook v1.9 supports the following Ceph versions:

  • Ceph Quincy v17.2.0 or newer
  • Ceph Pacific v16.2.0 or newer
  • Ceph Octopus v15.2.0 or newer

!!!Rook v1.10 is planning to drop support for Ceph Octopus (15.2.x), so please consider upgrading your Ceph cluster. We recommend updating to v16.2.7 or newer. If you require updating to v16.2.0-v16.2.6, please see the v1.8 upgrade guide for a special upgrade consideration. (Ceph Docs, rook.github.io)
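Before picking the target version, the Ceph version actually running in the cluster can be checked from the toolbox (a sketch; rook-ceph-tools must already be deployed):

kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph versions       # versions of all running mon/mgr/osd/mds daemons
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd versions   # OSD daemons only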

1)deploy/examples/cluster.yaml
1.1
  image: registry.cn-beijing.aliyuncs.com/mizy/ceph:v16.2.10
1.2
 skipUpgradeChecks: true
1.3
 ssl: false
1.4
    useAllNodes: false
    useAllDevices: false 
1.5 
     count: 2
1.6 
Comment out k8s-node03:
    nodes:
    - devices:
      - name: "sdb"
      name: "k8s-node01"
    - devices:
      - name: "sdb"
      name: "k8s-node02"
    # - devices:
      # - name: "sdb"
      # name: "k8s-node03"
    - devices:
      - name: "sdb"
      - name: "sdc"
      name: "k8s-node04"
    - devices:
      - name: "sdb"
      - name: "sdc"
      name: "k8s-node05"
    - devices:
      - name: "sdb"
      name: "k8s-node06"
    - devices:
      - name: "sdb"
      name: "k8s-node07"
    - devices:
      - name: "sdb"
      name: "k8s-node08" 
2)rook/deploy/examples/operator.yaml
2.1
  ROOK_CSI_CEPH_IMAGE: "registry.cn-beijing.aliyuncs.com/mizy/cephcsi:v3.6.2"
  ROOK_CSI_REGISTRAR_IMAGE: "registry.cn-beijing.aliyuncs.com/mizy/csi-node-driver-registrar:v2.5.1"
  ROOK_CSI_PROVISIONER_IMAGE: "registry.cn-beijing.aliyuncs.com/mizy/csi-provisioner:v3.1.0"
  ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.cn-beijing.aliyuncs.com/mizy/csi-snapshotter:v6.0.1"
  ROOK_CSI_ATTACHER_IMAGE: "registry.cn-beijing.aliyuncs.com/mizy/csi-attacher:v3.4.0"  
2.2
  image: registry.cn-beijing.aliyuncs.com/mizy/ceph:v1.9.13
2.3  
  ROOK_ENABLE_DISCOVERY_DAEMON: "true"
git clone --single-branch --depth=1 --branch v1.9.13 https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl apply -f common.yaml -f crds.yaml
kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=registry.cn-beijing.aliyuncs.com/mizy/ceph:v1.9.13
Monitor the upgrade progress:
# watch --exec kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \tceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}'

Monitoring commands during the upgrade (note that two places in these commands use rook-version, rather than the ceph-version used for Ceph in section 4.2):

kubectl get cephcluster -n rook-ceph
watch --exec kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
kubectl -n rook-ceph get deployment -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{"rook-version="}{.metadata.labels.rook-version}{"\n"}{end}' | sort | uniq

4.2、Ceph Upgrades

NEW_CEPH_IMAGE='registry.cn-beijing.aliyuncs.com/mizy/ceph:v16.2.7-20220216'
kubectl -n rook-ceph patch CephCluster rook-ceph --type=merge -p "{\"spec\": {\"cephVersion\": {\"image\": \"$NEW_CEPH_IMAGE\"}}}" 
watch --exec kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \tceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' 
kubectl -n rook-ceph get deployment -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{"ceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' | sort | uniq

5、Upgrade 1.9 to 1.10


  • Support for Ceph Octopus (15.2.x) was removed. If you are running v15 you must upgrade to Ceph Pacific (v16) or Quincy (v17) before upgrading to Rook v1.10
  • The minimum supported version of Ceph-CSI is v3.6.0. You must update to at least this version of Ceph-CSI before or at the same time you update the Rook operator image to v1.10 (a check sketch follows this list).
  • !!!Before upgrading to K8s 1.25, ensure that you are running at least Rook v1.9.10, or v1.10.x. If you upgrade to K8s 1.25 before upgrading to v1.9.10 or newer, the Helm chart may be blocked from upgrading to newer versions of Rook. See https://github.com/rook/rook/issues/10826 for a possible workaround.
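A sketch of checking which CSI image is currently configured (key names as in operator.yaml above; if the key is unset the operator falls back to its built-in default, and app=csi-rbdplugin-provisioner is the label Rook applies to the provisioner pods):

kubectl -n rook-ceph get configmap rook-ceph-operator-config -o jsonpath='{.data.ROOK_CSI_CEPH_IMAGE}{"\n"}'
kubectl -n rook-ceph get pods -l app=csi-rbdplugin-provisioner -o jsonpath='{range .items[*]}{.spec.containers[*].image}{"\n"}{end}'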

6、Upgrade 1.10 to 1.11 (--branch v1.11.11)

6.1、Upgrade rook-ceph-tools: delete the existing deployment first, then apply the new one. (CSI Common Issues - Rook Ceph Documentation)

kubectl -n rook-ceph delete deploy rook-ceph-tools
kubectl apply -f toolbox.yaml
  ceph status
  ceph osd tree
  ceph osd status
  ceph osd df
  ceph osd utilization

6.2、Commands to check the status after the upgrade:

kubectl -n rook-ceph get deployments -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
kubectl -n rook-ceph get jobs -o jsonpath='{range .items[*]}{.metadata.name}{"  \tsucceeded: "}{.status.succeeded}{"      \trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
kubectl -n rook-ceph get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.status.phase}{"\t\t"}{.spec.containers[0].image}{"\t"}{.spec.initContainers[0]}{"\n"}{end}' && \
kubectl -n rook-ceph get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.status.phase}{"\t\t"}{.spec.containers[0].image}{"\t"}{.spec.initContainers[0].image}{"\n"}{end}'

6.3、Issue:

  • Whenever the OSD container restarts, Rook calls ceph-bluestore-tool bluefs-bdev-expand.
Hence presumably we have multiple ceph-osd instances using the same bluefs.
I can see at least two issues here. Both are likely to be caused by the containerized environment:
1) multiple instances are started. OSD startup mechanics in containers to be verified.
2) BlueStore's protection mechanism to control mutually exclusive OSD data usage works improperly in this setup.

6.4、Remove an OSD (Ceph Docs, rook.github.io)

  • Cluster CR (custom resource), typically called cluster.yaml
  • Update your CephCluster CR (i.e. cluster.yaml) so that the operator won't create an OSD on the device anymore (see the cluster.yaml sketch after these commands)
ceph osd out osd.<ID>
ceph status   # backfilling is done when all of the PGs are active+clean
ceph osd purge <ID> --yes-i-really-mean-it
ceph osd tree
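A sketch of the cluster.yaml side of the first step, assuming the OSD to remove is on device sdc of k8s-node04 from the nodes list above; the device entry is dropped (or commented out) so the operator will not recreate an OSD there, and the corresponding rook-ceph-osd deployment is scaled down before the purge:

    - devices:
      - name: "sdb"
      # - name: "sdc"    # removed so the operator no longer creates an OSD on this device
      name: "k8s-node04"

kubectl -n rook-ceph scale deployment rook-ceph-osd-<ID> --replicas=0   # stop the OSD pod before purging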

6.5、Issue: after upgrading rook-ceph, all the other IMAGES had been updated, but csi-cephfsplugin-provisioner and csi-rbdplugin-provisioner were still on the old ones. (Ceph CSI implements the Kubernetes CSI specification as a storage interface independent of the CO platform. Through its separate RBD and CephFS plugins it connects to a Ceph cluster and provides storage in both block (Block) and filesystem (Filesystem) modes, supporting features such as Provisioner, Attacher, and Resizer to cover the data-persistence needs of Kubernetes applications.) The cause was that the upgrade had only run these two commands:

kubectl apply -f common.yaml -f crds.yaml
kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=registry.cn-beijing.aliyuncs.com/mizy/ceph:v1.11.11
  • However, operator.yaml does not only change the image of deploy/rook-ceph-operator; it also contains the six CSI image parameters involved in this issue (ROOK_CSI_PROVISIONER_IMAGE, etc.), so kubectl apply -f operator.yaml must be run as well (after which the image addresses were correct; the verification sketch after the notes below can be used to confirm):
kubectl apply -f operator.yaml
1、common.yaml:	the common resources (ClusterRole, ClusterRoleBinding) required to start the operator and the Ceph cluster. These resources must be created before operator.yaml and cluster.yaml or their variants.
2、crds.yaml:	the necessary CRDs (kind: CustomResourceDefinition) that must be created before creating the Rook cluster. They must exist before cluster.yaml or its variants are created.
3、operator.yaml:
	1)、***ConfigMap (name: rook-ceph-operator-config): the "Rook Ceph Operator Config ConfigMap" (Rook-Ceph Operator configurations).
		This ConfigMap can be used to override the Rook-Ceph Operator configuration (it mainly holds the image addresses such as ROOK_CSI_RESIZER_IMAGE). If the same setting also exists as an environment variable in the operator Deployment, this ConfigMap takes precedence.
	2)、Deployment (rook-ceph-operator):
  • kubectl apply -f cluster.yaml also needs to be run, because cluster.yaml defines the settings of the rook-ceph cluster (kind: CephCluster) for the common production case, including the image quay.io/ceph/ceph:v18.2.4, useAllNodes: true, useAllDevices: true, and the storage resources: nodes configuration.
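After applying operator.yaml (and cluster.yaml where it changed), a generic way to confirm that every container in the namespace is now on the expected images:

kubectl -n rook-ceph get pods -o jsonpath='{range .items[*]}{.spec.containers[*].image}{"\n"}{end}' | tr ' ' '\n' | sort | uniq -c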

6.6、The Ceph Dashboard shows a yellow warning: RECENT_CRASH: 12 daemons have recently crashed

bash-4.4$ ceph crash ls-new        # also lists the 12 crashes
bash-4.4$ ceph crash archive 2024-08-04T23:57:22.177205Z_42303cad-c9a7-41e2-a87d-6ca0efb17ea7   # archives one entry, one fewer is listed
ceph crash archive-all             # archives all of them; the warning disappears

Yellow warning: [WRN] TELEMETRY_CHANGED: Telemetry requires re-opt-in. Running the following commands clears it.

bash-4.4$ ceph telemetry on  --license sharing-1-0
bash-4.4$ ceph telemetry enable channel perf

7、Upgrade 1.11 to 1.12 (v1.12.11)

  • CephCSI CephFS driver introduced a breaking change in v3.9.0. If any existing CephFS storageclass in the cluster has MountOptions parameter set, follow the steps mentioned in the CephCSI upgrade guide to ensure a smooth upgrade.
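A quick check for whether any existing StorageClass has mountOptions set (mountOptions is the standard top-level StorageClass field):

kubectl get storageclass -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.mountOptions}{"\n"}{end}'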

8、Upgrade 1.12 to 1.13 (v1.13.10)

  • The minimum supported version of Ceph is v17.2.0. If a lower version is currently deployed, upgrade Ceph before upgrading Rook.
  • CephCSI CephFS driver introduced a breaking change in v3.9.0. If any existing CephFS storageclass in the cluster has MountOptions parameter set, follow the steps mentioned in the CephCSI upgrade guide to ensure a smooth upgrade. This became the default CSI version in Rook v1.12.1, and may have already been resolved.
  • !!!Support for the admission controller has been removed. CRD validation is now enabled with Validating Admission Policies. Validating Admission Policy rules are ignored in Kubernetes v1.24 and lower. If the admission controller is enabled, it is advised to upgrade to Kubernetes v1.25 or higher before upgrading Rook. For more info, see https://github.com/rook/rook/pull/11532.
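If the admission controller had been enabled before, it may be worth checking for a leftover webhook configuration around the upgrade (a generic check; the exact resource name can differ per installation):

kubectl get validatingwebhookconfigurations | grep -i rook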