一、Background
This lab uses 5 CentOS 7.9 hosts running Ceph Octopus, deployed with 3 monitors and 3 managers. The whole deployment is recorded in detail, and the steps follow the official Ceph documentation. Each host has 2 NICs: the internal network (10.0.0.0/24) carries Ceph cluster traffic, and the external network (172.16.100.0/24) serves clients. Besides the system disk, each host has 2 local 50 GB disks used as Ceph data disks. The cluster layout is shown in the table below:
Host | Role | IP (public/cluster) |
ceph1 | mon,mgr,work,cephadm | 172.16.100.31/10.0.0.1 |
ceph2 | mon,mgr,work | 172.16.100.32/10.0.0.2 |
ceph3 | mon,mgr,work | 172.16.100.33/10.0.0.3 |
ceph4 | work | 172.16.100.34/10.0.0.4 |
ceph5 | work | 172.16.100.35/10.0.0.5 |
二、Minimum hardware requirements for Ceph
(See the official Ceph hardware recommendations; details are omitted here.)
三、Installation and deployment
1. Preparing each physical node before installation
Per Ceph's requirements, every node needs:
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as chrony or NTP)
- LVM2 for provisioning storage devices
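cephadm performs these checks itself during bootstrap, but they can be approximated up front with a small script (a sketch; `check_cmds` is a helper defined here, and the command names assume docker and chrony are the runtime and time-sync choices):

```shell
#!/bin/sh
# check_cmds: report any of the given commands that are not on PATH.
check_cmds() {
    missing=""
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || missing="$missing $c"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all prerequisites present"
}

# systemd, a container runtime, time synchronization, and LVM2
check_cmds systemctl docker chronyc lvcreate \
    || echo "install the missing packages before bootstrapping"
```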
Configure a Docker registry mirror to speed up image pulls:
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors": ["https://9lqbw5u6.mirror.aliyuncs.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker
Install the required components listed above, disable SELinux and iptables, and make sure the /etc/hosts file is identical on every physical node (steps omitted here). Also configure the Ceph yum repository on each host:
[root@ceph1 yum.repos.d]# cat ceph.repo
[Ceph]
name=Ceph
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.17/el7/x86_64/
enabled=1
gpgcheck=0
[Ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.17/el7/noarch/
enabled=1
gpgcheck=0
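For the /etc/hosts consistency mentioned above, the name/IP mappings would look like the fragment below (a sketch using the hostnames that appear in the later command output, ceph1 through ceph5, and the public addresses from the table; append the same lines on every node):

```
# /etc/hosts — identical on all five nodes
172.16.100.31 ceph1
172.16.100.32 ceph2
172.16.100.33 ceph3
172.16.100.34 ceph4
172.16.100.35 ceph5
```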
2. Bootstrap a new cluster
Because cephadm communicates with the other hosts in the cluster over SSH, it only needs to be installed on the first host:
yum install -y cephadm
Bootstrap the first monitor node of the cluster:
mkdir -p /etc/ceph
cephadm bootstrap --mon-ip 10.0.0.1
The command above has the following effects:
- Create a monitor and manager daemon for the new cluster on the local host.
- Generate a new SSH key for the Ceph cluster and add it to the root user's /root/.ssh/authorized_keys file.
- Write a minimal configuration file needed to communicate with the new cluster to /etc/ceph/ceph.conf.
- Write a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
- Write a copy of the public key to /etc/ceph/ceph.pub.
The default bootstrap parameters are adequate for most scenarios; for additional options see cephadm bootstrap -h. The bootstrap output is shown below:
[root@ceph1 ~]# mkdir -p /etc/ceph
[root@ceph1 ~]# cephadm bootstrap --mon-ip 10.0.0.1
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Verifying IP 10.0.0.1 port 3300 ...
Verifying IP 10.0.0.1 port 6789 ...
Mon IP 10.0.0.1 is in CIDR network 10.0.0.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:
URL: https://ceph1:8443/
User: admin
Password: n739xvn56e
You can access the Ceph CLI with:
sudo /usr/sbin/cephadm shell --fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
Bootstrap complete.
After bootstrap completes, you can see the generated configuration files and the containers now running on the host:
- ceph-mgr: the Ceph manager daemon
- ceph-mon: the Ceph monitor daemon
- ceph-crash: crash-report collection module
- prometheus: Prometheus monitoring component
- grafana: dashboard for visualizing monitoring data
- alertmanager: Prometheus alerting component
- node_exporter: Prometheus node-metrics collector
[root@ceph1 ~]# ll .ssh/
total 8
-rw------- 1 root root 595 Jul 3 09:47 authorized_keys
-rw-r--r-- 1 root root 704 Jul 1 23:45 known_hosts
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ll /etc/ceph/
total 12
-rw------- 1 root root 63 Jul 3 09:47 ceph.client.admin.keyring
-rw-r--r-- 1 root root 167 Jul 3 09:47 ceph.conf
-rw-r--r-- 1 root root 595 Jul 3 09:47 ceph.pub
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# cat /etc/ceph/ceph.conf
# minimal ceph.conf for 7629fef6-1943-11ee-9b2d-000c29dbfa5e
[global]
fsid = 7629fef6-1943-11ee-9b2d-000c29dbfa5e
mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3601c100e5af quay.io/ceph/ceph-grafana:6.7.4 "/bin/sh -c 'grafana…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-grafana.ceph1
a06674463341 quay.io/prometheus/alertmanager:v0.20.0 "/bin/alertmanager -…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-alertmanager.ceph1
b12b094b3aa7 quay.io/prometheus/prometheus:v2.18.1 "/bin/prometheus --c…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-prometheus.ceph1
72a29ed493f2 quay.io/prometheus/node-exporter:v0.18.1 "/bin/node_exporter …" 2 minutes ago Up 2 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-node-exporter.ceph1
ab0333624ecf quay.io/ceph/ceph:v15 "/usr/bin/ceph-crash…" 3 minutes ago Up 3 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-crash.ceph1
39c4584ad379 quay.io/ceph/ceph:v15 "/usr/bin/ceph-mgr -…" 5 minutes ago Up 5 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-mgr.ceph1.xxracu
50fbda18cd48 quay.io/ceph/ceph:v15 "/usr/bin/ceph-mon -…" 5 minutes ago Up 5 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-mon.ceph1
3. Enable the Ceph CLI
Since only cephadm is installed on ceph1, cluster information cannot be viewed directly with the ceph command; use cephadm shell instead:
[root@ceph1 ~]# cephadm shell
Inferring fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Inferring config /var/lib/ceph/7629fef6-1943-11ee-9b2d-000c29dbfa5e/mon.ceph1/config
Using recent ceph image quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
[ceph: root@ceph1 /]#
[ceph: root@ceph1 /]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph1 (age 18m)
mgr: ceph1.xxracu(active, since 17m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Or run commands non-interactively:
[root@ceph1 ~]# cephadm shell -- ceph -s
Inferring fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Inferring config /var/lib/ceph/7629fef6-1943-11ee-9b2d-000c29dbfa5e/mon.ceph1/config
Using recent ceph image quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph1 (age 19m)
mgr: ceph1.xxracu(active, since 18m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
The official documentation recommends installing the ceph-common package instead, which contains all of the ceph commands:
[root@ceph1 ~]# yum install -y ceph-common.x86_64
After installation, use the ceph command to check component status:
[root@ceph1 ~]# ceph orch ps
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
alertmanager.ceph1 ceph1 running (19m) 9m ago 21m 0.20.0 quay.io/prometheus/alertmanager:v0.20.0 0881eb8f169f a06674463341
crash.ceph1 ceph1 running (21m) 9m ago 21m 15.2.17 quay.io/ceph/ceph:v15 93146564743f ab0333624ecf
grafana.ceph1 ceph1 running (19m) 9m ago 20m 6.7.4 quay.io/ceph/ceph-grafana:6.7.4 557c83e11646 3601c100e5af
mgr.ceph1.xxracu ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 39c4584ad379
mon.ceph1 ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 50fbda18cd48
node-exporter.ceph1 ceph1 running (19m) 9m ago 20m 0.18.1 quay.io/prometheus/node-exporter:v0.18.1 e5a616e4b9cf 72a29ed493f2
prometheus.ceph1 ceph1 running (19m) 9m ago 19m 2.18.1 quay.io/prometheus/prometheus:v2.18.1 de242295e225 b12b094b3aa7
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch ps --daemon-type mon
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
mon.ceph1 ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 50fbda18cd48
With the URL and credentials from the bootstrap output, the dashboard can be opened in a browser.
4. Add the remaining nodes and adjust role counts
Install the cluster's public SSH key into the root user's authorized_keys file on each new host; ceph2 and ceph3 are shown below:
[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
root@ceph2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@ceph2'"
and check to make sure that only the key(s) you wanted were added.
[root@ceph1 ~]#
[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
root@ceph3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@ceph3'"
and check to make sure that only the key(s) you wanted were added.
Tell the Ceph cluster about the new node with ceph orch host add. Note that python3 must be installed on the new host first (for example with yum install -y python3), otherwise the following error occurs:
[root@ceph1 ~]# ceph orch host add ceph2
Error EINVAL: Can't communicate with remote host `ceph2`, possibly because python3 is not installed there: cannot send (already closed?)
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch host add ceph2
Added host 'ceph2'
After all 5 hosts have joined the cluster, the mon daemons automatically scale out to 5 and the mgr daemons to 2.
To reduce the number of monitor nodes from the current 5 to 3 (and pin the mgr daemons likewise), use the following commands:
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 5 daemons, quorum ceph1,ceph3,ceph5,ceph4,ceph2 (age 17m)
mgr: ceph1.xxracu(active, since 4h), standbys: ceph2.zhftxe
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch apply mon ceph1,ceph2,ceph3
Scheduled mon update...
[root@ceph1 ~]# ceph orch apply mgr ceph1,ceph2,ceph3
Scheduled mgr update...
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 39m)
mgr: ceph1.xxracu(active, since 5h), standbys: ceph2.zhftxe, ceph3.wcysbm
osd: 3 osds: 3 up (since 9m), 3 in (since 9m)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
5. Deploy OSDs
List the storage devices on the cluster hosts:
[root@ceph1 ~]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
ceph1 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph1 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/fd0 hdd 4096 Unknown N/A N/A No
A device must satisfy all of the following conditions to be used as a Ceph OSD:
- The device must have no partitions.
- The device must not have any LVM state.
- The device must not be mounted.
- The device must not contain a file system.
- The device must not contain a Ceph BlueStore OSD.
- The device must be larger than 5 GB.
OSDs can be deployed in three ways: consume all available devices at once, specify individual devices, or use a YAML specification file:
- ceph orch apply osd --all-available-devices
- ceph orch daemon add osd *<host>*:*<device-path>*
- ceph orch apply osd -i spec.yml
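For the YAML variant, a minimal service specification might look like this (a sketch; the service_id and host pattern are illustrative, and `all: true` consumes every eligible device on the matched hosts):

```yaml
# spec.yml — OSD service specification (minimal sketch)
service_type: osd
service_id: default_drive_group
placement:
  host_pattern: 'ceph*'
data_devices:
  all: true
```

It is then applied with ceph orch apply osd -i spec.yml. In this lab the devices are added one by one instead: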
[root@ceph1 ~]# ceph orch daemon add osd ceph1:/dev/sdb
Created osd(s) 0 on host 'ceph1'
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch daemon add osd ceph2:/dev/sdb
Created osd(s) 1 on host 'ceph2'
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch daemon add osd ceph3:/dev/sdb
Created osd(s) 2 on host 'ceph3'
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
ceph1 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph1 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph2 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph3 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph4 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/fd0 hdd 4096 Unknown N/A N/A No
[root@ceph1 ~]#
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 30m)
mgr: ceph1.xxracu(active, since 5h), standbys: ceph2.zhftxe
osd: 3 osds: 3 up (since 44s), 3 in (since 44s)
task status:
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
6. Deploy MDSs
CephFS requires the additional mds daemon. Deploy it with the command below; here it is placed on 3 nodes:
ceph orch apply mds *<fs-name>* --placement="*<num-daemons>* [*<host1>* ...]"
After deployment, a new mds container is running on each target node, and mds information appears in the output of ceph -s:
[root@ceph1 ceph]# ceph orch apply mds cephfs --placement="3 ceph1 ceph2 ceph3"
Scheduled mds.cephfs update...
[root@ceph1 ceph]#
[root@ceph1 ceph]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 3d)
mgr: ceph1.xxracu(active, since 3d), standbys: ceph2.zhftxe, ceph3.wcysbm
mds: 3 up:standby
osd: 3 osds: 3 up (since 5d), 3 in (since 5d)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
7. Deploy RGWs
Ceph object storage requires the additional radosgw daemon. Note that radosgw daemons are configured through the monitor configuration database, not via ceph.conf or the command line. If the configuration is incomplete (usually in the client.rgw.<realmname>.<zonename> section), the radosgw daemons will start with default settings (e.g. binding to port 80).
ceph orch apply rgw *<realm-name>* *<zone-name>* --placement="*<num-daemons>* [*<host1>* ...]"
For example, to deploy 2 rgw daemons on ceph1 and ceph2, serving the myorg realm and the us-east-1 zone:
ceph orch apply rgw myorg us-east-1 --placement="2 ceph1 ceph2"
After deployment, a new rgw container is running on each target node, and rgw information appears in the output of ceph -s:
[root@ceph1 ceph]# ceph orch apply rgw myorg us-east-1 --placement="2 ceph1 ceph2"
Scheduled rgw.myorg.us-east-1 update...
[root@ceph1 ceph]#
[root@ceph1 ceph]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 3d)
mgr: ceph1.xxracu(active, since 3d), standbys: ceph2.zhftxe, ceph3.wcysbm
mds: 3 up:standby
osd: 3 osds: 3 up (since 5d), 3 in (since 5d)
rgw: 2 daemons active (myorg.us-east-1.ceph1.bslmbn, myorg.us-east-1.ceph2.baypns)
task status:
data:
pools: 5 pools, 112 pgs
objects: 233 objects, 7.2 KiB
usage: 3.1 GiB used, 147 GiB / 150 GiB avail
pgs: 1.786% pgs not active
110 active+clean
2 clean+premerge+peered
progress:
PG autoscaler decreasing pool 6 PGs from 32 to 8 (2m)
[==============..............] (remaining: 2m)