一、Background
This lab uses 5 CentOS 7.9 hosts running Ceph Octopus, deployed with 3 monitors and 3 managers. The whole deployment is recorded in detail, and the steps follow the official Ceph documentation. Each host has 2 NICs: the internal network (10.0.0.0/24) carries Ceph cluster traffic, and the external network (172.16.100.0/24) serves clients. Besides the system disk, each host has 2 local 50 GB disks used as Ceph data disks. The cluster layout is shown in the table below:
Host | Role | IP (public/cluster) |
ceph1 | mon,mgr,work,cephadm | 172.16.100.31/10.0.0.1 |
ceph2 | mon,mgr,work | 172.16.100.32/10.0.0.2 |
ceph3 | mon,mgr,work | 172.16.100.33/10.0.0.3 |
ceph4 | work | 172.16.100.34/10.0.0.4 |
ceph5 | work | 172.16.100.35/10.0.0.5 |
二、Minimum hardware requirements for Ceph
(See the official Ceph hardware recommendations; details are omitted here.)
三、Installation and deployment
1. Preparing each physical node before installation
Per Ceph's requirements, every node needs:
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as chrony or NTP)
- LVM2 for provisioning storage devices
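cephadm performs these checks itself during bootstrap, but they can be approximated up front with a small script (a sketch; `check_cmds` is a helper defined here, and the command names assume docker and chrony are the runtime and time-sync choices):

```shell
#!/bin/sh
# check_cmds: report any of the given commands that are not on PATH.
check_cmds() {
    missing=""
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || missing="$missing $c"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all prerequisites present"
}

# systemd, a container runtime, time synchronization, and LVM2
check_cmds systemctl docker chronyc lvcreate \
    || echo "install the missing packages before bootstrapping"
```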
Configure a Docker registry mirror to speed up image pulls:
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors": ["https://9lqbw5u6.mirror.aliyuncs.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker
Install the required components listed above, disable SELinux and iptables, and make sure the /etc/hosts file is identical on every physical node (steps omitted here). Also configure the Ceph yum repository on each host:
[root@ceph1 yum.repos.d]# cat ceph.repo
[Ceph]
name=Ceph
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.17/el7/x86_64/
enabled=1
gpgcheck=0
[Ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.17/el7/noarch/
enabled=1
gpgcheck=0
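For the /etc/hosts consistency mentioned above, the name/IP mappings would look like the fragment below (a sketch using the hostnames that appear in the later command output, ceph1 through ceph5, and the public addresses from the table; append the same lines on every node):

```
# /etc/hosts — identical on all five nodes
172.16.100.31 ceph1
172.16.100.32 ceph2
172.16.100.33 ceph3
172.16.100.34 ceph4
172.16.100.35 ceph5
```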
2. Bootstrap a new cluster
Because cephadm communicates with the other hosts in the cluster over SSH, it only needs to be installed on the first host:
yum install -y cephadm
Bootstrap the first monitor node of the cluster:
mkdir -p /etc/ceph
cephadm bootstrap --mon-ip 10.0.0.1
The command above has the following effects:
- Create a monitor and manager daemon for the new cluster on the local host.
- Generate a new SSH key for the Ceph cluster and add it to the root user's /root/.ssh/authorized_keys file.
- Write a minimal configuration file needed to communicate with the new cluster to /etc/ceph/ceph.conf.
- Write a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
- Write a copy of the public key to /etc/ceph/ceph.pub.
The default bootstrap parameters are adequate for most scenarios; for additional options see cephadm bootstrap -h. The bootstrap output is shown below:
[root@ceph1 ~]# mkdir -p /etc/ceph
[root@ceph1 ~]# cephadm bootstrap --mon-ip 10.0.0.1
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Verifying IP 10.0.0.1 port 3300 ...
Verifying IP 10.0.0.1 port 6789 ...
Mon IP 10.0.0.1 is in CIDR network 10.0.0.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:
URL: https://ceph1:8443/
User: admin
Password: n739xvn56e
You can access the Ceph CLI with:
sudo /usr/sbin/cephadm shell --fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
Bootstrap complete.
After bootstrap completes, you can see the generated configuration files and the containers now running on the host:
- ceph-mgr: the Ceph manager daemon
- ceph-mon: the Ceph monitor daemon
- ceph-crash: crash-report collection module
- prometheus: Prometheus monitoring component
- grafana: dashboard for visualizing monitoring data
- alertmanager: Prometheus alerting component
- node_exporter: Prometheus node-metrics collector
[root@ceph1 ~]# ll .ssh/
total 8
-rw------- 1 root root 595 Jul 3 09:47 authorized_keys
-rw-r--r-- 1 root root 704 Jul 1 23:45 known_hosts
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ll /etc/ceph/
total 12
-rw------- 1 root root 63 Jul 3 09:47 ceph.client.admin.keyring
-rw-r--r-- 1 root root 167 Jul 3 09:47 ceph.conf
-rw-r--r-- 1 root root 595 Jul 3 09:47 ceph.pub
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# cat /etc/ceph/ceph.conf
# minimal ceph.conf for 7629fef6-1943-11ee-9b2d-000c29dbfa5e
[global]
fsid = 7629fef6-1943-11ee-9b2d-000c29dbfa5e
mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3601c100e5af quay.io/ceph/ceph-grafana:6.7.4 "/bin/sh -c 'grafana…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-grafana.ceph1
a06674463341 quay.io/prometheus/alertmanager:v0.20.0 "/bin/alertmanager -…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-alertmanager.ceph1
b12b094b3aa7 quay.io/prometheus/prometheus:v2.18.1 "/bin/prometheus --c…" About a minute ago Up About a minute ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-prometheus.ceph1
72a29ed493f2 quay.io/prometheus/node-exporter:v0.18.1 "/bin/node_exporter …" 2 minutes ago Up 2 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-node-exporter.ceph1
ab0333624ecf quay.io/ceph/ceph:v15 "/usr/bin/ceph-crash…" 3 minutes ago Up 3 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-crash.ceph1
39c4584ad379 quay.io/ceph/ceph:v15 "/usr/bin/ceph-mgr -…" 5 minutes ago Up 5 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-mgr.ceph1.xxracu
50fbda18cd48 quay.io/ceph/ceph:v15 "/usr/bin/ceph-mon -…" 5 minutes ago Up 5 minutes ceph-7629fef6-1943-11ee-9b2d-000c29dbfa5e-mon.ceph1
3. Enable the Ceph CLI
Since only cephadm is installed on ceph1, cluster information cannot be viewed directly with the ceph command; use cephadm shell instead:
[root@ceph1 ~]# cephadm shell
Inferring fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Inferring config /var/lib/ceph/7629fef6-1943-11ee-9b2d-000c29dbfa5e/mon.ceph1/config
Using recent ceph image quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
[ceph: root@ceph1 /]#
[ceph: root@ceph1 /]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph1 (age 18m)
mgr: ceph1.xxracu(active, since 17m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Or run commands non-interactively:
[root@ceph1 ~]# cephadm shell -- ceph -s
Inferring fsid 7629fef6-1943-11ee-9b2d-000c29dbfa5e
Inferring config /var/lib/ceph/7629fef6-1943-11ee-9b2d-000c29dbfa5e/mon.ceph1/config
Using recent ceph image quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph1 (age 19m)
mgr: ceph1.xxracu(active, since 18m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
The official documentation recommends installing the ceph-common package instead, which contains all of the ceph commands:
[root@ceph1 ~]# yum install -y ceph-common.x86_64
After installation, use the ceph command to check component status:
[root@ceph1 ~]# ceph orch ps
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
alertmanager.ceph1 ceph1 running (19m) 9m ago 21m 0.20.0 quay.io/prometheus/alertmanager:v0.20.0 0881eb8f169f a06674463341
crash.ceph1 ceph1 running (21m) 9m ago 21m 15.2.17 quay.io/ceph/ceph:v15 93146564743f ab0333624ecf
grafana.ceph1 ceph1 running (19m) 9m ago 20m 6.7.4 quay.io/ceph/ceph-grafana:6.7.4 557c83e11646 3601c100e5af
mgr.ceph1.xxracu ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 39c4584ad379
mon.ceph1 ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 50fbda18cd48
node-exporter.ceph1 ceph1 running (19m) 9m ago 20m 0.18.1 quay.io/prometheus/node-exporter:v0.18.1 e5a616e4b9cf 72a29ed493f2
prometheus.ceph1 ceph1 running (19m) 9m ago 19m 2.18.1 quay.io/prometheus/prometheus:v2.18.1 de242295e225 b12b094b3aa7
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch ps --daemon-type mon
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
mon.ceph1 ceph1 running (23m) 9m ago 23m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 50fbda18cd48
With the URL and credentials from the bootstrap output, the dashboard can be opened in a browser.
4. Add the remaining nodes and adjust role counts
Install the cluster's public SSH key into the root user's authorized_keys file on each new host; ceph2 and ceph3 are shown below:
[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
root@ceph2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@ceph2'"
and check to make sure that only the key(s) you wanted were added.
[root@ceph1 ~]#
[root@ceph1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/etc/ceph/ceph.pub"
root@ceph3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@ceph3'"
and check to make sure that only the key(s) you wanted were added.
Tell the Ceph cluster about the new node with ceph orch host add. Note that python3 must be installed on the new host first (for example with yum install -y python3), otherwise the following error occurs:
[root@ceph1 ~]# ceph orch host add ceph2
Error EINVAL: Can't communicate with remote host `ceph2`, possibly because python3 is not installed there: cannot send (already closed?)
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch host add ceph2
Added host 'ceph2'
After all 5 hosts have joined the cluster, the mon daemons automatically scale out to 5 and the mgr daemons to 2.
To reduce the number of monitor nodes from the current 5 to 3 (and pin the mgr daemons likewise), use the following commands:
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 5 daemons, quorum ceph1,ceph3,ceph5,ceph4,ceph2 (age 17m)
mgr: ceph1.xxracu(active, since 4h), standbys: ceph2.zhftxe
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch apply mon ceph1,ceph2,ceph3
Scheduled mon update...
[root@ceph1 ~]# ceph orch apply mgr ceph1,ceph2,ceph3
Scheduled mgr update...
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 39m)
mgr: ceph1.xxracu(active, since 5h), standbys: ceph2.zhftxe, ceph3.wcysbm
osd: 3 osds: 3 up (since 9m), 3 in (since 9m)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
5. Deploy OSDs
List the storage devices on the cluster hosts:
[root@ceph1 ~]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
ceph1 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph1 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/fd0 hdd 4096 Unknown N/A N/A No
A device must satisfy all of the following conditions to be used as a Ceph OSD:
- The device must have no partitions.
- The device must not have any LVM state.
- The device must not be mounted.
- The device must not contain a file system.
- The device must not contain a Ceph BlueStore OSD.
- The device must be larger than 5 GB.
OSDs can be deployed in three ways: consume all available devices at once, specify individual devices, or use a YAML specification file:
- ceph orch apply osd --all-available-devices
- ceph orch daemon add osd *<host>*:*<device-path>*
- ceph orch apply osd -i spec.yml
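For the YAML variant, a minimal service specification might look like this (a sketch; the service_id and host pattern are illustrative, and `all: true` consumes every eligible device on the matched hosts):

```yaml
# spec.yml — OSD service specification (minimal sketch)
service_type: osd
service_id: default_drive_group
placement:
  host_pattern: 'ceph*'
data_devices:
  all: true
```

It is then applied with ceph orch apply osd -i spec.yml. In this lab the devices are added one by one instead: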
[root@ceph1 ~]# ceph orch daemon add osd ceph1:/dev/sdb
Created osd(s) 0 on host 'ceph1'
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch daemon add osd ceph2:/dev/sdb
Created osd(s) 1 on host 'ceph2'
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch daemon add osd ceph3:/dev/sdb
Created osd(s) 2 on host 'ceph3'
[root@ceph1 ~]#
[root@ceph1 ~]#
[root@ceph1 ~]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
ceph1 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph1 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph2 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph2 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph3 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph3 /dev/sdb hdd 53.6G Unknown N/A N/A No
ceph4 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph4 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdb hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/sdc hdd 53.6G Unknown N/A N/A Yes
ceph5 /dev/fd0 hdd 4096 Unknown N/A N/A No
[root@ceph1 ~]#
[root@ceph1 ~]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 30m)
mgr: ceph1.xxracu(active, since 5h), standbys: ceph2.zhftxe
osd: 3 osds: 3 up (since 44s), 3 in (since 44s)
task status:
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
6. Deploy MDSs
CephFS requires the additional mds daemon. Deploy it with the command below; here it is placed on 3 nodes:
ceph orch apply mds *<fs-name>* --placement="*<num-daemons>* [*<host1>* ...]"
After deployment, a new mds container is running on each target node, and mds information appears in the output of ceph -s:
[root@ceph1 ceph]# ceph orch apply mds cephfs --placement="3 ceph1 ceph2 ceph3"
Scheduled mds.cephfs update...
[root@ceph1 ceph]#
[root@ceph1 ceph]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 3d)
mgr: ceph1.xxracu(active, since 3d), standbys: ceph2.zhftxe, ceph3.wcysbm
mds: 3 up:standby
osd: 3 osds: 3 up (since 5d), 3 in (since 5d)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
7. Deploy RGWs
Ceph object storage requires the additional radosgw daemon. Note that radosgw daemons are configured through the monitor configuration database, not via ceph.conf or the command line. If the configuration is incomplete (usually in the client.rgw.<realmname>.<zonename> section), the radosgw daemons will start with default settings (e.g. binding to port 80).
ceph orch apply rgw *<realm-name>* *<zone-name>* --placement="*<num-daemons>* [*<host1>* ...]"
For example, to deploy 2 rgw daemons on ceph1 and ceph2, serving the myorg realm and the us-east-1 zone:
ceph orch apply rgw myorg us-east-1 --placement="2 ceph1 ceph2"
After deployment, a new rgw container is running on each target node, and rgw information appears in the output of ceph -s:
[root@ceph1 ceph]# ceph orch apply rgw myorg us-east-1 --placement="2 ceph1 ceph2"
Scheduled rgw.myorg.us-east-1 update...
[root@ceph1 ceph]#
[root@ceph1 ceph]# ceph -s
cluster:
id: 7629fef6-1943-11ee-9b2d-000c29dbfa5e
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2 (age 3d)
mgr: ceph1.xxracu(active, since 3d), standbys: ceph2.zhftxe, ceph3.wcysbm
mds: 3 up:standby
osd: 3 osds: 3 up (since 5d), 3 in (since 5d)
rgw: 2 daemons active (myorg.us-east-1.ceph1.bslmbn, myorg.us-east-1.ceph2.baypns)
task status:
data:
pools: 5 pools, 112 pgs
objects: 233 objects, 7.2 KiB
usage: 3.1 GiB used, 147 GiB / 150 GiB avail
pgs: 1.786% pgs not active
110 active+clean
2 clean+premerge+peered
progress:
PG autoscaler decreasing pool 6 PGs from 32 to 8 (2m)
[==============..............] (remaining: 2m)