1. Lab environment
System version: Ubuntu 18.04.5 LTS
Kernel version: 4.15.0-112-generic
Ceph version: pacific/16.2.5
Host allocation:
# Deployment server (ceph-deploy)
172.168.32.101/10.0.0.101 ceph-deploy
# Two ceph-mgr management servers
172.168.32.102/10.0.0.102 ceph-mgr01
172.168.32.103/10.0.0.103 ceph-mgr02
# Three servers as Ceph cluster mon (monitor) servers; each server can reach the cluster network of the Ceph cluster
172.168.32.104/10.0.0.104 ceph-mon01
172.168.32.105/10.0.0.105 ceph-mon02
172.168.32.106/10.0.0.106 ceph-mon03
# Four servers as Ceph cluster OSD storage servers; each has two networks (the public network for client access, the cluster network for cluster management and data replication) and three or more disks
172.168.32.107/10.0.0.107 ceph-node01
172.168.32.108/10.0.0.108 ceph-node02
172.168.32.109/10.0.0.109 ceph-node03
172.168.32.110/10.0.0.110 ceph-node04
# Disk layout
# /dev/sdb /dev/sdc /dev/sdd /dev/sde   # 20G each
2. System initialization
1) Switch all nodes to the Tsinghua mirror
cat >/etc/apt/sources.list<<EOF
# Source-code mirrors are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
EOF
2) Install common packages on all nodes
apt install -y iproute2 ntpdate tcpdump telnet traceroute nfs-kernel-server nfs-common lrzsz tree openssl libssl-dev libpcre3 libpcre3-dev zlib1g-dev gcc openssh-server iotop unzip zip openjdk-8-jdk
3) Kernel parameters on all nodes
cat >/etc/sysctl.conf <<EOF
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maximum size of a message queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# TCP kernel parameters
net.ipv4.tcp_mem = 786432 1048576 1572864
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1

# socket buffer
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 20480
net.core.optmem_max = 81920

# TCP conn
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 15

# tcp conn reuse
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_max_tw_buckets = 20000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syncookies = 1

# keepalive conn
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.ip_local_port_range = 10001 65000

# swap
vm.overcommit_memory = 0
vm.swappiness = 10

#net.ipv4.conf.eth1.rp_filter = 0
#net.ipv4.conf.lo.arp_ignore = 1
#net.ipv4.conf.lo.arp_announce = 2
#net.ipv4.conf.all.arp_ignore = 1
#net.ipv4.conf.all.arp_announce = 2
EOF

# Apply the settings
sysctl -p
4) Resource limits (limits.conf) on all nodes
cat > /etc/security/limits.conf <<EOF
root soft core unlimited
root hard core unlimited
root soft nproc 1000000
root hard nproc 1000000
root soft nofile 1000000
root hard nofile 1000000
root soft memlock 32000
root hard memlock 32000
root soft msgqueue 8192000
root hard msgqueue 8192000
* soft core unlimited
* hard core unlimited
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock 32000
* hard memlock 32000
* soft msgqueue 8192000
* hard msgqueue 8192000
EOF
5) Time synchronization on all nodes
# Install and start cron
apt install cron -y
systemctl status cron.service

# Sync the time once
/usr/sbin/ntpdate time1.aliyun.com &> /dev/null && hwclock -w

# Sync the time every 5 minutes
echo "*/5 * * * * /usr/sbin/ntpdate time1.aliyun.com &> /dev/null && hwclock -w" >> /var/spool/cron/crontabs/root
6) /etc/hosts on all nodes
cat >>/etc/hosts<<EOF
172.168.32.101 ceph-deploy
172.168.32.102 ceph-mgr01
172.168.32.103 ceph-mgr02
172.168.32.104 ceph-mon01
172.168.32.105 ceph-mon02
172.168.32.106 ceph-mon03
172.168.32.107 ceph-node01
172.168.32.108 ceph-node02
172.168.32.109 ceph-node03
172.168.32.110 ceph-node04
EOF
7) Install Python 2 on all nodes
Python 2.7 is required when initializing Ceph with ceph-deploy.
sudo apt install python2.7 -y
sudo ln -sv /usr/bin/python2.7 /usr/bin/python2
8) Install ceph-common on all nodes
Used for distributing and using the Ceph admin keyring.
sudo apt install ceph-common -y
3. Ceph deployment
1) Configure the Ceph apt repository on all nodes and import the release key
# Configure the Ceph repository
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" | sudo tee -a /etc/apt/sources.list

# Import the release key
wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -

# Refresh the package index
sudo apt update
2) Create a ceph user on all nodes and allow it to run privileged commands via sudo:
It is recommended to deploy and run the Ceph cluster as a dedicated, unprivileged user; the user only needs to be able to run privileged commands non-interactively via sudo. Newer versions of ceph-deploy accept any user that can run sudo, including root, but a regular user such as ceph, cephuser or cephadmin is still recommended for managing the cluster.
# Create the ceph group and user, and set the ceph user's password
groupadd -r -g 2021 ceph && useradd -r -m -s /bin/bash -u 2021 -g 2021 ceph && echo ceph:123456 | chpasswd

# Allow the ceph user to run privileged commands via sudo without a password
echo "ceph ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
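As a hedged alternative (not part of the original run), the same sudo rule can go into a drop-in file and be syntax-checked, which avoids editing /etc/sudoers directly; the file name /etc/sudoers.d/ceph is only an example:

# Optional: use a sudoers drop-in instead of appending to /etc/sudoers
echo "ceph ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph
visudo -c    # validate the syntax of all sudoers files before logging out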
3) Configure passwordless SSH login:
Allow the ceph-deploy node to log in to each ceph node/mon/mgr node non-interactively: generate a key pair for the ceph user on the ceph-deploy node, then distribute the public key to the ceph user on each managed node.
#(1) Generate an SSH key pair
ceph@ceph-deploy:/tmp$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ceph/.ssh/id_rsa):
Created directory '/home/ceph/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ceph/.ssh/id_rsa.
Your public key has been saved in /home/ceph/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Oq0Vh0Do/VUVklh3U58XNgNNkIfCIPAiXFw+ztEhbqM ceph@ceph-deploy
The key's randomart image is:
+---[RSA 2048]----+
| .+++ oooo=@O+|
| . oo+ + ooo=.+B|
| + o.O . .. ..o|
| o B.+.. .|
| E +S.. |
| o.o |
| o o |
| + |
| . |
+----[SHA256]-----+

#(2) Install sshpass
ceph@ceph-deploy:/tmp$ sudo apt install sshpass

#(3) Key-distribution script for the ceph user on the ceph-deploy node
# The heredoc delimiter is quoted ('EOF') so that ${node} and $? are not expanded while the file is written
cat >>/tmp/ssh_fenfa.sh<<'EOF'
#!/bin/bash
# Target host list
IP="
172.168.32.101
172.168.32.102
172.168.32.103
172.168.32.104
172.168.32.105
172.168.32.106
172.168.32.107
172.168.32.108
172.168.32.109
172.168.32.110"
for node in ${IP};do
  sudo sshpass -p 123456 ssh-copy-id ceph@${node} -o StrictHostKeyChecking=no &> /dev/null
  if [ $? -eq 0 ];then
    echo "${node}----> key distribution succeeded"
  else
    echo "${node}----> key distribution failed"
  fi
done
EOF

#(4) Distribute the SSH keys with the script
ceph@ceph-deploy:/tmp$ sudo bash ssh_fenfa.sh
172.168.32.101----> key distribution succeeded
172.168.32.102----> key distribution succeeded
172.168.32.103----> key distribution succeeded
172.168.32.104----> key distribution succeeded
172.168.32.105----> key distribution succeeded
172.168.32.106----> key distribution succeeded
172.168.32.107----> key distribution succeeded
172.168.32.108----> key distribution succeeded
172.168.32.109----> key distribution succeeded
172.168.32.110----> key distribution succeeded
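A quick check (a sketch, not taken from the original run) that the ceph user can now reach every node without a password; the host list mirrors the one in the script above:

# Each host should print its hostname without prompting for a password
for node in 172.168.32.101 172.168.32.102 172.168.32.103 172.168.32.104 172.168.32.105 \
            172.168.32.106 172.168.32.107 172.168.32.108 172.168.32.109 172.168.32.110; do
  ssh -o BatchMode=yes ceph@${node} hostname
done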
4) Install the ceph-deploy tool on the ceph-deploy node
ceph@ceph-deploy:~$ sudo apt-cache madison ceph-deploy
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main i386 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
ceph@ceph-deploy:~$ sudo apt install ceph-deploy
5) Initialize the mon node
Initialize the mon node from the admin (ceph-deploy) node.
ceph@ceph-deploy:~$ mkdir ceph-cluster   # holds the cluster's initialization configuration
ceph@ceph-deploy:~$ cd ceph-cluster/
ceph@ceph-deploy:~/ceph-cluster$
For now only ceph-mon01 is initialized; ceph-mon02 and ceph-mon03 will be added manually after the cluster deployment is complete.
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy new --cluster-network 10.0.0.0/16 --public-network 172.168.0.0/16 ceph-mon01
# The following is the initialization output for ceph-mon01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy new --cluster-network 10.0.0.0/16 --public-network 172.168.0.0/16 ceph-mon01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f5247421dc0>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  ssh_copykey : True
[ceph_deploy.cli][INFO ]  mon : ['ceph-mon01']
[ceph_deploy.cli][INFO ]  func : <function new at 0x7f52446d6ad0>
[ceph_deploy.cli][INFO ]  public_network : 172.168.0.0/16
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  cluster_network : 10.0.0.0/16
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.cli][INFO ]  fsid : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-mon01][DEBUG ] connected to host: ceph-deploy
[ceph-mon01][INFO ] Running command: ssh -CT -o BatchMode=yes ceph-mon01
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO ] will connect again with password prompt
root@ceph-mon01's password:
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][WARNIN] .ssh/authorized_keys does not exist, will skip adding keys
root@ceph-mon01's password:
root@ceph-mon01's password:
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] find the location of an executable
[ceph-mon01][INFO ] Running command: /bin/ip link show
[ceph-mon01][INFO ] Running command: /bin/ip addr show
[ceph-mon01][DEBUG ] IP addresses found: [u'172.168.32.104', u'10.0.0.104']
[ceph_deploy.new][DEBUG ] Resolving host ceph-mon01
[ceph_deploy.new][DEBUG ] Monitor ceph-mon01 at 172.168.32.104
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-mon01']
[ceph_deploy.new][DEBUG ] Monitor addrs are [u'172.168.32.104']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
Verify the result of the initialization:
ceph@ceph-deploy:~/ceph-cluster$ ll
total 24
drwxrwxr-x 2 ceph ceph 75 Aug 17 22:02 ./
drwxr-xr-x 6 ceph ceph 178 Aug 17 21:58 ../
-rw-r--r-- 1 root root 264 Aug 17 22:02 ceph.conf              # auto-generated configuration file
-rw-r--r-- 1 root root 14190 Aug 17 22:02 ceph-deploy-ceph.log  # initialization log
-rw------- 1 root root 73 Aug 17 22:02 ceph.mon.keyring        # keyring used for internal authentication between ceph mon nodes

ceph@ceph-deploy:~/ceph-cluster$ cat ceph.conf
[global]
fsid = f0e7c394-989b-4803-86c3-5557ae25e814       # Ceph cluster ID
public_network = 172.168.0.0/16
cluster_network = 10.0.0.0/16
mon_initial_members = ceph-mon01                  # more mon nodes can be added, separated by commas
mon_host = 172.168.32.104
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
6) Initialize the ceph-node servers
Initialize the four ceph-node servers from the ceph-deploy node:
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node01 ceph-node02 ceph-node03 ceph-node04
This step serially configures the Ceph repository on each of the specified ceph node servers and installs the packages Ceph needs, one host at a time.
#......
[ceph-node03][DEBUG ] The following additional packages will be installed:
[ceph-node03][DEBUG ]   ceph-base ceph-common ceph-mgr ceph-mgr-modules-core libaio1 libbabeltrace1
[ceph-node03][DEBUG ]   libcephfs2 libdw1 libgoogle-perftools4 libibverbs1 libjaeger libjs-jquery
[ceph-node03][DEBUG ]   libleveldb1v5 liblttng-ust-ctl4 liblttng-ust0 liblua5.3-0 libnl-route-3-200
[ceph-node03][DEBUG ]   liboath0 librabbitmq4 librados2 libradosstriper1 librbd1 librdkafka1
[ceph-node03][DEBUG ]   librdmacm1 librgw2 libsnappy1v5 libtcmalloc-minimal4 liburcu6
[ceph-node03][DEBUG ]   python-pastedeploy-tpl python3-bcrypt python3-bs4 python3-ceph-argparse
[ceph-node03][DEBUG ]   python3-ceph-common python3-cephfs python3-cherrypy3 python3-dateutil
[ceph-node03][DEBUG ]   python3-distutils python3-jwt python3-lib2to3 python3-logutils python3-mako
[ceph-node03][DEBUG ]   python3-markupsafe python3-paste python3-pastedeploy python3-pecan
[ceph-node03][DEBUG ]   python3-prettytable python3-rados python3-rbd python3-rgw
[ceph-node03][DEBUG ]   python3-simplegeneric python3-singledispatch python3-tempita
[ceph-node03][DEBUG ]   python3-waitress python3-webob python3-webtest python3-werkzeug
[ceph-node03][DEBUG ] Suggested packages:
[ceph-node03][DEBUG ]   python3-influxdb python3-crypto python3-beaker python-mako-doc httpd-wsgi
[ceph-node03][DEBUG ]   libapache2-mod-python libapache2-mod-scgi libjs-mochikit python-pecan-doc
[ceph-node03][DEBUG ]   python-waitress-doc python-webob-doc python-webtest-doc ipython3
[ceph-node03][DEBUG ]   python3-lxml python3-termcolor python3-watchdog python-werkzeug-doc
[ceph-node03][DEBUG ] Recommended packages:
[ceph-node03][DEBUG ]   ntp | time-daemon ceph-fuse ceph-mgr-dashboard ceph-mgr-diskprediction-local
[ceph-node03][DEBUG ]   ceph-mgr-k8sevents ceph-mgr-cephadm nvme-cli smartmontools ibverbs-providers
[ceph-node03][DEBUG ]   javascript-common python3-lxml python3-routes python3-simplejson
[ceph-node03][DEBUG ]   python3-pastescript python3-pyinotify
[ceph-node03][DEBUG ] The following NEW packages will be installed:
[ceph-node03][DEBUG ]   ceph ceph-base ceph-common ceph-mds ceph-mgr ceph-mgr-modules-core ceph-mon
[ceph-node03][DEBUG ]   ceph-osd libaio1 libbabeltrace1 libcephfs2 libdw1 libgoogle-perftools4
[ceph-node03][DEBUG ]   libibverbs1 libjaeger libjs-jquery libleveldb1v5 liblttng-ust-ctl4
[ceph-node03][DEBUG ]   liblttng-ust0 liblua5.3-0 libnl-route-3-200 liboath0 librabbitmq4 librados2
[ceph-node03][DEBUG ]   libradosstriper1 librbd1 librdkafka1 librdmacm1 librgw2 libsnappy1v5
[ceph-node03][DEBUG ]   libtcmalloc-minimal4 liburcu6 python-pastedeploy-tpl python3-bcrypt
[ceph-node03][DEBUG ]   python3-bs4 python3-ceph-argparse python3-ceph-common python3-cephfs
[ceph-node03][DEBUG ]   python3-cherrypy3 python3-dateutil python3-distutils python3-jwt
[ceph-node03][DEBUG ]   python3-lib2to3 python3-logutils python3-mako python3-markupsafe
[ceph-node03][DEBUG ]   python3-paste python3-pastedeploy python3-pecan python3-prettytable
[ceph-node03][DEBUG ]   python3-rados python3-rbd python3-rgw python3-simplegeneric
[ceph-node03][DEBUG ]   python3-singledispatch python3-tempita python3-waitress python3-webob
[ceph-node03][DEBUG ]   python3-webtest python3-werkzeug radosgw
#......
7) Configure the mon node and generate and synchronize the keys
Install the ceph-mon package on each mon node, then initialize the mon node from ceph-deploy; additional mon nodes can be added later to scale out mon HA.
root@ceph-mon01:~# apt install ceph-mon
root@ceph-mon02:~# apt install ceph-mon
root@ceph-mon03:~# apt install ceph-mon
Initialize the mon node from the ceph-deploy node:
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy mon create-initial
# The following is the mon node initialization output
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  subcommand : create-initial
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f9e558e8fa0>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  func : <function mon at 0x7f9e558ccad0>
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  keyrings : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon01
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon01 ...
root@ceph-mon01's password:
root@ceph-mon01's password:
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: Ubuntu 18.04 bionic
[ceph-mon01][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon01][DEBUG ] get remote short hostname
[ceph-mon01][DEBUG ] deploying mon to ceph-mon01
[ceph-mon01][DEBUG ] get remote short hostname
[ceph-mon01][DEBUG ] remote hostname: ceph-mon01
[ceph-mon01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon01][DEBUG ] create the mon path if it does not exist
[ceph-mon01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon01/done
[ceph-mon01][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon01/done
[ceph-mon01][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon01.mon.keyring
[ceph-mon01][DEBUG ] create the monitor keyring file
[ceph-mon01][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i ceph-mon01 --keyring /var/lib/ceph/tmp/ceph-ceph-mon01.mon.keyring --setuser 2021 --setgroup 2021
[ceph-mon01][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon01.mon.keyring
[ceph-mon01][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-mon01][DEBUG ] create the init path if it does not exist
[ceph-mon01][INFO ] Running command: systemctl enable ceph.target
[ceph-mon01][INFO ] Running command: systemctl enable ceph-mon@ceph-mon01
[ceph-mon01][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-mon01.service → /lib/systemd/system/ceph-mon@.service.
[ceph-mon01][INFO ] Running command: systemctl start ceph-mon@ceph-mon01
[ceph-mon01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok mon_status
[ceph-mon01][DEBUG ] ********************************************************************************
[ceph-mon01][DEBUG ] status for monitor: mon.ceph-mon01
[ceph-mon01][DEBUG ] {
[ceph-mon01][DEBUG ]   "election_epoch": 3,
[ceph-mon01][DEBUG ]   "extra_probe_peers": [],
[ceph-mon01][DEBUG ]   "feature_map": {
[ceph-mon01][DEBUG ]     "mon": [
[ceph-mon01][DEBUG ]       {
[ceph-mon01][DEBUG ]         "features": "0x3f01cfb9fffdffff",
[ceph-mon01][DEBUG ]         "num": 1,
[ceph-mon01][DEBUG ]         "release": "luminous"
[ceph-mon01][DEBUG ]       }
[ceph-mon01][DEBUG ]     ]
[ceph-mon01][DEBUG ]   },
[ceph-mon01][DEBUG ]   "features": {
[ceph-mon01][DEBUG ]     "quorum_con": "4540138297136906239",
[ceph-mon01][DEBUG ]     "quorum_mon": [
[ceph-mon01][DEBUG ]       "kraken",
[ceph-mon01][DEBUG ]       "luminous",
[ceph-mon01][DEBUG ]       "mimic",
[ceph-mon01][DEBUG ]       "osdmap-prune",
[ceph-mon01][DEBUG ]       "nautilus",
[ceph-mon01][DEBUG ]       "octopus",
[ceph-mon01][DEBUG ]       "pacific",
[ceph-mon01][DEBUG ]       "elector-pinging"
[ceph-mon01][DEBUG ]     ],
[ceph-mon01][DEBUG ]     "required_con": "2449958747317026820",
[ceph-mon01][DEBUG ]     "required_mon": [
[ceph-mon01][DEBUG ]       "kraken",
[ceph-mon01][DEBUG ]       "luminous",
[ceph-mon01][DEBUG ]       "mimic",
[ceph-mon01][DEBUG ]       "osdmap-prune",
[ceph-mon01][DEBUG ]       "nautilus",
[ceph-mon01][DEBUG ]       "octopus",
[ceph-mon01][DEBUG ]       "pacific",
[ceph-mon01][DEBUG ]       "elector-pinging"
[ceph-mon01][DEBUG ]     ]
[ceph-mon01][DEBUG ]   },
[ceph-mon01][DEBUG ]   "monmap": {
[ceph-mon01][DEBUG ]     "created": "2021-08-17T14:43:20.965196Z",
[ceph-mon01][DEBUG ]     "disallowed_leaders: ": "",
[ceph-mon01][DEBUG ]     "election_strategy": 1,
[ceph-mon01][DEBUG ]     "epoch": 1,
[ceph-mon01][DEBUG ]     "features": {
[ceph-mon01][DEBUG ]       "optional": [],
[ceph-mon01][DEBUG ]       "persistent": [
[ceph-mon01][DEBUG ]         "kraken",
[ceph-mon01][DEBUG ]         "luminous",
[ceph-mon01][DEBUG ]         "mimic",
[ceph-mon01][DEBUG ]         "osdmap-prune",
[ceph-mon01][DEBUG ]         "nautilus",
[ceph-mon01][DEBUG ]         "octopus",
[ceph-mon01][DEBUG ]         "pacific",
[ceph-mon01][DEBUG ]         "elector-pinging"
[ceph-mon01][DEBUG ]       ]
[ceph-mon01][DEBUG ]     },
[ceph-mon01][DEBUG ]     "fsid": "f0e7c394-989b-4803-86c3-5557ae25e814",
[ceph-mon01][DEBUG ]     "min_mon_release": 16,
[ceph-mon01][DEBUG ]     "min_mon_release_name": "pacific",
[ceph-mon01][DEBUG ]     "modified": "2021-08-17T14:43:20.965196Z",
[ceph-mon01][DEBUG ]     "mons": [
[ceph-mon01][DEBUG ]       {
[ceph-mon01][DEBUG ]         "addr": "172.168.32.104:6789/0",
[ceph-mon01][DEBUG ]         "crush_location": "{}",
[ceph-mon01][DEBUG ]         "name": "ceph-mon01",
[ceph-mon01][DEBUG ]         "priority": 0,
[ceph-mon01][DEBUG ]         "public_addr": "172.168.32.104:6789/0",
[ceph-mon01][DEBUG ]         "public_addrs": {
[ceph-mon01][DEBUG ]           "addrvec": [
[ceph-mon01][DEBUG ]             {
[ceph-mon01][DEBUG ]               "addr": "172.168.32.104:3300",
[ceph-mon01][DEBUG ]               "nonce": 0,
[ceph-mon01][DEBUG ]               "type": "v2"
[ceph-mon01][DEBUG ]             },
[ceph-mon01][DEBUG ]             {
[ceph-mon01][DEBUG ]               "addr": "172.168.32.104:6789",
[ceph-mon01][DEBUG ]               "nonce": 0,
[ceph-mon01][DEBUG ]               "type": "v1"
[ceph-mon01][DEBUG ]             }
[ceph-mon01][DEBUG ]           ]
[ceph-mon01][DEBUG ]         },
[ceph-mon01][DEBUG ]         "rank": 0,
[ceph-mon01][DEBUG ]         "weight": 0
[ceph-mon01][DEBUG ]       }
[ceph-mon01][DEBUG ]     ],
[ceph-mon01][DEBUG ]     "stretch_mode": false
[ceph-mon01][DEBUG ]   },
[ceph-mon01][DEBUG ]   "name": "ceph-mon01",
[ceph-mon01][DEBUG ]   "outside_quorum": [],
[ceph-mon01][DEBUG ]   "quorum": [
[ceph-mon01][DEBUG ]     0
[ceph-mon01][DEBUG ]   ],
[ceph-mon01][DEBUG ]   "quorum_age": 2,
[ceph-mon01][DEBUG ]   "rank": 0,
[ceph-mon01][DEBUG ]   "state": "leader",
[ceph-mon01][DEBUG ]   "stretch_mode": false,
[ceph-mon01][DEBUG ]   "sync_provider": []
[ceph-mon01][DEBUG ] }
[ceph-mon01][DEBUG ] ********************************************************************************
[ceph-mon01][INFO ] monitor: mon.ceph-mon01 is running
[ceph-mon01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok mon_status
[ceph_deploy.mon][INFO ] processing monitor mon.ceph-mon01
root@ceph-mon01's password:
root@ceph-mon01's password:
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] find the location of an executable
[ceph-mon01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph-mon01 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmptJnBKt
root@ceph-mon01's password:
root@ceph-mon01's password:
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] get remote short hostname
[ceph-mon01][DEBUG ] fetch remote file
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon01.asok mon_status
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon01/keyring auth get client.admin
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon01/keyring auth get client.bootstrap-mds
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon01/keyring auth get client.bootstrap-mgr
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon01/keyring auth get client.bootstrap-osd
[ceph-mon01][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon01/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmptJnBKt
8) Verify the mon node
Verify that the ceph-mon service has been installed and started automatically on the mon node. After this step the ceph-deploy initialization directory also contains the bootstrap keyring files for the mds/mgr/osd/rgw services; these files carry the highest privileges on the Ceph cluster, so keep them safe.
root@ceph-mon01:~# ps -ef|grep ceph-mon
ceph 8304 1 0 22:43 ? 00:00:00 /usr/bin/ceph-mon -f --cluster ceph --id ceph-mon01 --setuser ceph --setgroup ceph
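The keyrings mentioned above end up in the ceph-deploy working directory; a simple way to confirm they are there (illustrative command, output omitted):

# On the ceph-deploy node: list the keyrings gathered by "mon create-initial"
ls -l ~/ceph-cluster/*.keyring
# Expected files include ceph.client.admin.keyring, ceph.mon.keyring and
# the ceph.bootstrap-{mds,mgr,osd,rgw}.keyring bootstrap keyrings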
9) Distribute the admin keyring to the node servers
On the ceph-deploy node, copy the configuration file and the admin keyring to every node in the Ceph cluster that needs to run ceph management commands. This avoids having to specify the ceph-mon address and the ceph.client.admin.keyring file every time a ceph command is used to manage the cluster; the ceph-mon nodes also need the cluster configuration file and authentication file synchronized to them.
To manage the cluster from the ceph-deploy node (and the node servers), install the common Ceph components first:
root@ceph-deploy:~# sudo apt install ceph-common      # install the common Ceph components first
root@ceph-node01:~# sudo apt install ceph-common -y
root@ceph-node02:~# sudo apt install ceph-common -y
root@ceph-node03:~# sudo apt install ceph-common -y
root@ceph-node04:~# sudo apt install ceph-common -y
Copy the keyring to ceph-node01, ceph-node02, ceph-node03 and ceph-node04:
# Run on ceph-deploy
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy admin ceph-node01 ceph-node02 ceph-node03 ceph-node04
# The following is the command output
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-node01 ceph-node02 ceph-node03 ceph-node04
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f0c31320140>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  client : ['ceph-node01', 'ceph-node02', 'ceph-node03', 'ceph-node04']
[ceph_deploy.cli][INFO ]  func : <function admin at 0x7f0c31c22a50>
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node01
root@ceph-node01's password:
root@ceph-node01's password:
[ceph-node01][DEBUG ] connected to host: ceph-node01
[ceph-node01][DEBUG ] detect platform information from remote host
[ceph-node01][DEBUG ] detect machine type
[ceph-node01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node02
root@ceph-node02's password:
root@ceph-node02's password:
[ceph-node02][DEBUG ] connected to host: ceph-node02
[ceph-node02][DEBUG ] detect platform information from remote host
[ceph-node02][DEBUG ] detect machine type
[ceph-node02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node03
root@ceph-node03's password:
root@ceph-node03's password:
[ceph-node03][DEBUG ] connected to host: ceph-node03
[ceph-node03][DEBUG ] detect platform information from remote host
[ceph-node03][DEBUG ] detect machine type
[ceph-node03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node04
The authenticity of host 'ceph-node04 (172.168.32.110)' can't be established.
ECDSA key fingerprint is SHA256:x78X2D2e8HdqmZB3tZFTSlA3URPH7LgbigGuNbDwnLU.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node04' (ECDSA) to the list of known hosts.
root@ceph-node04's password:
root@ceph-node04's password:
[ceph-node04][DEBUG ] connected to host: ceph-node04
[ceph-node04][DEBUG ] detect platform information from remote host
[ceph-node04][DEBUG ] detect machine type
[ceph-node04][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
10) Verify the keyring
root@ceph-node01:~# ll /etc/ceph
total 24
drwxr-xr-x 2 root root 87 Aug 17 22:54 ./
drwxr-xr-x 98 root root 8192 Aug 17 22:16 ../
-rw------- 1 root root 151 Aug 17 22:54 ceph.client.admin.keyring
-rw-r--r-- 1 root root 264 Aug 17 22:54 ceph.conf
-rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap
-rw------- 1 root root 0 Aug 17 22:54 tmp8AdP3P

root@ceph-node02:~# ll /etc/ceph
total 24
drwxr-xr-x 2 root root 87 Aug 17 22:54 ./
drwxr-xr-x 98 root root 8192 Aug 17 22:16 ../
-rw------- 1 root root 151 Aug 17 22:54 ceph.client.admin.keyring
-rw-r--r-- 1 root root 264 Aug 17 22:54 ceph.conf
-rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap
-rw------- 1 root root 0 Aug 17 22:54 tmp8AdP3P

root@ceph-node03:~# ll /etc/ceph
total 24
drwxr-xr-x 2 root root 87 Aug 17 22:54 ./
drwxr-xr-x 98 root root 8192 Aug 17 22:16 ../
-rw------- 1 root root 151 Aug 17 22:54 ceph.client.admin.keyring
-rw-r--r-- 1 root root 264 Aug 17 22:54 ceph.conf
-rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap
-rw------- 1 root root 0 Aug 17 22:54 tmp8AdP3P

root@ceph-node04:~# ll /etc/ceph
total 24
drwxr-xr-x 2 root root 87 Aug 17 22:54 ./
drwxr-xr-x 98 root root 8192 Aug 17 22:16 ../
-rw------- 1 root root 151 Aug 17 22:54 ceph.client.admin.keyring
-rw-r--r-- 1 root root 264 Aug 17 22:54 ceph.conf
-rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap
-rw------- 1 root root 0 Aug 17 22:54 tmp8AdP3P
11) Grant the ceph user access to the keyring on the ceph-node servers
For security, the authentication file is owned by the root user and root group by default. If the ceph user should also be able to run ceph commands, the ceph user needs to be granted access to it.
root@ceph-node01:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-node02:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-node03:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-node04:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
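To confirm the ACL works, the ceph user should now be able to read the keyring and query the cluster without root; a minimal check (illustrative, output omitted):

# Run on any node that received the admin keyring
getfacl /etc/ceph/ceph.client.admin.keyring    # should list user:ceph:rw-
sudo -u ceph ceph -s                           # should print the cluster status without permission errors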
12) Deploy the ceph-mgr nodes
(1) Install the ceph-mgr package on ceph-mgr01 and ceph-mgr02
root@ceph-mgr01:~# apt install -y ceph-mgr
root@ceph-mgr02:~# apt install -y ceph-mgr
(2) Initialize the ceph-mgr node from the ceph-deploy node
# Only initialize the ceph-mgr01 node for now
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy mgr create ceph-mgr01
# The following is the command output
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  mgr : [('ceph-mgr01', 'ceph-mgr01')]
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  subcommand : create
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f97ab045c30>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  func : <function mgr at 0x7f97ab4a5150>
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr01:ceph-mgr01
root@ceph-mgr01's password:
root@ceph-mgr01's password:
[ceph-mgr01][DEBUG ] connected to host: ceph-mgr01
[ceph-mgr01][DEBUG ] detect platform information from remote host
[ceph-mgr01][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr01
[ceph-mgr01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mgr01][DEBUG ] create path recursively if it doesn't exist
[ceph-mgr01][INFO ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr01 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr01/keyring
[ceph-mgr01][INFO ] Running command: systemctl enable ceph-mgr@ceph-mgr01
[ceph-mgr01][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-mgr01.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-mgr01][INFO ] Running command: systemctl start ceph-mgr@ceph-mgr01
[ceph-mgr01][INFO ] Running command: systemctl enable ceph.target
Unhandled exception in thread started by
13) Verify the ceph-mgr node
On ceph-mgr01:
root@ceph-mgr01:~# ps -ef |grep ceph
root   6808     1 0 23:14 ?     00:00:00 /usr/bin/python3.6 /usr/bin/ceph-crash
ceph  10425     1 7 23:16 ?     00:00:07 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-mgr01 --setuser ceph --setgroup ceph
root  10731  1695 0 23:18 pts/1 00:00:00 grep --color=auto ceph
14) Configure ceph-deploy to manage the Ceph cluster
# Environment setup so the ceph-deploy node can manage the cluster
ceph@ceph-deploy:~/ceph-cluster$ sudo apt install -y ceph-common
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy admin ceph-deploy
ceph@ceph-deploy:~/ceph-cluster$ sudo setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring

# Cluster information seen from the ceph-deploy node
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     f0e7c394-989b-4803-86c3-5557ae25e814
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim     # insecure-mode communication needs to be disabled
            OSD count 0 < osd_pool_default_size 3          # the cluster has fewer than 3 OSDs
  services:
    mon: 1 daemons, quorum ceph-mon01 (age 39m)
    mgr: ceph-mgr01(active, since 5m)
    osd: 0 osds: 0 up, 0 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
Disable insecure global_id reclaim:
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph config set mon auth_allow_insecure_global_id_reclaim false
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     f0e7c394-989b-4803-86c3-5557ae25e814
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
  services:
    mon: 1 daemons, quorum ceph-mon01 (age 42m)
    mgr: ceph-mgr01(active, since 8m)
    osd: 0 osds: 0 up, 0 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
Versions of the individual Ceph cluster components:
ceph@ceph-deploy:~/ceph-cluster$ ceph versions
{
    "mon": {
        "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 1
    },
    "mgr": {
        "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 1
    },
    "osd": {},
    "mds": {},
    "overall": {
        "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
    }
}
15) Prepare the OSD nodes
# Before wiping the disks, install the basic Ceph runtime environment on the node servers from the deploy node.
# Run on ceph-deploy
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node01
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node02
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node03
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node04
16) List the disks on the ceph node servers
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy disk list ceph-node01
# Below is the disk information on ceph-node01; ceph-node02, ceph-node03 and ceph-node04 can be listed the same way
......
[ceph-node01][DEBUG ] connected to host: ceph-node01
[ceph-node01][DEBUG ] detect platform information from remote host
[ceph-node01][DEBUG ] detect machine type
[ceph-node01][DEBUG ] find the location of an executable
[ceph-node01][INFO ] Running command: fdisk -l
[ceph-node01][INFO ] Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
[ceph-node01][INFO ] Disk /dev/sdb: 20 GiB, 21474836480 bytes, 41943040 sectors
[ceph-node01][INFO ] Disk /dev/sdc: 20 GiB, 21474836480 bytes, 41943040 sectors
[ceph-node01][INFO ] Disk /dev/sdd: 20 GiB, 21474836480 bytes, 41943040 sectors
[ceph-node01][INFO ] Disk /dev/sde: 20 GiB, 21474836480 bytes, 41943040 sectors
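To cross-check the same disks locally on a node (an optional sketch, not part of the original run), lsblk gives a quick overview:

# On ceph-node01: confirm sdb-sde are present, 20G each, and carry no partitions
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT /dev/sd{b,c,d,e}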
17) Use ceph-deploy disk zap to wipe the Ceph data disks on each ceph node:
# Run on ceph-deploy
sudo ceph-deploy disk zap ceph-node01 /dev/sdb
sudo ceph-deploy disk zap ceph-node01 /dev/sdc
sudo ceph-deploy disk zap ceph-node01 /dev/sdd
sudo ceph-deploy disk zap ceph-node01 /dev/sde
sudo ceph-deploy disk zap ceph-node02 /dev/sdb
sudo ceph-deploy disk zap ceph-node02 /dev/sdc
sudo ceph-deploy disk zap ceph-node02 /dev/sdd
sudo ceph-deploy disk zap ceph-node02 /dev/sde
sudo ceph-deploy disk zap ceph-node03 /dev/sdb
sudo ceph-deploy disk zap ceph-node03 /dev/sdc
sudo ceph-deploy disk zap ceph-node03 /dev/sdd
sudo ceph-deploy disk zap ceph-node03 /dev/sde
sudo ceph-deploy disk zap ceph-node04 /dev/sdb
sudo ceph-deploy disk zap ceph-node04 /dev/sdc
sudo ceph-deploy disk zap ceph-node04 /dev/sdd
sudo ceph-deploy disk zap ceph-node04 /dev/sde
18) Add the OSDs
# Run on ceph-deploy
sudo ceph-deploy osd create ceph-node01 --data /dev/sdb
sudo ceph-deploy osd create ceph-node01 --data /dev/sdc
sudo ceph-deploy osd create ceph-node01 --data /dev/sdd
sudo ceph-deploy osd create ceph-node01 --data /dev/sde
sudo ceph-deploy osd create ceph-node02 --data /dev/sdb
sudo ceph-deploy osd create ceph-node02 --data /dev/sdc
sudo ceph-deploy osd create ceph-node02 --data /dev/sdd
sudo ceph-deploy osd create ceph-node02 --data /dev/sde
sudo ceph-deploy osd create ceph-node03 --data /dev/sdb
sudo ceph-deploy osd create ceph-node03 --data /dev/sdc
sudo ceph-deploy osd create ceph-node03 --data /dev/sdd
sudo ceph-deploy osd create ceph-node03 --data /dev/sde
sudo ceph-deploy osd create ceph-node04 --data /dev/sdb
sudo ceph-deploy osd create ceph-node04 --data /dev/sdc
sudo ceph-deploy osd create ceph-node04 --data /dev/sdd
sudo ceph-deploy osd create ceph-node04 --data /dev/sde
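Since the zap and create commands in steps 17 and 18 differ only in host and device, they can be condensed into a loop on the ceph-deploy node; this is an equivalent sketch assuming the same four nodes and the same four disks per node:

# Wipe each data disk and immediately create an OSD on it, host by host
for node in ceph-node01 ceph-node02 ceph-node03 ceph-node04; do
  for dev in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
    sudo ceph-deploy disk zap ${node} ${dev}
    sudo ceph-deploy osd create ${node} --data ${dev}
  done
done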
19) Verify the OSDs
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     f0e7c394-989b-4803-86c3-5557ae25e814
    health: HEALTH_WARN
            4 osds down                 # 4 OSDs are having problems
            1 host (4 osds) down
  services:
    mon: 1 daemons, quorum ceph-mon01 (age 96m)
    mgr: ceph-mgr01(active, since 62m)
    osd: 16 osds: 12 up (since 85s), 16 in (since 94s)
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   117 MiB used, 320 GiB / 320 GiB avail
    pgs:     1 active+clean
20) Enable the OSD services to start on boot
They are already enabled for autostart by default; once the node servers have been added, a node server can be rebooted to test whether its OSDs come back up automatically.
# Test on ceph-node02; the other node servers are handled the same way
root@ceph-node02:~# ps -ef|grep osd
ceph 15521 1 0 00:08 ? 00:00:03 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
ceph 17199 1 0 00:09 ? 00:00:03 /usr/bin/ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph
ceph 18874 1 0 00:09 ? 00:00:03 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
ceph 20546 1 0 00:09 ? 00:00:03 /usr/bin/ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph
root@ceph-node02:~# systemctl enable ceph-osd@3 ceph-osd@4 ceph-osd@5 ceph-osd@6
Created symlink /etc/systemd/system/ceph-osd.target.wants/ceph-osd@3.service → /lib/systemd/system/ceph-osd@.service.
Created symlink /etc/systemd/system/ceph-osd.target.wants/ceph-osd@4.service → /lib/systemd/system/ceph-osd@.service.
Created symlink /etc/systemd/system/ceph-osd.target.wants/ceph-osd@5.service → /lib/systemd/system/ceph-osd@.service.
Created symlink /etc/systemd/system/ceph-osd.target.wants/ceph-osd@6.service → /lib/systemd/system/ceph-osd@.service.
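A quick way to confirm the units are enabled for boot (illustrative; the OSD IDs follow the ceph-node02 example above):

root@ceph-node02:~# systemctl is-enabled ceph-osd@3 ceph-osd@4 ceph-osd@5 ceph-osd@6
root@ceph-node02:~# systemctl is-enabled ceph-osd.target ceph.target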
21) ceph-deploy subcommands
$ ceph-deploy --help
new:        start deploying a new Ceph storage cluster and generate the CLUSTER.conf cluster configuration file and keyring authentication files.
install:    install Ceph-related packages on remote hosts; the version can be selected with --release.
rgw:        manage RGW daemons (RADOSGW, the object storage gateway).
mgr:        manage MGR daemons (ceph-mgr, the Ceph Manager daemon).
mds:        manage MDS daemons (the Ceph Metadata Server).
mon:        manage MON daemons (ceph-mon, the Ceph monitor).
gatherkeys: fetch the authentication keys for provisioning new nodes; these keys are used when adding new MON/OSD/MDS nodes.
disk:       manage disks on remote hosts.
osd:        prepare a data disk on a remote host, i.e. add the specified disk of the specified remote host to the Ceph cluster as an OSD.
repo:       manage repositories on remote hosts.
admin:      push the Ceph cluster configuration file and the client.admin authentication file to remote hosts.
config:     push the ceph.conf configuration file to remote hosts, or pull it from them.
uninstall:  remove installed packages from remote hosts.
purgedata:  delete Ceph data from /var/lib/ceph and remove the contents of /etc/ceph.
purge:      remove the installed packages and all data from remote hosts.
forgetkeys: delete all authentication keyrings from the local host, including the client.admin, monitor and bootstrap files.
pkg:        manage packages on remote hosts.
calamari:   install and configure a calamari web node; calamari is a web monitoring platform.
22) Problems encountered during the Ceph deployment
1. The disk already contains data
Resolution: wipe the data on the disk. In this lab the workaround was to power off the VM, delete the original disk and add a new one.
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
root@ceph-node01's password:
root@ceph-node01's password:
[ceph-node01][DEBUG ] connected to host: ceph-node01
[ceph-node01][DEBUG ] detect platform information from remote host
[ceph-node01][DEBUG ] detect machine type
[ceph-node01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph-node01
[ceph-node01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node01][DEBUG ] find the location of an executable
[ceph-node01][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph-node01][WARNIN] Running command: /usr/bin/ceph-authtool --gen-print-key
[ceph-node01][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 93330566-19f1-47a7-8d5e-5d47c52b4274
[ceph-node01][WARNIN] Running command: /sbin/lvcreate --yes -l 5119 -n osd-block-93330566-19f1-47a7-8d5e-5d47c52b4274 ceph-f3745436-e965-4dda-a440-4c7fb3104c48
[ceph-node01][WARNIN]  stderr:
[ceph-node01][WARNIN]  stderr: Volume group "ceph-f3745436-e965-4dda-a440-4c7fb3104c48" has insufficient free space (0 extents): 5119 required.
[ceph-node01][WARNIN]  stderr:
[ceph-node01][WARNIN] --> Was unable to complete a new OSD, will rollback changes
[ceph-node01][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.16 --yes-i-really-mean-it
[ceph-node01][WARNIN]  stderr: 2021-08-18T10:03:49.653+0800 7f2ffca71700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
[ceph-node01][WARNIN]  stderr:
[ceph-node01][WARNIN]  stderr: 2021-08-18T10:03:49.653+0800 7f2ffca71700 -1 AuthRegistry(0x7f2ff805b408) no keyring found at /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
[ceph-node01][WARNIN]  stderr:
[ceph-node01][WARNIN]  stderr: purged osd.16
[ceph-node01][WARNIN] --> RuntimeError: command returned non-zero exit status: 5
[ceph-node01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
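Besides re-creating the virtual disk, the leftover LVM metadata that produces the "insufficient free space" error can usually be cleaned up in place. The following is only a sketch run on the affected node: the volume group name must be taken from the error output above, and the wipe must only touch the disk that belonged to the failed OSD.

# On ceph-node01: remove the stale ceph volume group and wipe the disk signatures
vgremove -y ceph-f3745436-e965-4dda-a440-4c7fb3104c48   # VG name taken from the error message
pvremove -y /dev/sdb                                     # release the physical volume
wipefs -a /dev/sdb                                       # clear any remaining signatures
# then retry "ceph-deploy disk zap" and "ceph-deploy osd create" for this disk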
4. Expanding the Ceph cluster for high availability
1) Add more ceph-mon nodes
ceph-mon natively provides high availability through leader election, so the number of mon nodes is usually odd.
# Install ceph-mon on the new mon nodes
root@ceph-mon02:~# apt install ceph-mon
root@ceph-mon03:~# apt install ceph-mon

# Add the mon nodes from the ceph-deploy node
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy mon add ceph-mon02
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy mon add ceph-mon03
Verify the ceph-mon result:
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     f0e7c394-989b-4803-86c3-5557ae25e814
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03 (age 7s)   # ceph-mon02 and ceph-mon03 have been added
    mgr: ceph-mgr01(active, since 15h)
    osd: 20 osds: 16 up (since 22m), 15 in (since 25m)
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   134 MiB used, 300 GiB / 300 GiB avail
    pgs:     1 active+clean

# Status of the ceph-mon quorum
ceph@ceph-deploy:~/ceph-cluster$ ceph quorum_status --format json-pretty
{
    "election_epoch": 12,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_names": [
        "ceph-mon01",
        "ceph-mon02",
        "ceph-mon03"
    ],
    "quorum_leader_name": "ceph-mon01",        # the current mon leader
    "quorum_age": 176,
    "features": {
        "quorum_con": "4540138297136906239",
        "quorum_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus",
            "pacific",
            "elector-pinging"
        ]
    },
    "monmap": {
        "epoch": 3,
        "fsid": "f0e7c394-989b-4803-86c3-5557ae25e814",
        "modified": "2021-08-18T06:31:20.933855Z",
        "created": "2021-08-17T14:43:20.965196Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus",
                "pacific",
                "elector-pinging"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,                          # rank of ceph-mon01
                "name": "ceph-mon01",               # mon node name
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "172.168.32.104:3300",   # monitor address
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "172.168.32.104:6789",   # monitor address
                            "nonce": 0
                        }
                    ]
                },
                "addr": "172.168.32.104:6789/0",
                "public_addr": "172.168.32.104:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 1,
                "name": "ceph-mon02",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "172.168.32.105:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "172.168.32.105:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "172.168.32.105:6789/0",
                "public_addr": "172.168.32.105:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 2,
                "name": "ceph-mon03",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "172.168.32.106:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "172.168.32.106:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "172.168.32.106:6789/0",
                "public_addr": "172.168.32.106:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            }
        ]
    }
}
2) Add another mgr node
# Install ceph-mgr on the ceph-mgr02 node
root@ceph-mgr02:~# apt install -y ceph-mgr

# Add ceph-mgr02 from the ceph-deploy node
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy mgr create ceph-mgr02

# Push the configuration file to ceph-mgr02 from the ceph-deploy node
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy admin ceph-mgr02
Verify ceph-mgr:
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     f0e7c394-989b-4803-86c3-5557ae25e814
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03 (age 12m)
    mgr: ceph-mgr01(active, since 15h), standbys: ceph-mgr02   # the standby ceph-mgr02 has been added
    osd: 20 osds: 16 up (since 34m), 15 in (since 37m)
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   134 MiB used, 300 GiB / 300 GiB avail
    pgs:     1 active+clean