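Corosync + Pacemaker + DRBD + MySQL (MariaDB): building a highly available (HA) MySQL cluster on CentOS 7
Contents:
- Environment
- Installing and configuring corosync and pacemaker with pcs (pcs is only a management tool)
- DRBD installation and configuration (see the earlier post 《DRBD-MYSQL分布式块设备实现高可用》)
- MySQL installation and configuration
- Installing crmsh and managing resources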
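The Corosync + Pacemaker + DRBD + MySQL (MariaDB) architecture is said to reach 99.99% availability, in other words it is very stable, so it is worth understanding.
First, how high availability is measured. An HA cluster is characterized by its reliability and its maintainability. In engineering practice, reliability is measured by the mean time to failure (MTTF) and maintainability by the mean time to repair (MTTR). Availability is then defined as HA = MTTF / (MTTF + MTTR) * 100%. The usual availability levels are: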
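99% --- no more than 4 days of downtime per year (about 3.65 days)
99.9% --- no more than 10 hours of downtime per year (about 8.76 hours)
99.99% --- no more than 1 hour of downtime per year (about 52.6 minutes)
99.999% --- no more than 6 minutes of downtime per year (about 5.26 minutes)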
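The figures above follow directly from the formula. As a minimal sketch (the function names `availability` and `downtime_minutes` are illustrative, and a 365-day year is assumed), the arithmetic can be reproduced with awk:

```shell
#!/bin/sh
# availability MTTF MTTR
# Prints HA = MTTF / (MTTF + MTTR) * 100 as a percentage.
# Both arguments must use the same time unit (for example hours).
availability() {
    awk -v mttf="$1" -v mttr="$2" \
        'BEGIN { printf "%.3f\n", mttf / (mttf + mttr) * 100 }'
}

# downtime_minutes PERCENT
# Prints the downtime budget per 365-day year, in minutes,
# that corresponds to an availability of PERCENT.
downtime_minutes() {
    awk -v p="$1" \
        'BEGIN { printf "%.2f\n", (100 - p) / 100 * 365 * 24 * 60 }'
}

# Print the yearly downtime budget for the levels listed above.
for p in 99 99.9 99.99 99.999; do
    printf '%s%% -> %s minutes/year\n' "$p" "$(downtime_minutes "$p")"
done
```

Dividing the result by 60 or by 1440 gives the same budget in hours or days, which is how the rounded bounds in the list are obtained.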
Environment:
[root@cml1 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@cml2 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
cml1: server ip: 192.168.5.101 vip: 192.168.5.250
cml2: server ip: 192.168.5.102 vip: 192.168.5.250
cml3: client ip: 192.168.5.104
II. Installing and configuring corosync and pacemaker with pcs (pcs is only a management tool)
Prerequisites for the cluster:
(1) Time synchronization
[root@cml1~]# ntpdate cn.pool.ntp.org
[root@cml2~]# ntpdate cn.pool.ntp.org
(2) The nodes can reach each other by hostname
[root@cml1~]# ssh-keygen
[root@cml1~]# ssh-copy-id cml2
[root@cml1~]# hostname
cml1
[root@cml1~]# cat /etc/hosts
192.168.5.101 cml1
192.168.5.102 cml2
192.168.5.104 cml3
(3) Whether to use a quorum device.
Not needed for this setup on CentOS 7.
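Prerequisite (2) is easy to get wrong on one of the nodes. The following is a small sketch, not part of the original procedure, that reports which node names have no entry in a hosts file; the function name `hosts_missing` is an assumption made for illustration.

```shell
#!/bin/sh
# hosts_missing FILE NAME...
# Prints every NAME that does not appear as a hostname or alias
# in the hosts-format FILE. Full-line comments are skipped.
hosts_missing() {
    file="$1"
    shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$file" || printf '%s\n' "$name"
    done
}
```

On each node one would run something like `hosts_missing /etc/hosts cml1 cml2 cml3`; no output means all three names are present.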
Lifecycle management tools:
pcs: agent (pcsd)
crmsh: pssh
1. Run on both nodes:
[root@cml1 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
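2. On both nodes, start pcsd and enable it at boot:
[root@cml1 ~]# systemctl start pcsd.service
[root@cml1 ~]# systemctl enable pcsd.service
3. On both nodes, set the password of the hacluster user (the user name is fixed and cannot be changed)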
[root@cml1 ~]# echo redhat | passwd --stdin hacluster
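4. Authenticate the cluster hosts with pcs (the hacluster user and its password are used by default):
[root@cml1 corosync]# pcs cluster auth cml1 cml2 ## the nodes to authenticate
cml1: Already authorized
cml2: Already authorized
5. Create the cluster from the two nodes:
[root@cml1 corosync]# pcs cluster setup --name mycluster cml1 cml2 --force ## set up the cluster
6. The corosync configuration file has now been generated on the node: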
[root@cml1 corosync]# ls
corosync.conf corosync.conf.example corosync.conf.example.udpu corosync.xml.example uidgid.d
# corosync.conf has been generated.
7. Look at the generated file:
[root@cml1 corosync]# cat corosync.conf
totem {
version: 2
secauth: off
cluster_name: webcluster
transport: udpu
}
nodelist {
node {
ring0_addr: cml1
nodeid: 1
}
node {
ring0_addr: cml2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
logging {
to_logfile: yes
logfile: /var/log/cluster/corosync.log
to_syslog: yes
}
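To confirm that both nodes made it into the generated file, the node list can be pulled out with awk. This is a minimal sketch that assumes the flat `key: value` layout shown above; the function names are illustrative.

```shell
#!/bin/sh
# corosync_nodes FILE
# Prints the ring0_addr value of every node entry in FILE.
corosync_nodes() {
    awk '$1 == "ring0_addr:" { print $2 }' "$1"
}

# corosync_two_node FILE
# Prints the value of the two_node setting in the quorum section, if present.
corosync_two_node() {
    awk '$1 == "two_node:" { print $2 }' "$1"
}
```

Running `corosync_nodes /etc/corosync/corosync.conf` on the configuration above should list cml1 and cml2, and `corosync_two_node` should print 1.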
8. Start the cluster:
[root@cml1 corosync]# pcs cluster start --all
cml1: Starting Cluster...
cml2: Starting Cluster...
## This starts both pacemaker and corosync:
[root@cml1 corosync]# ps -ef | grep corosync
root 57490 1 1 21:47 ? 00:00:52 corosync
root 75893 51813 0 23:12 pts/0 00:00:00 grep --color=auto corosync
[root@cml1 corosync]# ps -ef | grep pacemaker
root 57502 1 0 21:47 ? 00:00:00 /usr/sbin/pacemakerd -f
haclust+ 57503 57502 0 21:47 ? 00:00:03 /usr/libexec/pacemaker/cib
root 57504 57502 0 21:47 ? 00:00:00 /usr/libexec/pacemaker/stonithd
root 57505 57502 0 21:47 ? 00:00:01 /usr/libexec/pacemaker/lrmd
haclust+ 57506 57502 0 21:47 ? 00:00:01 /usr/libexec/pacemaker/attrd
haclust+ 57507 57502 0 21:47 ? 00:00:00 /usr/libexec/pacemaker/pengine
haclust+ 57508 57502 0 21:47 ? 00:00:01 /usr/libexec/pacemaker/crmd
root 75938 51813 0 23:12 pts/0 00:00:00 grep --color=auto pacemaker
9. Check the cluster status ("no faults" means the ring is OK):
[root@cml1 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 192.168.5.101
status = ring 0 active with no faults
[root@cml1 corosync]# ssh cml2 corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
id = 192.168.5.102
status = ring 0 active with no faults
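The same check can be scripted. The sketch below reads `corosync-cfgtool -s` output on standard input and succeeds only if every `status` line reports "active with no faults", which is the wording shown above; the function name `rings_healthy` is an assumption.

```shell
#!/bin/sh
# rings_healthy
# Reads "corosync-cfgtool -s" output on stdin.
# Exit status 0 if at least one ring is listed and every ring
# reports "active with no faults", 1 otherwise.
rings_healthy() {
    awk '
        /status[[:space:]]*=/ {
            total++
            if ($0 ~ /active with no faults/) ok++
        }
        END { exit ((total > 0 && ok == total) ? 0 : 1) }
    '
}
```

A typical use would be `corosync-cfgtool -s | rings_healthy && echo "rings OK"`, run locally and over ssh on the other node.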
10. Check the cluster configuration for errors:
[root@cml1 corosync]# crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
## No STONITH device is configured in this setup, so STONITH is disabled next.
11. Disable STONITH:
[root@cml1 corosync]# pcs property set stonith-enabled=false
[root@cml1 corosync]# crm_verify -L -V
[root@cml1 corosync]# pcs property list
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: mycluster
dc-version: 1.1.16-12.el7_4.2-94ff4df
have-watchdog: false
stonith-enabled: false
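III. DRBD installation and configuration
IV. MySQL installation and configuration
## MariaDB is installed directly with yum here. A source build works the same way, but you then have to provide a systemd unit file under /etc/systemd/system, for example:
[root@cml1 ~]# cd /etc/systemd/system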
[root@cml1 system]# cat mariadb.service
[Unit]
Description=mariadb
After=network.target
[Service]
Type=forking
ExecStart=/usr/local/mysql/sbin/mysql
ExecReload=/usr/local/mysql/sbin/mysql -sreload
ExecStop=/usr/local/mysql/sbin/mysql -squit
PrivateTmp=true
[Install]
WantedBy=multi-user.target
1. Install on both nodes:
[root@cml1 ~]# yum install mariadb-server mariadb -y
2. In /etc/my.cnf, change the MySQL data directory to /data:
[root@cml1 ~]# cat /etc/my.cnf
[mysqld]
datadir=/data
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd
[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
## Do not start MariaDB yet; it will be started later through the crm configuration.
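Since the data directory must match the mount point that the cluster will later manage (/data), it can help to read the value back from the file on both nodes. A minimal sketch, with the hypothetical helper name `mysql_datadir`:

```shell
#!/bin/sh
# mysql_datadir FILE
# Prints the datadir value from the [mysqld] section of a my.cnf-style FILE.
mysql_datadir() {
    awk -F= '
        /^\[/ { section = $0 }
        section == "[mysqld]" && $1 ~ /^[[:space:]]*datadir[[:space:]]*$/ {
            gsub(/[[:space:]]/, "", $2)
            print $2
        }
    ' "$1"
}
```

With the file above, `mysql_datadir /etc/my.cnf` should print /data on cml1 and cml2 alike.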
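V. Installing crmsh and managing resources
1. Install crmsh:
The cluster can be operated with crmsh (download it from GitHub, unpack it and install it directly). Installing it on one node is enough, although installing it on both nodes makes testing more convenient.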
[root@cml1 ~]# cd /usr/local/src/
[root@cml1 src]# ls
nginx-1.12.0 php-5.5.38.tar.gz
crmsh-2.3.2.tar nginx-1.12.0.tar.gz zabbix-3.2.7.tar.gz
[root@cml1 src]# tar -xf crmsh-2.3.2.tar
[root@cml1 crmsh-2.3.2]# python setup.py install
2. Manage the cluster with crmsh:
[root@cml1 ~]# crm help
Help overview for crmsh
Available topics:
Overview Help overview for crmsh
Topics Available topics
Description Program description
CommandLine Command line options
Introduction Introduction
Interface User interface
Completion Tab completion
Shorthand Shorthand syntax
Features Features
Shadows Shadow CIB usage
Checks Configuration semantic checks
Templates Configuration templates
Testing Resource testing
Security Access Control Lists (ACL)
Resourcesets Syntax: Resourcesets
AttributeListReferences Syntax: Attribute list references
AttributeReferences Syntax: Attribute references
RuleExpressions Syntax: Rule expressions
Lifetime Lifetime parameter format
Reference Command reference
3. Configure MySQL high availability with crmsh
[root@cml1 ~]# crm
crm(live)# status
Stack: corosync
Current DC: cml2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 09:22:31 2017
Last change: Wed Oct 25 09:22:19 2017 by root via cibadmin on cml1
2 nodes configured
0 resources configured
Online: [ cml1 cml2 ]
No resources
## Before configuring, make sure the services are not running:
[root@cml1 ~]# systemctl stop mariadb
[root@cml1 ~]# umount /dev/drbd1
[root@cml1 ~]# systemctl stop drbd
[root@cml1 ~]# systemctl enable mariadb #### the unit must be enabled, otherwise systemd:mariadb does not show up when the resource is configured below
(1) Add the DRBD resource:
[root@cml1 data]# crm
crm(live)# configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# property migration-limit=1
crm(live)configure# verify
crm(live)configure# primitive mysqldrbd ocf:linbit:drbd params drbd_resource=mysql op start timeout=240 op stop timeout=100 op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30
crm(live)configure# ms ms_mysqldrbd mysqldrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
crm(live)configure# verify
crm(live)configure# show
node 1: cml1
node 2: cml2
primitive mysqldrbd ocf:linbit:drbd \
params drbd_resource=mysql \
op start timeout=240 interval=0 \
op stop timeout=100 interval=0 \
op monitor role=Master interval=20 timeout=30 \
op monitor role=Slave interval=30 timeout=30
ms ms_mysqldrbd mysqldrbd \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.16-12.el7_4.2-94ff4df \
cluster-infrastructure=corosync \
cluster-name=mycluster \
stonith-enabled=false \
no-quorum-policy=ignore
property migration-limit=1
crm(live)configure# commit
(2) Add the filesystem resource:
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd1 directory=/data fstype=ext4 op start timeout=60 op stop timeout=60
crm(live)configure# verify
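(3) Colocate the filesystem resource with the DRBD master (a score of inf forces the resources onto the same node; a negative score keeps them apart).
crm(live)configure# colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master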
crm(live)configure# verify
(4) Add an order constraint so that the filesystem is mounted only after DRBD has been promoted:
crm(live)configure# order mystore_after_ms_mysqldrbd mandatory: ms_mysqldrbd:promote mystore:start
crm(live)configure# verify
crm(live)configure# commit
(5) Check the current state of the resources:
crm(live)# status
Stack: corosync
Current DC: cml1 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Sun Oct 22 14:18:52 2017
Last change: Sun Oct 22 14:18:05 2017 by root via cibadmin on cml1
2 nodes configured
3 resources configured
Online: [ cml1 cml2 ]
Full list of resources:
Master/Slave Set: ms_mysqldrbd [mysqldrbd]
Masters: [ cml1 ]
Slaves: [ cml2 ]
mystore (ocf::heartbeat:Filesystem): Started cml1
## Show the current configuration:
crm(live)configure# show
node 1: cml1
node 2: cml2
primitive mysqldrbd ocf:linbit:drbd \
params drbd_resource=mysql \
op start timeout=240 interval=0 \
op stop timeout=100 interval=0 \
op monitor role=Master interval=20 timeout=30 \
op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
params device="/dev/drbd1" directory="/data" fstype=xfs \
op start timeout=60 interval=0 \
op stop timeout=60 interval=0
ms ms_mysqldrbd mysqldrbd \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.16-12.el7_4.2-94ff4df \
cluster-infrastructure=corosync \
cluster-name=mycluster \
stonith-enabled=false \
no-quorum-policy=ignore
property migration-limit=1
## On cml1 you should now see that /data is mounted:
[root@cml1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 17G 9.0G 8.1G 53% /
devtmpfs 588M 0 588M 0% /dev
tmpfs 599M 39M 560M 7% /dev/shm
tmpfs 599M 8.1M 591M 2% /run
tmpfs 599M 0 599M 0% /sys/fs/cgroup
/dev/sda1 1014M 168M 847M 17% /boot
tmpfs 120M 0 120M 0% /run/user/0
/dev/drbd1 5.0G 62M 5.0G 2% /data
(6) Next, add the mysql resource and colocate it with the filesystem:
crm(live)configure# primitive mysqld systemd:mariadb op start timeout=100 op stop timeout=100
crm(live)configure# colocation mysqld_with_mystore inf: mysqld mystore
crm(live)configure# verify
(7) Commit and review:
crm(live)configure# commit
crm(live)configure# show
node 1: cml1
node 2: cml2
primitive mysqld systemd:mariadb \
op start timeout=100 interval=0 \
op stop timeout=100 interval=0
primitive mysqldrbd ocf:linbit:drbd \
params drbd_resource=mysql \
op start timeout=240 interval=0 \
op stop timeout=100 interval=0 \
op monitor role=Master interval=20 timeout=30 \
op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
params device="/dev/drbd1" directory="/data" fstype=xfs \
op start timeout=60 interval=0 \
op stop timeout=60 interval=0
ms ms_mysqldrbd mysqldrbd \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
colocation mysqld_with_mystore inf: mysqld mystore
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.16-12.el7_4.2-94ff4df \
cluster-infrastructure=corosync \
cluster-name=mycluster \
stonith-enabled=false \
no-quorum-policy=ignore
property migration-limit=1
(8) Add an order constraint: mount the filesystem first, then start the mysqld resource:
crm(live)configure# order mysqld_after_mystore mandatory: mystore mysqld
crm(live)configure# verify
crm(live)configure# show
node 1: cml1
node 2: cml2
primitive mysqld systemd:mariadb \
op start timeout=100 interval=0 \
op stop timeout=100 interval=0
primitive mysqldrbd ocf:linbit:drbd \
params drbd_resource=mysql \
op start timeout=240 interval=0 \
op stop timeout=100 interval=0 \
op monitor role=Master interval=20 timeout=30 \
op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
params device="/dev/drbd1" directory="/data" fstype=xfs \
op start timeout=60 interval=0 \
op stop timeout=60 interval=0
ms ms_mysqldrbd mysqldrbd \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mysqld_after_mystore Mandatory: mystore mysqld
colocation mysqld_with_mystore inf: mysqld mystore
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.16-12.el7_4.2-94ff4df \
cluster-infrastructure=corosync \
cluster-name=mycluster \
stonith-enabled=false \
no-quorum-policy=ignore
property migration-limit=1
crm(live)configure# commit
(9) Check the resources and confirm that mysql has started:
crm(live)# status
Stack: corosync
Current DC: cml1 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 23:45:57 2017
Last change: Wed Oct 25 23:36:20 2017 by root via crm_attribute on cml1
2 nodes configured
5 resources configured
Online: [ cml1 cml2 ]
Full list of resources:
Master/Slave Set: ms_mysqldrbd [mysqldrbd]
Masters: [ cml1 ]
Slaves: [ cml2 ]
mystore (ocf::heartbeat:Filesystem): Started cml1
mysqld (systemd:mariadb): Started cml1
myvip (ocf::heartbeat:IPaddr): Started cml1
#### mysql has been started.
(10) Add the VIP resource to provide the floating virtual IP:
crm(live)# configure
crm(live)configure# primitive myvip ocf:heartbeat:IPaddr params ip="192.168.5.250" op monitor interval=20 timeout=20 on-fail=restart
crm(live)configure# colocation vip_with_ms_mysqldrbd inf:ms_mysqldrbd:Master myvip
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node 1: cml1 \
attributes standby=off
node 2: cml2 \
attributes standby=off
primitive mysqld systemd:mariadb \
op start timeout=100 interval=0 \
op stop timeout=100 interval=0
primitive mysqldrbd ocf:linbit:drbd \
params drbd_resource=mysql \
op start timeout=240 interval=0 \
op stop timeout=100 interval=0 \
op monitor role=Master interval=20 timeout=30 \
op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
params device="/dev/drbd1" directory="/data" fstype=ext4 \
op start timeout=60 interval=0 \
op stop timeout=60 interval=0
primitive myvip IPaddr \
params ip=192.168.5.250 \
op monitor interval=20 timeout=20 on-fail=restart
ms ms_mysqldrbd mysqldrbd \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mysqld_after_mystore Mandatory: mystore mysqld
colocation mysqld_with_mystore inf: mysqld mystore
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
colocation vip_with_ms_mysqldrbd inf: ms_mysqldrbd:Master myvip
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.16-12.el7_4.4-94ff4df \
cluster-infrastructure=corosync \
cluster-name=webcluster \
stonith-enabled=false \
no-quorum-policy=ignore
property migration-limit=1
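The resources and constraints from steps (1) to (10) can also be kept in one file and loaded in a single step instead of being typed interactively. The following is a sketch under stated assumptions: the file name `mysql-ha.crm`, the `CRM_CONF` variable and the `APPLY` switch are illustrative and not part of the original walkthrough, and the definitions simply restate the ones shown above. By default the script only writes the file; it calls `crm configure load update` only when `APPLY=1` is set and crm is installed.

```shell
#!/bin/sh
# Assumption: file location and the APPLY switch are illustrative only.
conf="${CRM_CONF:-${TMPDIR:-/tmp}/mysql-ha.crm}"

# Write the resource and constraint definitions in crmsh syntax.
cat > "$conf" <<'EOF'
primitive mysqldrbd ocf:linbit:drbd \
    params drbd_resource=mysql \
    op start timeout=240 interval=0 \
    op stop timeout=100 interval=0 \
    op monitor role=Master interval=20 timeout=30 \
    op monitor role=Slave interval=30 timeout=30
ms ms_mysqldrbd mysqldrbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
primitive mystore ocf:heartbeat:Filesystem \
    params device=/dev/drbd1 directory=/data fstype=ext4 \
    op start timeout=60 interval=0 \
    op stop timeout=60 interval=0
primitive mysqld systemd:mariadb \
    op start timeout=100 interval=0 \
    op stop timeout=100 interval=0
primitive myvip ocf:heartbeat:IPaddr \
    params ip=192.168.5.250 \
    op monitor interval=20 timeout=20 on-fail=restart
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mysqld_with_mystore inf: mysqld mystore
order mysqld_after_mystore Mandatory: mystore mysqld
colocation vip_with_ms_mysqldrbd inf: ms_mysqldrbd:Master myvip
EOF

echo "wrote $conf"

# Load into the live cluster only when explicitly requested.
if [ "${APPLY:-0}" = "1" ] && command -v crm >/dev/null 2>&1; then
    crm configure load update "$conf"
fi
```

Keeping the definitions in a file makes the chain easy to review: DRBD master, then filesystem, then mysqld, with the VIP tied to the DRBD master.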
VI. Testing:
With everything in place, test the VIP:
[root@cml1 ~]# df -TH
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/centos-root xfs 19G 6.8G 13G 36% /
devtmpfs devtmpfs 501M 0 501M 0% /dev
tmpfs tmpfs 512M 143M 370M 28% /dev/shm
tmpfs tmpfs 512M 20M 493M 4% /run
tmpfs tmpfs 512M 0 512M 0% /sys/fs/cgroup
/dev/sda1 xfs 521M 161M 361M 31% /boot
tmpfs tmpfs 103M 0 103M 0% /run/user/0
/dev/drbd1 ext4 11G 69M 9.9G 1% /data
[root@cml1 ~]# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.56-MariaDB MariaDB Server
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+---------------------+
| Database |
+---------------------+
| information_schema |
| cml |
| cml2 |
| cmltest |
| #mysql50#lost+found |
| mysql |
| performance_schema |
| test |
| testcml |
+---------------------+
9 rows in set (0.01 sec)
### Grant remote access:
MariaDB [(none)]> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)
### Connect to the VIP from the client:
[root@cml3 ~]# mysql -uroot -p123456 -h192.168.5.250
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 5.5.56-MariaDB MariaDB Server
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+---------------------+
| Database |
+---------------------+
| information_schema |
| cml |
| cml2 |
| cmltest |
| #mysql50#lost+found |
| mysql |
| performance_schema |
| test |
| testcml |
+---------------------+
9 rows in set (0.00 sec)
### Now put cml1 into standby and then access the database from cml2:
crm(live)# node standby
crm(live)# status
Stack: corosync
Current DC: cml1 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Sun Oct 22 14:54:14 2017
Last change: Sun Oct 22 14:54:02 2017 by root via crm_attribute on cml1
2 nodes configured
5 resources configured
Node cml1: standby
Online: [ cml2 ]
Full list of resources:
Master/Slave Set: ms_mysqldrbd [mysqldrbd]
Masters: [ cml2 ]
Stopped: [ cml1 ]
mystore (ocf::heartbeat:Filesystem): Started cml2
mysqld (systemd:mariadb): Started cml2
myvip (ocf::heartbeat:IPaddr): Started cml2
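When repeating this failover test, it is convenient to extract from the status output where each resource ended up. A minimal sketch that parses status text in the format shown above, read from standard input; the function names `resource_node` and `drbd_masters` are assumptions.

```shell
#!/bin/sh
# resource_node NAME
# Reads cluster status text on stdin and prints the node on which
# the primitive resource NAME is reported as Started.
resource_node() {
    awk -v r="$1" '$1 == r && /Started/ { print $NF }'
}

# drbd_masters
# Reads cluster status text on stdin and prints the node names
# listed between the brackets of the "Masters: [ ... ]" line.
drbd_masters() {
    awk '$1 == "Masters:" { for (i = 3; i < NF; i++) print $i }'
}
```

For example, `crm status | resource_node mysqld` would be expected to print cml2 after the standby above, and `crm status | drbd_masters` should name the same node, since the constraints tie all resources to the DRBD master.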
### Connect on cml2 (the VIP has moved over, and the data has moved with it):
[root@cml2 ~]# mysql -uroot -p123456 -h 192.168.5.250
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 5.5.56-MariaDB MariaDB Server
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+---------------------+
| Database |
+---------------------+
| information_schema |
| cml |
| cml2 |
| cmltest |
| #mysql50#lost+found |
| mysql |
| performance_schema |
| test |
| testcml |
+---------------------+
9 rows in set (0.00 sec)