前言
MySQL配置主从后,从库作为容灾或只读库使用,如果主库宕机,需要DBA手工执行主从切换,然后通知应用层修改数据库配置。
宕机时间长,需要人工干预,严重影响用户的使用。
keepalived是由C语言编写的路由软件,是lvs的扩展项目,主要用作RealServer的健康状态检查以及LoadBalance主机和BackUP主机之间failover的实现。
当出现主库宕机的时候,vip会漂到备库,因为应用是通过vip进行连接的,所以除了漂移和切换过程中会存在短暂的不可用,DBA和开发人员无用认为干预,切换完成后,应用恢复正常。
环境准备
服务器类别 | IP | 数据库版本 | |
主库 | 192.168.10.102 | 8.0 | CentOS release 7.0 |
备库 | 192.168.10.103 | 8.0 | CentOS release 7.0 |
192.168.10.240(虚拟IP) |
主从环境已经搭建完毕
虚拟IP需要同网段未使用的IP,不然会存在网络冲突
Keepalived安装
备注:主从需同时安装
下载地址:
Keepalived for Linux
链接:https://pan.baidu.com/s/1rHxokSj_8ONibYh-neIWug
提取码:yyds
--来自百度网盘超级会员V5的分享
安装步骤:
源码包编译
cd /usr/local/src
wget https://www.keepalived.org/software/keepalived-2.2.7.tar.gz
tar xvf keepalived-2.2.7.tar.gz
cd keepalived-2.2.7
./configure --prefix=/usr/local/keepalived
make && make install
去 keepalived 源码目录下拷贝启动脚本到/etc/init.d/下
cd /usr/local/src/keepalived-2.2.7
#把对应的启动脚本拷贝到/etc/init.d,方便用service命令进行操作
cp /usr/local/src/keepalived-2.2.7/keepalived/etc/init.d/keepalived /etc/init.d/keepalived
ll /etc/init.d/ | grep keepalived
修改/etc/init.d/keepalived启动脚本
/etc/init.d/keepalived
默认配置:
pidfile: /var/run/keepalived.pid
config: /etc/keepalived/keepalived.conf
#把配置文件移动到启动应用默认找的路径/etc/keepalived/keepalived.conf
mv /usr/local/keepalived/etc/keepalived/keepalived.conf.sample /etc/keepalived/keepalived.conf
服务的启动停止
service keepalived start|stop|restart
主从配置文件
keepalived配置文件
模板
global_defs {
#主机标识符,主从可以相同也可以不相同
router_id MySQL-HA
}
vrrp_script check_run {
#文件用以检测MySQL服务是否正常,当发现连接不上mysql,自动把keepalived进程杀掉,让VIP进行漂移。
script "/home/mysql/mysql_check.sh"
interval 60
}
vrrp_sync_group VG1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
#表示keepalived角色,都是设成BACKUP则以优先级为主要参考
state BACKUP
#指定HA监听的网络接口 ifconfig看到的网络接口名,此处是ens33
interface ens33
#虚拟路由标识,取值0-255,master-1和master-2保持一致
virtual_router_id 51
#优先级,用来选举master,取值范围1-255
priority 100
#发VRRP包时间间隔,即多久进行一次master选举
advert_int 1
#不抢占,即允许一个priority值比较低的一个节点作为master
nopreempt
#执行对应的检查脚本
track_script {
check_run
}
#/home/mysql/master.sh的作用是状态改为master以后执行的脚本。首先判断复制是否有延迟,如果有延迟,等1分钟后,不论是否有延迟,都并停止复制,并且记录binlog和pos点。
notify_master /home/mysql/master.sh
#/home/mysql/stop.sh表示Keepalived停止以后需要执行的脚本。检查是否还有复制写入操作,最后无论是否执行完毕都退出。
notify_stop /home/mysql/stop.sh
virtual_ipaddress {
192.168.10.240
}
}
主库配置文件
vi /etc/keepalived/keepalived.conf
global_defs {
router_id MySQL-HA
}
vrrp_script check_run {
script "/home/mysql/mysql_check.sh"
interval 60
}
vrrp_sync_group VG1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
nopreempt
track_script {
check_run
}
notify_master /home/mysql/master.sh
notify_stop /home/mysql/stop.sh
virtual_ipaddress {
192.168.10.240
}
}
从库配置文件
global_defs {
router_id MySQL-HA
}
vrrp_script check_run {
script "/home/mysql/mysql_check.sh"
interval 60
}
vrrp_sync_group VG1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 90
advert_int 1
nopreempt
track_script {
check_run
}
notify_master /home/mysql/master.sh
notify_stop /home/mysql/stop.sh
virtual_ipaddress {
192.168.10.240
}
}
master与slave的keepalived配置文件中只有priority设置不同,master为100,slave为90,其它全一样。配置文件是以块形式组织的,每个块都在{}包围的范围内,#和!开头的行都是注释。
keepalived配置文件参数介绍
检查及启停脚本
主备两边的脚本相同
主库主要执行的操作
/home/mysql/mysql_check.sh文件用以检测MySQL服务是否正常,当发现连接不上mysql,自动把keepalived进程杀掉,让VIP进行漂移。
从库切换以后要执行的操作
/home/mysql/master.sh的作用是状态改为master以后执行的脚本。首先判断复制是否有延迟,如果有延迟,等1分钟后,不论是否有延迟,都并停止复制,并且记录binlog和pos点。
主库主要执行的操作
/home/mysql/stop.sh表示Keepalived停止以后需要执行的脚本。检查是否还有复制写入操作,最后无论是否执行完毕都退出。
vi /home/mysql/mysql_check.sh
#!/bin/bash
count=1
while true
do
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show status;" > /dev/null 2>&1
i=$?
ps aux | grep mysqld | grep -v grep > /dev/null 2>&1
j=$?
if [ $i = 0 ] && [ $j = 0 ]
then
exit 0
else
if [ $i = 1 ] && [ $j = 0 ]
then
exit 0
else
if [ $count -gt 5 ]
then
break
fi
let count++
continue
fi
fi
done
/etc/init.d/keepalived stop
vi /home/mysql/master.sh
#!/bin/bash
Master_Log_File=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Master_Log_File | awk -F": " '{print $2}')
Relay_Master_Log_File=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Relay_Master_Log_File | awk -F": " '{print $2}')
Read_Master_Log_Pos=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Read_Master_Log_Pos | awk -F": " '{print $2}')
Exec_Master_Log_Pos=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Exec_Master_Log_Pos | awk -F": " '{print $2}')
i=1
while true
do
if [ $Master_Log_File = $Relay_Master_Log_File ] && [ $Read_Master_Log_Pos -eq $Exec_Master_Log_Pos ]
then
echo "ok"
break
else
sleep 1
if [ $i -gt 60 ]
then
break
fi
continue
let i++
fi
done
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "stop slave;"
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "reset slave all;"
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status;" > /tmp/master_status_$(date "+%y%m%d-%H%M").txt
vi /home/mysql/stop.sh
#!/bin/bash
M_File1=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/File/{print $2}')
M_Position1=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/Position/{print $2}')
sleep 1
M_File2=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/File/{print $2}')
M_Position2=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/Position/{print $2}')
i=1
while true
do
if [ $M_File1 = $M_File1 ] && [ $M_Position1 -eq $M_Position2 ]
then
echo "ok"
break
else
sleep 1
if [ $i -gt 60 ]
then
break
fi
continue
let i++
fi
done
启动keepalived
service keepalived start|stop|restart
主库查看vip是否绑定在eth2网卡上
如果未成功,查看日志,并调整配置文件
[root@mater ~]# tail -100f /var/log/messages
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering Kernel netlink reflector
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering Kernel netlink command channel
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Registering Kernel netlink reflector
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Registering Kernel netlink command channel
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering gratuitous ARP shared channel
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Configuration is using : 66310 Bytes
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Using LinkWatch kernel netlink reflector...
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Configuration is using : 6333 Bytes
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Using LinkWatch kernel netlink reflector...
Aug 22 20:51:13 mater Keepalived[5281]: Stopping Keepalived v1.2.13 (08/22,2020)
Aug 22 20:52:07 mater Keepalived[5364]: Starting Keepalived v1.2.13 (08/22,2020)
Aug 22 20:52:07 mater Keepalived[5365]: Starting Healthcheck child process, pid=5367
Aug 22 20:52:07 mater Keepalived[5365]: Starting VRRP child process, pid=5368
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering Kernel netlink reflector
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering Kernel netlink command channel
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering gratuitous ARP shared channel
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Registering Kernel netlink reflector
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Registering Kernel netlink command channel
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Configuration is using : 66310 Bytes
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Using LinkWatch kernel netlink reflector...
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Configuration is using : 6333 Bytes
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Using LinkWatch kernel netlink reflector...
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP_Script(check_run) succeeded
Aug 22 20:52:30 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Transition to MASTER STATE
Aug 22 20:52:30 mater Keepalived_vrrp[5368]: VRRP_Group(VG1) Syncing instances to MASTER state
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Entering MASTER STATE
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) setting protocol VIPs.
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth2 for 192.168.148.240
Aug 22 20:52:31 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP 192.168.148.240 added
Aug 22 20:52:33 mater ntpd[1532]: Listen normally on 6 eth2 192.168.148.240 UDP 123
Aug 22 20:52:33 mater ntpd[1532]: peers refreshed
Aug 22 20:52:36 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth2 for 192.168.148.240
Aug 22 21:00:11 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:00:11 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:00:11 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 753 seconds.
Aug 22 21:00:11 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:00:11 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 21:00:11 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 21:00:11 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 21:00:11 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 21:00:11 mater NetworkManager[1336]: <info> domain name 'localdomain'
Aug 22 21:12:44 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:12:44 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:12:44 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 740 seconds.
Aug 22 21:12:44 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:12:44 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 21:12:44 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 21:12:44 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 21:12:44 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 21:12:44 mater NetworkManager[1336]: <info> domain name 'localdomain'
Aug 22 21:25:04 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:25:04 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:25:04 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:25:04 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 21:25:04 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 21:25:04 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 21:25:04 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 21:25:04 mater NetworkManager[1336]: <info> domain name 'localdomain'
Aug 22 21:25:04 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 859 seconds.
Aug 22 21:39:23 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:39:23 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:39:23 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:39:23 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 21:39:23 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 21:39:23 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 21:39:23 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 21:39:23 mater NetworkManager[1336]: <info> domain name 'localdomain'
Aug 22 21:39:23 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 728 seconds.
Aug 22 21:51:31 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:51:31 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:51:31 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:51:31 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 21:51:31 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 21:51:31 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 21:51:31 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 21:51:31 mater NetworkManager[1336]: <info> domain name 'localdomain'
Aug 22 21:51:31 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 843 seconds.
Aug 22 22:05:34 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 22:05:34 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 22:05:34 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 870 seconds.
Aug 22 22:05:34 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 22:05:34 mater NetworkManager[1336]: <info> address 192.168.148.143
Aug 22 22:05:34 mater NetworkManager[1336]: <info> prefix 24 (255.255.255.0)
Aug 22 22:05:34 mater NetworkManager[1336]: <info> gateway 192.168.148.2
Aug 22 22:05:34 mater NetworkManager[1336]: <info> nameserver '192.168.148.2'
Aug 22 22:05:34 mater NetworkManager[1336]: <info> domain name 'localdomain'
测试
通过vip远程登录mysql
通过vip远程登录mysql
建表并录入数据
mysql> create table t2(id int,name varchar(100));
Query OK, 0 rows affected (0.07 sec)
mysql>
mysql> insert into t2 values (1,'abc');
Query OK, 1 row affected (0.02 sec)
模拟mysql 主库 crash
主库执行
pkill -9 mysqld
[root@mater mysql]# pkill -9 mysqld
[root@mater mysql]#
[root@mater mysql]#
[root@mater mysql]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:ea:02:f7 brd ff:ff:ff:ff:ff:ff
inet 192.168.148.143/24 brd 192.168.148.255 scope global eth2
inet6 fe80::20c:29ff:feea:2f7/64 scope link
valid_lft forever preferred_lft forever
查看vip是否飘到备库
[root@candicate ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:05:64:83 brd ff:ff:ff:ff:ff:ff
inet 192.168.148.144/24 brd 192.168.148.255 scope global eth2
inet 192.168.148.240/32 scope global eth2
inet6 fe80::20c:29ff:fe05:6483/64 scope link
valid_lft forever preferred_lft forever
[root@candicate ~]#
此时客户端正常连接
查看切换的时候的binlog位置
[root@candicate tmp]# more /tmp/master_status_200822-2249.txt
File Position Binlog_Do_DB Binlog_Ignore_DB Executed_Gtid_Set
binlog.000001 120
切换后的处理
切换后,排查主库问题,启动主库,然后将主库变成新主库(原从库)的从库
测试过,当新主库异常关闭后,会自动切换到新的从库