前言

MySQL配置主从后,从库作为容灾或只读库使用,如果主库宕机,需要DBA手工执行主从切换,然后通知应用层修改数据库配置。
宕机时间长,需要人工干预,严重影响用户的使用。

keepalived是由C语言编写的路由软件,是lvs的扩展项目,主要用作RealServer的健康状态检查以及LoadBalance主机和BackUP主机之间failover的实现。

当出现主库宕机的时候,vip会漂到备库,因为应用是通过vip进行连接的,所以除了漂移和切换过程中会存在短暂的不可用,DBA和开发人员无用认为干预,切换完成后,应用恢复正常。

环境准备

服务器类别

IP

数据库版本

主库

192.168.10.102

8.0

CentOS release 7.0

备库

192.168.10.103

8.0

CentOS release 7.0

192.168.10.240(虚拟IP)

主从环境已经搭建完毕
虚拟IP需要同网段未使用的IP,不然会存在网络冲突 

Keepalived安装 

备注:主从需同时安装
下载地址:
Keepalived for Linux

链接:https://pan.baidu.com/s/1rHxokSj_8ONibYh-neIWug 
提取码:yyds 
--来自百度网盘超级会员V5的分享 

安装步骤:

源码包编译

cd /usr/local/src
wget https://www.keepalived.org/software/keepalived-2.2.7.tar.gz
tar xvf keepalived-2.2.7.tar.gz
cd keepalived-2.2.7
./configure --prefix=/usr/local/keepalived
make && make install

去 keepalived 源码目录下拷贝启动脚本到/etc/init.d/下

cd /usr/local/src/keepalived-2.2.7
#把对应的启动脚本拷贝到/etc/init.d,方便用service命令进行操作
cp /usr/local/src/keepalived-2.2.7/keepalived/etc/init.d/keepalived /etc/init.d/keepalived
ll /etc/init.d/ | grep keepalived

修改/etc/init.d/keepalived启动脚本

/etc/init.d/keepalived

默认配置:

pidfile: /var/run/keepalived.pid
config: /etc/keepalived/keepalived.conf
#把配置文件移动到启动应用默认找的路径/etc/keepalived/keepalived.conf
mv  /usr/local/keepalived/etc/keepalived/keepalived.conf.sample  /etc/keepalived/keepalived.conf

服务的启动停止

service keepalived start|stop|restart

主从配置文件 

keepalived配置文件

模板

global_defs {
   #主机标识符,主从可以相同也可以不相同
   router_id MySQL-HA
} 

vrrp_script check_run {
#文件用以检测MySQL服务是否正常,当发现连接不上mysql,自动把keepalived进程杀掉,让VIP进行漂移。
script "/home/mysql/mysql_check.sh"
interval 60
}

vrrp_sync_group VG1 {
group {
VI_1
}
}

vrrp_instance VI_1 {
    #表示keepalived角色,都是设成BACKUP则以优先级为主要参考
    state BACKUP
	#指定HA监听的网络接口 ifconfig看到的网络接口名,此处是ens33
    interface ens33
	#虚拟路由标识,取值0-255,master-1和master-2保持一致
    virtual_router_id 51
	#优先级,用来选举master,取值范围1-255
    priority 100  
	#发VRRP包时间间隔,即多久进行一次master选举
    advert_int 1
	#不抢占,即允许一个priority值比较低的一个节点作为master
    nopreempt
	#执行对应的检查脚本
    track_script {
    check_run
    }
    #/home/mysql/master.sh的作用是状态改为master以后执行的脚本。首先判断复制是否有延迟,如果有延迟,等1分钟后,不论是否有延迟,都并停止复制,并且记录binlog和pos点。
    notify_master /home/mysql/master.sh
    #/home/mysql/stop.sh表示Keepalived停止以后需要执行的脚本。检查是否还有复制写入操作,最后无论是否执行完毕都退出。
    notify_stop /home/mysql/stop.sh

    virtual_ipaddress {
        192.168.10.240
    }
}

主库配置文件

vi /etc/keepalived/keepalived.conf
global_defs {
   router_id MySQL-HA
} 

vrrp_script check_run {
script "/home/mysql/mysql_check.sh"
interval 60
}

vrrp_sync_group VG1 {
group {
VI_1
}
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100  
    advert_int 1
    nopreempt
    track_script {
    check_run
    }
    notify_master /home/mysql/master.sh
    notify_stop /home/mysql/stop.sh

    virtual_ipaddress {
        192.168.10.240
    }
}

从库配置文件

global_defs {
   router_id MySQL-HA
} 

vrrp_script check_run {
script "/home/mysql/mysql_check.sh"
interval 60
}

vrrp_sync_group VG1 {
group {
VI_1
}
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90  
    advert_int 1
    nopreempt
    track_script {
    check_run
    }
    notify_master /home/mysql/master.sh
    notify_stop /home/mysql/stop.sh

    virtual_ipaddress {
        192.168.10.240
    }
}

 

master与slave的keepalived配置文件中只有priority设置不同,master为100,slave为90,其它全一样。配置文件是以块形式组织的,每个块都在{}包围的范围内,#和!开头的行都是注释。 

keepalived配置文件参数介绍

MySQL 如何保证高可用 mysql高可用keepalived_数据库

检查及启停脚本 

主备两边的脚本相同

主库主要执行的操作

/home/mysql/mysql_check.sh文件用以检测MySQL服务是否正常,当发现连接不上mysql,自动把keepalived进程杀掉,让VIP进行漂移。

从库切换以后要执行的操作

/home/mysql/master.sh的作用是状态改为master以后执行的脚本。首先判断复制是否有延迟,如果有延迟,等1分钟后,不论是否有延迟,都并停止复制,并且记录binlog和pos点。

主库主要执行的操作 

/home/mysql/stop.sh表示Keepalived停止以后需要执行的脚本。检查是否还有复制写入操作,最后无论是否执行完毕都退出。

vi /home/mysql/mysql_check.sh
#!/bin/bash
count=1

while true
do

mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show status;" > /dev/null 2>&1
i=$?
ps aux | grep mysqld | grep -v grep > /dev/null 2>&1
j=$?
if [ $i = 0 ] && [ $j = 0 ]
then
   exit 0
else
   if [ $i = 1 ] && [ $j = 0 ]
   then
       exit 0
   else
        if [ $count -gt 5 ]
        then
              break
        fi
   let count++
   continue
   fi
fi

done

/etc/init.d/keepalived stop
vi /home/mysql/master.sh
#!/bin/bash

Master_Log_File=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Master_Log_File | awk -F": " '{print $2}')
Relay_Master_Log_File=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Relay_Master_Log_File | awk -F": " '{print $2}')
Read_Master_Log_Pos=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Read_Master_Log_Pos | awk -F": " '{print $2}')
Exec_Master_Log_Pos=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show slave status\G" | grep -w Exec_Master_Log_Pos | awk -F": " '{print $2}')

i=1

while true
do

if [ $Master_Log_File = $Relay_Master_Log_File ] && [ $Read_Master_Log_Pos -eq $Exec_Master_Log_Pos ]
then
   echo "ok"
   break
else
   sleep 1

   if [ $i -gt 60 ]
   then
      break
   fi
   continue
   let i++
fi
done

mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "stop slave;"
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "reset slave all;"
mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status;" > /tmp/master_status_$(date "+%y%m%d-%H%M").txt
vi /home/mysql/stop.sh
#!/bin/bash

M_File1=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/File/{print $2}')
M_Position1=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/Position/{print $2}')
sleep 1
M_File2=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/File/{print $2}')
M_Position2=$(mysql -utest -pMysql@123 -S /var/lib/mysql/mysql.sock -e "show master status\G" | awk -F': ' '/Position/{print $2}')

i=1

while true
do

if [ $M_File1 = $M_File1 ] && [ $M_Position1 -eq $M_Position2 ]
then
   echo "ok"
   break
else
   sleep 1

   if [ $i -gt 60 ]
   then
      break
   fi
   continue
   let i++
fi
done

启动keepalived

service keepalived start|stop|restart

主库查看vip是否绑定在eth2网卡上

MySQL 如何保证高可用 mysql高可用keepalived_数据库_02

如果未成功,查看日志,并调整配置文件

[root@mater ~]# tail -100f /var/log/messages
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering Kernel netlink reflector
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering Kernel netlink command channel
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Registering Kernel netlink reflector
Aug 22 20:50:53 mater Keepalived_healthcheckers[5283]: Registering Kernel netlink command channel
Aug 22 20:50:53 mater Keepalived_vrrp[5284]: Registering gratuitous ARP shared channel
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Configuration is using : 66310 Bytes
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: Using LinkWatch kernel netlink reflector...
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Configuration is using : 6333 Bytes
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 22 20:51:13 mater Keepalived_vrrp[5284]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Aug 22 20:51:13 mater Keepalived_healthcheckers[5283]: Using LinkWatch kernel netlink reflector...
Aug 22 20:51:13 mater Keepalived[5281]: Stopping Keepalived v1.2.13 (08/22,2020)
Aug 22 20:52:07 mater Keepalived[5364]: Starting Keepalived v1.2.13 (08/22,2020)
Aug 22 20:52:07 mater Keepalived[5365]: Starting Healthcheck child process, pid=5367
Aug 22 20:52:07 mater Keepalived[5365]: Starting VRRP child process, pid=5368
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering Kernel netlink reflector
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering Kernel netlink command channel
Aug 22 20:52:07 mater Keepalived_vrrp[5368]: Registering gratuitous ARP shared channel
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP 192.168.148.143 added
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP fe80::20c:29ff:feea:2f7 added
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Registering Kernel netlink reflector
Aug 22 20:52:07 mater Keepalived_healthcheckers[5367]: Registering Kernel netlink command channel
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Configuration is using : 66310 Bytes
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: Using LinkWatch kernel netlink reflector...
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Configuration is using : 6333 Bytes
Aug 22 20:52:27 mater Keepalived_healthcheckers[5367]: Using LinkWatch kernel netlink reflector...
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Aug 22 20:52:27 mater Keepalived_vrrp[5368]: VRRP_Script(check_run) succeeded
Aug 22 20:52:30 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Transition to MASTER STATE
Aug 22 20:52:30 mater Keepalived_vrrp[5368]: VRRP_Group(VG1) Syncing instances to MASTER state
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Entering MASTER STATE
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) setting protocol VIPs.
Aug 22 20:52:31 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth2 for 192.168.148.240
Aug 22 20:52:31 mater Keepalived_healthcheckers[5367]: Netlink reflector reports IP 192.168.148.240 added
Aug 22 20:52:33 mater ntpd[1532]: Listen normally on 6 eth2 192.168.148.240 UDP 123
Aug 22 20:52:33 mater ntpd[1532]: peers refreshed
Aug 22 20:52:36 mater Keepalived_vrrp[5368]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth2 for 192.168.148.240
Aug 22 21:00:11 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:00:11 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:00:11 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 753 seconds.
Aug 22 21:00:11 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:00:11 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 21:00:11 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 21:00:11 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 21:00:11 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 21:00:11 mater NetworkManager[1336]: <info>   domain name 'localdomain'
Aug 22 21:12:44 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:12:44 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:12:44 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 740 seconds.
Aug 22 21:12:44 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:12:44 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 21:12:44 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 21:12:44 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 21:12:44 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 21:12:44 mater NetworkManager[1336]: <info>   domain name 'localdomain'
Aug 22 21:25:04 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:25:04 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:25:04 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:25:04 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 21:25:04 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 21:25:04 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 21:25:04 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 21:25:04 mater NetworkManager[1336]: <info>   domain name 'localdomain'
Aug 22 21:25:04 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 859 seconds.
Aug 22 21:39:23 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:39:23 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:39:23 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:39:23 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 21:39:23 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 21:39:23 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 21:39:23 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 21:39:23 mater NetworkManager[1336]: <info>   domain name 'localdomain'
Aug 22 21:39:23 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 728 seconds.
Aug 22 21:51:31 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 21:51:31 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 21:51:31 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 21:51:31 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 21:51:31 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 21:51:31 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 21:51:31 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 21:51:31 mater NetworkManager[1336]: <info>   domain name 'localdomain'
Aug 22 21:51:31 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 843 seconds.
Aug 22 22:05:34 mater dhclient[1386]: DHCPREQUEST on eth2 to 192.168.148.254 port 67 (xid=0x41f5cfe)
Aug 22 22:05:34 mater dhclient[1386]: DHCPACK from 192.168.148.254 (xid=0x41f5cfe)
Aug 22 22:05:34 mater dhclient[1386]: bound to 192.168.148.143 -- renewal in 870 seconds.
Aug 22 22:05:34 mater NetworkManager[1336]: <info> (eth2): DHCPv4 state changed renew -> renew
Aug 22 22:05:34 mater NetworkManager[1336]: <info>   address 192.168.148.143
Aug 22 22:05:34 mater NetworkManager[1336]: <info>   prefix 24 (255.255.255.0)
Aug 22 22:05:34 mater NetworkManager[1336]: <info>   gateway 192.168.148.2
Aug 22 22:05:34 mater NetworkManager[1336]: <info>   nameserver '192.168.148.2'
Aug 22 22:05:34 mater NetworkManager[1336]: <info>   domain name 'localdomain'

测试 

通过vip远程登录mysql

通过vip远程登录mysql

MySQL 如何保证高可用 mysql高可用keepalived_数据库_03

建表并录入数据

mysql> create table t2(id int,name varchar(100));
Query OK, 0 rows affected (0.07 sec)

mysql>
mysql> insert into t2 values (1,'abc');
Query OK, 1 row affected (0.02 sec)

模拟mysql 主库 crash

主库执行

pkill -9 mysqld
[root@mater mysql]# pkill -9 mysqld
[root@mater mysql]# 
[root@mater mysql]# 
[root@mater mysql]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:ea:02:f7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.148.143/24 brd 192.168.148.255 scope global eth2
    inet6 fe80::20c:29ff:feea:2f7/64 scope link 
       valid_lft forever preferred_lft forever

查看vip是否飘到备库

[root@candicate ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:05:64:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.148.144/24 brd 192.168.148.255 scope global eth2
    inet 192.168.148.240/32 scope global eth2
    inet6 fe80::20c:29ff:fe05:6483/64 scope link 
       valid_lft forever preferred_lft forever
[root@candicate ~]#

此时客户端正常连接

查看切换的时候的binlog位置

[root@candicate tmp]# more /tmp/master_status_200822-2249.txt
File Position Binlog_Do_DB Binlog_Ignore_DB Executed_Gtid_Set
binlog.000001 120
切换后的处理

切换后,排查主库问题,启动主库,然后将主库变成新主库(原从库)的从库
测试过,当新主库异常关闭后,会自动切换到新的从库