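How keepalived + MySQL dual-master works:

(1) When keepalived starts on master 1, it checks whether the MySQL service is alive. If it is, keepalived enters the MASTER state and acquires the VIP.

(2) When keepalived starts on master 2, it also checks whether MySQL is alive, then checks whether the keepalived group already has a node in the MASTER state. If it does, keepalived on master 2 enters the BACKUP state, ready to take over the VIP at any time.

(3) If MySQL on master 1 goes down, keepalived there enters the FAULT state and releases the VIP; keepalived on master 2 then switches to the MASTER state and acquires the VIP.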
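
The three rules above can be written down as a tiny decision function. This is only a sketch of the behaviour described in this post, not keepalived's real implementation; the names State and next_state are made up for illustration.

```python
from enum import Enum


class State(Enum):
    MASTER = "MASTER"
    BACKUP = "BACKUP"
    FAULT = "FAULT"


def next_state(mysql_alive: bool, group_has_master: bool) -> State:
    """Pick the keepalived state for one node, following rules (1)-(3)."""
    if not mysql_alive:
        # Rule (3): the health check failed, so go to FAULT and release the VIP.
        return State.FAULT
    if group_has_master:
        # Rule (2): a healthy peer already holds the VIP, so stand by.
        return State.BACKUP
    # Rule (1): MySQL is alive and nobody holds the VIP, so take it.
    return State.MASTER


# Walk through the scenario from the text.
m1 = next_state(mysql_alive=True, group_has_master=False)
m2 = next_state(mysql_alive=True, group_has_master=(m1 is State.MASTER))
print(m1.value, m2.value)

# MySQL on master 1 dies: m1 drops out, m2 no longer sees a master.
m1 = next_state(mysql_alive=False, group_has_master=False)
m2 = next_state(mysql_alive=True, group_has_master=(m1 is State.MASTER))
print(m1.value, m2.value)
```

The first print shows the normal state (m1 MASTER, m2 BACKUP); the second shows the state after MySQL on m1 has failed (m1 FAULT, m2 MASTER).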


Test environment:

OS: CentOS release 6.6 (Final)

Database: MySQL 5.7.14

A: master: 192.168.91.23

B: slave: 192.168.91.22

VIP: 192.168.91.100


Set the same system time on both hosts:

date -s "20170227 16:25"

hwclock --systohc


Replication parameters:

A:

server_id = 330623

gtid_mode=ON

log_slave_updates = 0

enforce_gtid_consistency = ON

auto_increment_offset =1

auto_increment_increment =2

B:

server_id = 330622

gtid_mode=ON

log_slave_updates = 0

enforce_gtid_consistency = ON

auto_increment_offset=2

auto_increment_increment=2
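
To see why A uses auto_increment_offset=1 and B uses auto_increment_offset=2 with auto_increment_increment=2 on both, here is a small simulation of how the two servers pick AUTO_INCREMENT values. It is a simplified model of the rule "next value is the smallest offset + N * increment above the current maximum", not MySQL code; the function name is made up for illustration.

```python
def next_auto_increment(current_max: int, offset: int, increment: int) -> int:
    """Smallest value > current_max of the form offset + N * increment (N >= 0)."""
    if current_max < offset:
        return offset
    steps = (current_max - offset) // increment + 1
    return offset + steps * increment


# Settings from the parameter lists above.
A = {"offset": 1, "increment": 2}
B = {"offset": 2, "increment": 2}

# Alternate inserts on A and B against the same (replicated) table.
current_max = 0
ids_from_a, ids_from_b = [], []
for i in range(6):
    server, bucket = (A, ids_from_a) if i % 2 == 0 else (B, ids_from_b)
    current_max = next_auto_increment(current_max, server["offset"], server["increment"])
    bucket.append(current_max)

print("A generated:", ids_from_a)
print("B generated:", ids_from_b)
```

A only ever hands out odd values and B only even ones, so two inserts that hit both masters at the same moment cannot collide on the auto-increment key.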


Configure A and B as each other's master and slave:

A:

Create the replication account:

create user rep@'192.168.91.%' identified by '147258';

grant replication slave on *.* to rep@'192.168.91.%';

Take a full backup of A and restore it on B (steps omitted here).


B: set A as B's master:

change master to master_host='192.168.91.23', master_port=3306, master_user='rep',master_password='147258', master_auto_position=1;

start slave;


A: set B as A's master:

change master to master_host='192.168.91.22', master_port=3306, master_user='rep',master_password='147258', master_auto_position=1;

start slave;


Create a monitoring account (used later by the checkMySQL.py script to check the MySQL status; the USAGE privilege alone is enough for this user):

GRANT REPLICATION CLIENT ON *.* TO 'monitor'@'%' IDENTIFIED BY 'm0n1tor';

Install keepalived on both A and B:

 yum install keepalived -y

 yum install MySQL-python -y


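keepalived configuration file on A:

[root@Darren1 keepalived]# cat << EOF > keepalived.conf

vrrp_script vs_mysql_23 {         # name it to suit your environment

    script "/etc/keepalived/checkMySQL.py -h 192.168.91.23 -P 3306"

    interval 60      # how often the check script runs, in seconds; this bounds how quickly a failure is detected

}

vrrp_instance VI_23 {          # name it to suit your environment

    state BACKUP               # start in the BACKUP state

    nopreempt                  # no preemption: if m1 fails and m2 takes over the VIP, m1 does not grab the VIP back after it restarts

    interface eth0             # network interface that carries the VIP

    virtual_router_id 23       # virtual router ID, 1-255; must differ from the ID used for router HA, and must be identical on every node of this cluster

    priority 100               # priority; within one vrrp_instance the MASTER must have a higher priority than the BACKUP

    advert_int 5

    authentication {

        auth_type PASS   # authentication type

        auth_pass 1111   # authentication password; do not exceed 8 characters

    }

    track_script {

       vs_mysql_23     # run this script: return code 0 keeps the VIP, return code 1 releases it

    }

    virtual_ipaddress {

        192.168.91.100     # VIP address

    }

}

EOF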


keepalived configuration file on B:

[root@Darren2 keepalived]# cat << EOF > keepalived.conf

vrrp_script vs_mysql_22 {

    script "/etc/keepalived/checkMySQL.py -h 192.168.91.22 -P 3306"    # checks B's own address; apart from this, the names and the priority, the file matches A's

    interval 60

}

vrrp_instance VI_22 {

    state BACKUP

    nopreempt

    interface eth0

    virtual_router_id 23

    priority 90

    advert_int 5

    authentication {

        auth_type PASS

        auth_pass 1111

    }

    track_script {

       vs_mysql_22

    }

    virtual_ipaddress {

        192.168.91.100

    }

}

EOF


What the checkMySQL.py script does (script body omitted here):

The script checks whether the MySQL process is alive: it returns 0 if it is and 1 if it is not.
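
Since the original script is not shown, here is a minimal stand-in with the same contract (return 0 when MySQL looks alive, 1 otherwise). It is not the author's checkMySQL.py. Assumptions: it is written for Python 3 and only tests whether the MySQL TCP port accepts connections, whereas the real script presumably logs in with the monitor account through MySQL-python. The function names are made up for illustration.

```python
import argparse
import socket


def mysql_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def main(argv: list) -> int:
    # add_help=False frees -h so it can mean "host", matching the
    # "-h 192.168.91.23 -P 3306" call in keepalived.conf.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-h", dest="host", required=True)
    parser.add_argument("-P", dest="port", type=int, default=3306)
    args = parser.parse_args(argv)
    # keepalived only looks at the exit code: 0 = healthy, 1 = failed.
    return 0 if mysql_port_open(args.host, args.port) else 1


# When saved as an executable script, finish the file with:
#     import sys; sys.exit(main(sys.argv[1:]))
```

A port probe is weaker than a real login: a hung mysqld can still accept TCP connections. Treat this as a placeholder that shows the exit-code contract, and replace the probe with a real query where that matters.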



Start keepalived on A and B:

/etc/init.d/keepalived start   (at first startup, whichever of A and B starts first gets the VIP)

 chkconfig --level 2345 keepalived on


keepalived startup process:

Start the keepalived service on A:

[root@Darren1 ~]# /etc/init.d/keepalived start

[root@Darren1 ~]# tail -f /var/log/messages       

May  9 14:41:05 Darren1 Keepalived[28172]: Starting Keepalived v1.2.13 (03/19,2015)

May  9 14:41:05 Darren1 Keepalived[28173]: Starting Healthcheck child process, pid=28175

May  9 14:41:05 Darren1 Keepalived[28173]: Starting VRRP child process, pid=28176

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Netlink reflector reports IP 192.168.91.23 added

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Netlink reflector reports IP 192.168.91.23 added

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Netlink reflector reports IP fe80::20c:29ff:fe56:5380 added

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Registering Kernel netlink reflector

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Registering Kernel netlink command channel

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Registering gratuitous ARP shared channel

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Opening file '/etc/keepalived/keepalived.conf'.

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Netlink reflector reports IP fe80::20c:29ff:fe56:5380 added

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Registering Kernel netlink reflector

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Registering Kernel netlink command channel

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Opening file '/etc/keepalived/keepalived.conf'.

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Configuration is using : 62873 Bytes

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: Using LinkWatch kernel netlink reflector...

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Entering BACKUP STATE

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Configuration is using : 5173 Bytes

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]

May  9 14:41:05 Darren1 Keepalived_healthcheckers[28175]: Using LinkWatch kernel netlink reflector...

May  9 14:41:05 Darren1 Keepalived_vrrp[28176]: VRRP_Script(vs_mysql_23) succeeded

May  9 14:41:21 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Transition to MASTER STATE

May  9 14:41:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Entering MASTER STATE

May  9 14:41:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) setting protocol VIPs.

May  9 14:41:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Sending gratuitous ARPs on eth0 for 192.168.91.100

May  9 14:41:26 Darren1 Keepalived_healthcheckers[28175]: Netlink reflector reports IP 192.168.91.100 added

May  9 14:41:31 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Sending gratuitous ARPs on eth0 for 192.168.91.100


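Summary of the startup process:

(1) keepalived starts three processes: the parent process, the health-check child process, and the VRRP child process;

(2) once startup finishes, the VRRP_Instance enters the BACKUP state;

(3) after entering BACKUP, the VRRP_Instance transitions to MASTER and then enters the MASTER state;

(4) it acquires the VIP and announces it to the other hosts with gratuitous ARP broadcasts.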
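
The gap in the log between "Entering BACKUP STATE" (14:41:05) and "Transition to MASTER STATE" (14:41:21) lines up with the VRRPv2 master-down timer from RFC 3768. The sketch below applies the RFC formula to the advert_int and priority values from the two config files; whether this keepalived build computes the timer exactly this way is an assumption.

```python
def master_down_interval(advert_int: float, priority: int) -> float:
    """VRRPv2 (RFC 3768): 3 * Advertisement_Interval + Skew_Time, in seconds."""
    skew_time = (256 - priority) / 256.0
    return 3 * advert_int + skew_time


# Values from the keepalived.conf files in this post.
wait_a = master_down_interval(advert_int=5, priority=100)
wait_b = master_down_interval(advert_int=5, priority=90)

print("A waits %.2f s before transitioning to MASTER" % wait_a)
print("B waits %.2f s before transitioning to MASTER" % wait_b)
```

For A this gives a little under 16 seconds, which fits the 16-second gap at the log's one-second resolution. The further 5 seconds before "Entering MASTER STATE" equal one advert_int. The higher-priority node has the smaller skew, so when both nodes start together it times out first.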


keepalived failover process:


Stop the MySQL service on A:

[root@Darren1 ~]# /etc/init.d/mysqld stop

Shutting down MySQL............ SUCCESS!

What happens on A:

[root@Darren1 ~]# tail -f /var/log/messages

May  9 14:43:25 Darren1 Keepalived_vrrp[28176]: VRRP_Script(vs_mysql_23) failed

May  9 14:43:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Entering FAULT STATE

May  9 14:43:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) removing protocol VIPs.

May  9 14:43:26 Darren1 Keepalived_vrrp[28176]: VRRP_Instance(VI_23) Now in FAULT state

May  9 14:43:26 Darren1 Keepalived_healthcheckers[28175]: Netlink reflector reports IP 192.168.91.100 removed


Summary for A: the VRRP_Instance enters the FAULT state and releases the VIP.


What happens on B:

[root@Darren2 ~]# tail -f /var/log/messages       

May  9 14:43:26 Darren2 Keepalived_vrrp[35138]: VRRP_Instance(VI_22) Transition to MASTER STATE

May  9 14:43:31 Darren2 Keepalived_vrrp[35138]: VRRP_Instance(VI_22) Entering MASTER STATE

May  9 14:43:31 Darren2 Keepalived_vrrp[35138]: VRRP_Instance(VI_22) setting protocol VIPs.

May  9 14:43:31 Darren2 Keepalived_vrrp[35138]: VRRP_Instance(VI_22) Sending gratuitous ARPs on eth0 for 192.168.91.100

May  9 14:43:31 Darren2 Keepalived_healthcheckers[35137]: Netlink reflector reports IP 192.168.91.100 added

May  9 14:43:36 Darren2 Keepalived_vrrp[35138]: VRRP_Instance(VI_22) Sending gratuitous ARPs on eth0 for 192.168.91.100


Summary for B: the VRRP_Instance enters the MASTER state, acquires the VIP, and announces it via gratuitous ARP.


Log in to the database through the VIP:

Create a login user:

create user 'keepalived'@'%' identified by '147258';

grant all on *.* to 'keepalived'@'%';

The VIP is now on B:

[root@Darren2 ~]# ip addr |grep 192

    inet 192.168.91.22/24 brd 192.168.91.255 scope global eth0

    inet 192.168.91.100/32 scope global eth0

Logging in from A with the 'keepalived'@'%' account succeeds, which shows that the VIP works:

[root@Darren1 ~]# mysql -ukeepalived -p147258 -h192.168.91.100

keepalived@192.168.91.100 [(none)]>select user(),current_user();

+--------------------------+----------------+

| user()                   | current_user() |

+--------------------------+----------------+

| keepalived@192.168.91.23 | keepalived@%   |

+--------------------------+----------------+


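Summary:

VIP failover scenarios:

(1) The m1 host goes down: the VIP moves to m2.

(2) MySQL on m1 goes down: the VIP moves to m2.

(3) The keepalived service on m1 goes down, which splits into two cases:

/etc/init.d/keepalived stop: normal failover.

kill -9 keepalived_pid: keepalived exits without cleaning up, so both m1 and m2 hold the VIP, but only one of them actually receives connections.

(4) Split brain: m1 and m2 each believe they are in the MASTER state and fight over the VIP, so the VIP flips back and forth between m1 and m2.

Split brain does not occur when both nodes are on the same switch; it shows up in more complex network environments.

It can be guarded against in the script: ping the gateway, and if even the gateway is unreachable, have the vrrp_script script return 1 so that keepalived enters the FAULT state.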
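
The gateway guard described above could look roughly like this inside the check script. It is a sketch under assumptions: Python 3, the Linux iputils ping with its -c and -W options, and a gateway address supplied by the caller (the post does not name one). Both function names are made up for illustration.

```python
import subprocess


def gateway_reachable(gateway_ip: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the system ping; True if it is answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), gateway_ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_exit_code(mysql_alive: bool, gateway_ok: bool) -> int:
    """Exit code for vrrp_script: 0 keeps the VIP, 1 sends keepalived to FAULT."""
    return 0 if (mysql_alive and gateway_ok) else 1


# A node with a healthy MySQL that has lost its gateway gives up the VIP.
print(check_exit_code(mysql_alive=True, gateway_ok=False))
```

In the real script, gateway_ok would come from gateway_reachable(...) and mysql_alive from the existing MySQL check. The point of combining them is that a node cut off from the network stops claiming the VIP even though its local MySQL is fine.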


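m1 failed and m2 took over the VIP; what should be done once m1 is repaired?

(1) With GTID replication, simply run change master on m1 to point it at m2; with traditional replication, you first need to find the binlog position;

(2) wait for m1 to finish catching up;

(3) start keepalived.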
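
Step (2) can be turned into a simple gate before keepalived is started on m1 again. The sketch below assumes the caller has already fetched one row of SHOW SLAVE STATUS from m1 as a dict, with Seconds_Behind_Master as an integer or None; the function name is made up for illustration.

```python
def replica_caught_up(status: dict) -> bool:
    """True when both replication threads run and no lag is reported."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and status.get("Seconds_Behind_Master") == 0
    )


# Hand-written sample rows, not real output from the servers in this post.
still_syncing = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 42}
in_sync = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 0}
broken = {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": None}

for row in (still_syncing, in_sync, broken):
    print("start keepalived" if replica_caught_up(row) else "keep waiting")
```

Seconds_Behind_Master is only a rough lag indicator, so a stricter check could also compare the GTID sets on both servers before m1 is allowed to hold the VIP again.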


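Drawbacks of keepalived + MySQL dual-master and how to deal with them:

(1) Data consistency is hard to guarantee:

Enhanced semi-synchronous replication can be used, with rpl_semi_sync_master_timeout (how long the master waits for a slave's acknowledgement) set to a larger value.

If some of the master's binlog did not reach the slave in time, the missing part has to be extracted from the binlog by hand and applied to the slave; if the master's system no longer exists, the logs can be recovered from a binlog server.

(2) The failed master has to be added back into the original topology manually.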