Automatic master/slave failover with keepalived + redis
Environment: CentOS 5.8 64-bit
keepalived: keepalived-1.1.15
redis: redis-2.8.19
Version history
Date | Version | Description | Author |
2015-1-9 | 1.1 | keepalived + redis automatic master/slave failover | csc |
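Design rationale:
When keepalived and redis run together, there are four possible situations:
1 keepalived is down and redis is down as well. The VIP simply moves to the other node and no redis data synchronization is needed: since redis is down, there is nothing on the master to synchronize from. Any data already written to the master but not yet replicated to the slave is lost.
2 keepalived is down but redis is still running. After the VIP moves, the redis master/slave relationship is still the old one. If nothing changes, writes land on the redis slave and are never replicated to the master, so a monitoring script has to reverse the redis master/slave roles. Some time must be reserved for data synchronization before the roles are reversed.
3 keepalived is running but redis is down. The monitoring script detects that redis is down and lowers the priority of the keepalived master, which also makes the VIP move. This is handled like case 2: synchronize the data, then reverse the current redis master/slave roles.
4 The last case: keepalived and redis are both running. All is well and nothing needs to be done.
The setup in this article covers all four cases. Case 1 needs no data synchronization; the script still attempts it by default, but the attempt simply fails. The scripts mainly exist to handle cases 2 and 3.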
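The four cases above can be summarized as a small decision table. The sketch below is only an illustration: the function name failover_action and the action labels it prints are hypothetical and do not appear in the scripts used later.

```shell
#!/bin/bash
# failover_action KEEPALIVED_STATE REDIS_STATE
# Both arguments are "up" or "down" and describe the current master node.
# Prints a (hypothetical) label for the action described in the four cases.
failover_action() {
    local keepalived_state="$1" redis_state="$2"
    case "${keepalived_state}/${redis_state}" in
        down/down) echo "vip-moves,no-sync-possible" ;;                       # case 1
        down/up)   echo "vip-moves,sync-then-swap-roles" ;;                   # case 2
        up/down)   echo "priority-lowered,vip-moves,sync-then-swap-roles" ;;  # case 3
        up/up)     echo "no-action" ;;                                        # case 4
        *)         echo "unknown"; return 1 ;;
    esac
}

failover_action down up
```

Cases 2 and 3 end with the same action, which is why the same notify scripts can serve both.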
Installation:
Install keepalived on both the primary and the backup node
yum -y install ipvsadm (probably optional)
lsmod | grep ip_vs
modprobe ip_vs
tar -xvzf keepalived-1.1.15.tar.gz
cd keepalived-1.1.15
./configure
make
make install
mkdir /etc/keepalived
cp /usr/local/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/sbin/keepalived /usr/sbin/
/etc/init.d/keepalived start
ps -ef | grep keepalived
Install redis on both the primary and the backup node
and configure master/slave replication.
For details, see the redis master/slave configuration guide.
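The replication setup itself is not shown in this article. As a rough sketch, the backup node can be attached to the primary at runtime with SLAVEOF and checked with INFO replication. The helper name attach_to_master is hypothetical, and the addresses in the example comment assume the primary is 10.8.10.115 and the backup is 10.8.10.128, as in the notify scripts later on.

```shell
#!/bin/bash
# Path to redis-cli as used elsewhere in this article; override if needed.
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# attach_to_master MASTER_IP [MASTER_PORT]
# Make the local redis a slave of the given master, then print the
# replication section so the new role can be checked by eye.
attach_to_master() {
    local master_ip="$1" master_port="${2:-6379}"
    "$REDISCLI" SLAVEOF "$master_ip" "$master_port" || return 1
    "$REDISCLI" INFO replication
}

# Example, run on the backup node:
# attach_to_master 10.8.10.115 6379
```

SLAVEOF issued this way is not persisted; to keep the relationship across restarts, the equivalent `slaveof <masterip> <masterport>` directive would go into redis.conf on the slave.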
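Primary node configuration:
# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh"
    interval 1    # check interval in seconds
    # weight is required: when the script returns 1 the priority is adjusted by
    # this weight, which triggers the failover; when it returns 0 the priority
    # is left unchanged. Many online examples omit this line, so failover never
    # happens. A weight of 20 means priority +20; -20 means priority -20.
    weight -20
}
# Once the vrrp_script block is defined, it can be used in the instance.
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    nopreempt
    priority 100
    # VRRP advertisement interval in seconds. Three missed advertisements are
    # allowed by default, 1 s x 3 = 3 s in total, so the switchover scripts start
    # about 3 seconds later. This gives the redis_slave.sh / redis_master.sh
    # scripts enough time to finish.
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.8.10.130
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_slave.sh
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}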
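The effect of weight -20 on the election can be worked through with the priorities used in this article (100 on the primary, 90 on the backup). The function below is a sketch of the arithmetic as keepalived documents it, not keepalived code; the name effective_priority is made up for the example.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_EXIT
# Negative weight: priority drops by |WEIGHT| while the tracked script fails
# (non-zero exit). Positive weight: priority rises by WEIGHT while the script
# succeeds (exit 0). Otherwise the base priority is used unchanged.
effective_priority() {
    local base="$1" weight="$2" script_exit="$3"
    if [ "$weight" -lt 0 ] && [ "$script_exit" -ne 0 ]; then
        echo $((base + weight))
    elif [ "$weight" -gt 0 ] && [ "$script_exit" -eq 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 100 -20 0   # redis check passes on the primary
effective_priority 100 -20 1   # redis check fails on the primary
```

With a failing check the primary ends up at 80, below the backup's 90, so the VIP moves. A weight whose absolute value is smaller than the gap between the two priorities (10 here) would not be enough to trigger a switch.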
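notify_stop: script run just before keepalived stops.
notify_master: script run when keepalived switches to the master state.
notify_backup: script run when keepalived switches to the backup state.
notify_fault: script run when keepalived enters the fault state.
The scripts directory holds five scripts: redis_check.sh checks the redis status, and the other four run when the keepalived state changes. keepalived has four states: master, backup, stop and fault. Because what matters is the service running on the system, a node that enters the fault or stop state is treated as having entered the backup state, and the redis master/slave roles must be reversed. Otherwise the VIP moves but the redis roles do not, which leads to inconsistent data. In the end the four scripts therefore have only two kinds of content.
One more point to watch: when the master goes down and the backup takes over, the old master must not become master again once it comes back. If it took over again after recovering, the service would flap back and forth. The nopreempt option prevents this.
nopreempt: disables preemption. It only takes effect on a node whose state is BACKUP, and that node's priority must be higher than the other node's. The primary configuration above declares state MASTER, so it would need state BACKUP for nopreempt to apply.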
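The redis_check.sh script referenced by vrrp_script is not listed in this article. A minimal sketch of what such a check could look like follows; it is an assumption, not the author's script. It relies only on `redis-cli PING` answering PONG and on keepalived reading the exit status of the script.

```shell
#!/bin/bash
# Sketch of /etc/keepalived/scripts/redis_check.sh (assumed, not the original).
# Exit status 0: redis answered PING. Non-zero: keepalived applies the
# negative weight configured in vrrp_script.
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

check_redis() {
    local reply
    reply=$("$REDISCLI" PING 2>/dev/null)
    [ "$reply" = "PONG" ]
}

# The status of this last command becomes the exit status of the script.
check_redis
```

Since the check runs every second (interval 1), it should stay cheap; a redis that accepts the connection but never answers would block this sketch, so a real check may also want a timeout.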
Script for the master state:
[root@local-115 keepalived]# cat /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.128 6379 >> $LOGFILE 2>&1
# Wait 10 seconds for the data to finish syncing before leaving replication
sleep 10
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
# SLAVEOF NO ONE makes this redis stop replicating and act as master
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
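The fixed 10-second sleep in the master script is a guess at how long the sync takes. One possible refinement, sketched here under the assumption that `INFO replication` on the slave reports a master_link_status field, is to poll until the link is up and fall back to a timeout. The helper name wait_for_sync is hypothetical.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# wait_for_sync TIMEOUT_SECONDS
# Poll INFO replication once per second until master_link_status is "up".
# Returns 0 when the link is up, 1 when the timeout expires first.
wait_for_sync() {
    local timeout="$1" waited=0 status
    while [ "$waited" -lt "$timeout" ]; do
        status=$("$REDISCLI" INFO replication 2>/dev/null |
                 tr -d '\r' | sed -n 's/^master_link_status://p')
        [ "$status" = "up" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Possible use in redis_master.sh instead of "sleep 10":
# $REDISCLI SLAVEOF 10.8.10.128 6379
# wait_for_sync 10
# $REDISCLI SLAVEOF NO ONE
```

When the old master's redis is down (case 1 in the design section), the link never comes up and the function simply returns after the timeout, which matches the behaviour of the plain sleep.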
Script for the slave state:
[root@local-115 keepalived]# cat /etc/keepalived/scripts/redis_slave.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
# Wait 15 seconds so the peer can finish syncing from this node before the roles are switched
sleep 15
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.128 6379 >> $LOGFILE 2>&1
Script for the fault state:
[root@local-115 keepalived]# cat /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
The backup node uses the same script.
Script for the stop state:
[root@local-115 keepalived]# cat /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
The backup node uses the same script.
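Backup node configuration:
# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh"
    interval 1    # check interval in seconds
    # weight is required: when the script returns 1 the priority is adjusted by
    # this weight, which triggers the failover; when it returns 0 the priority
    # is left unchanged. Many online examples omit this line, so failover never
    # happens. A weight of 20 means priority +20; -20 means priority -20.
    weight -20
}
# Once the vrrp_script block is defined, it can be used in the instance.
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    # VRRP advertisement interval in seconds. Three missed advertisements are
    # allowed by default, 1 s x 3 = 3 s in total, so the switchover scripts start
    # about 3 seconds later. This gives the redis_slave.sh / redis_master.sh
    # scripts enough time to finish.
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.8.10.130
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_slave.sh
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}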
Script for the master state:
[root@P-client01 scripts]# cat /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.115 6379 >> $LOGFILE 2>&1
# Wait 10 seconds for the data to finish syncing before leaving replication
sleep 10
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
# SLAVEOF NO ONE makes this redis stop replicating and act as master
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
Script for the slave state:
[root@P-client01 scripts]# cat /etc/keepalived/scripts/redis_slave.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
# Wait 15 seconds so the peer can finish syncing from this node before the roles are switched
sleep 15
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.115 6379 >> $LOGFILE 2>&1
Script for the fault state:
[root@P-client01 scripts]# cat /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
Same as on the primary node.
Script for the stop state:
[root@P-client01 scripts]# cat /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
Same as on the primary node.
Test procedure:
In the transcript below, 192.168.1.235 is the master, 192.168.1.236 is the slave and 192.168.1.237 is the VIP.
1. Start Redis on the master
[root@redis bin]# pwd
/usr/local/bin
[root@redis bin]# ./redis-server redis.conf
2. Start Redis on the slave
[root@redis bin]# pwd
/usr/local/bin
[root@redis bin]# ./redis-server redis.conf
3. Start Keepalived on the master
/etc/init.d/keepalived start
4. Start Keepalived on the slave
/etc/init.d/keepalived start
5. Try connecting to Redis through the VIP:
[root@redis bin]# pwd
/usr/local/bin
[root@redis bin]# ./redis-cli -h 192.168.1.237 info
role:master
slave0:192.168.1.236,6379,online
The connection succeeds and the slave is attached as well.
6. Try inserting some data:
[root@redis bin]# ./redis-cli -h 192.168.1.237 SET Hello Redis
Read the data through the VIP
[root@redis bin]# ./redis-cli -h 192.168.1.237 GET Hello
"Redis"
Read the data from the master
[root@redis bin]# ./redis-cli -h 192.168.1.235 GET Hello
"Redis"
Read the data from the slave
[root@redis-slave bin]# ./redis-cli -h 192.168.1.236 GET Hello
"Redis"
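Besides connecting through the VIP, it can help to confirm which node currently holds it. The sketch below assumes the iproute `ip` command is available and that the VIP shows up as an ordinary inet address on the interface; holds_vip is a hypothetical helper name, and the VIP is passed as a parameter.

```shell
#!/bin/bash
# holds_vip VIP
# Reads `ip -o addr show` output on stdin and succeeds if VIP is configured.
# The trailing "/" in the pattern avoids matching a longer address that
# merely starts with the same digits.
holds_vip() {
    local vip="$1"
    grep -qF "inet ${vip}/"
}

# Example, run on either node:
# ip -o addr show dev eth0 | holds_vip 10.8.10.130 && echo "this node holds the VIP"
```

Only one of the two nodes should report the VIP at any time; if both do, the nodes are most likely not seeing each other's VRRP advertisements.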
Simulating a failure:
Stop the Redis process on the master:
[root@redis bin]# ./redis-cli shutdown
Check the Keepalived log on the master
[root@redis scripts]# tail /var/log/keepalived-redis-state.log
[fault]
Thu Sep 27 08:29:01 CST 2012
At the same time, the log on the slave shows:
[root@redis-slave scripts]# tail /var/log/keepalived-redis-state.log
[master]
Thu Nov 15 12:06:04 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK
The slave has now taken over the service and acts as master.
./redis-cli -h 192.168.1.237 info
./redis-cli -h 192.168.1.236 info
role:master
Next, restore the Redis process on the original master; it comes back as a slave. Then stop redis on 236 so that 235 regains the master role, and start redis on 236 again. 235 is the master and 236 is the backup once more, so the automatic switchover works.
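Reading the role out of the full `info` output by eye gets tedious when switching back and forth. A small filter, sketched here with the hypothetical name redis_role, extracts just the role line so the state of each node can be compared after every step.

```shell
#!/bin/bash
# redis_role
# Reads `redis-cli info` output on stdin and prints the value of the
# role field (master or slave). redis-cli output uses CRLF line endings,
# so carriage returns are stripped first.
redis_role() {
    tr -d '\r' | sed -n 's/^role://p'
}

# Example, using the addresses from the test transcript:
# for host in 192.168.1.235 192.168.1.236 192.168.1.237; do
#     echo "$host: $(./redis-cli -h "$host" info | redis_role)"
# done
```

After a successful switchover, the VIP and exactly one of the two real addresses should report master.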