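HA stands for High Availability, also known as dual-machine hot standby, and is used for critical services. Put simply, there are two machines, A and B. Normally A provides the service while B sits idle on standby; when A goes down or its service fails, B takes over and keeps the service running. Common open-source tools for HA are heartbeat and keepalived; keepalived also provides load balancing.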


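Master/slave: when the master goes down, the slave takes over. By default the slave does no work.

heartbeat is the software used here to implement HA.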

Disable the firewall and put SELinux into permissive mode:

iptables -F

setenforce 0

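Two machines: one master and one slave.

1. Set the hostnames and disable the firewall

Master IP: 192.168.0.107. Set its hostname with: hostname master

Then run bash so the prompt picks up the new name.

Slave IP: 192.168.0.108. Set its hostname with: hostname slave

Then run bash again.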


2. Edit /etc/hosts

On both the master and the slave, add these two lines to /etc/hosts:

192.168.0.107 master
192.168.0.108 slave
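
Since the same two entries have to be added on both nodes, a small idempotent helper can save some typing. This is only a sketch: the function name add_host_entry is made up for illustration, and it appends an entry only when no matching line exists yet.

```shell
add_host_entry() {
  # $1: hosts file, $2: IP address, $3: hostname (hypothetical helper)
  # Escape the dots so the IP is matched literally by grep -E.
  ip_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  grep -qE "^[[:space:]]*${ip_re}[[:space:]]+$3([[:space:]]|\$)" "$1" 2>/dev/null \
    || printf '%s %s\n' "$2" "$3" >> "$1"
}

# Usage on a real node (run as root):
#   add_host_entry /etc/hosts 192.168.0.107 master
#   add_host_entry /etc/hosts 192.168.0.108 slave
```

Running it twice leaves the file unchanged the second time, so it is safe to repeat on either machine.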



2.1 Install the EPEL repository

First install the EPEL repository. The packages below are listed per release; pick the one that matches your system.

Download the matching rpm package, then install it.
CentOS 5
32-bit EPEL package: www.lishiming.net/data/p_w_upload/forum/epel-release-5-4_32.noarch.rpm
64-bit EPEL package: www.lishiming.net/data/p_w_upload/forum/epel-release-5-4_64.noarch.rpm
CentOS 6
32-bit EPEL package: www.lishiming.net/data/p_w_upload/forum/epel-release-6-8_32.noarch.rpm
64-bit EPEL package: www.lishiming.net/data/p_w_upload/forum/epel-release-6-8_64.noarch.rpm
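
Picking the right package by hand is easy to get wrong, so here is a rough sketch that maps a CentOS major version and the output of uname -m to one of the four addresses listed above. The function name epel_rpm_url is hypothetical, and whether that host still serves the files is not something this post can confirm.

```shell
epel_rpm_url() {
  # $1: CentOS major version (5 or 6), $2: architecture as printed by `uname -m`
  case "$1" in
    5) rel=5-4 ;;
    6) rel=6-8 ;;
    *) echo "unsupported release: $1" >&2; return 1 ;;
  esac
  case "$2" in
    x86_64) bits=64 ;;
    i?86)   bits=32 ;;
    *) echo "unsupported arch: $2" >&2; return 1 ;;
  esac
  echo "www.lishiming.net/data/p_w_upload/forum/epel-release-${rel}_${bits}.noarch.rpm"
}

# Usage on CentOS 6 (assumes the address is reachable over http):
#   rpm -ivh "http://$(epel_rpm_url 6 "$(uname -m)")"
```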

3. Install the heartbeat open-source software

Install it on both the master and the slave:

yum install -y heartbeat

yum install -y libnet

4. Copy the configuration files (3 files)

[root@master ~]# cd /usr/share/doc/heartbeat-3.0.4/
[root@master heartbeat-3.0.4]# ls
apphbd.cf  AUTHORS    COPYING       ha.cf        README
authkeys   ChangeLog  COPYING.LGPL  haresources
[root@master heartbeat-3.0.4]# cp authkeys ha.cf haresources  /etc/ha.d/

[root@master heartbeat-3.0.4]# cd /etc/ha.d/
[root@master ha.d]# vim authkeys 

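authkeys is the authentication file that the master and the slave use to verify each other.

It offers three authentication methods: 1 is the simplest, 2 is the most complex, and 3 sits in between. We choose 3.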
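
For reference, here is a sketch of what choosing method 3 can look like. It assumes that method 3 is md5, as in the stock sample file shipped with heartbeat 3.0.4, and it generates a random key instead of reusing the sample one. The helper name write_authkeys is made up for illustration.

```shell
write_authkeys() {
  # $1: destination path (on a real node: /etc/ha.d/authkeys)
  # 16 random bytes rendered as 32 hex characters serve as the shared key.
  key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
  # "auth 3" selects entry 3; the next line defines entry 3 as md5 plus the key.
  ( umask 077; printf 'auth 3\n3 md5 %s\n' "$key" > "$1" )
  chmod 600 "$1"
}

# Usage (as root, on the master only):
#   write_authkeys /etc/ha.d/authkeys
```

Both nodes must end up with the identical file, which is why it should be generated once and then copied to the other node rather than generated separately on each.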

5. Set the file permissions

[root@master ha.d]# pwd
/etc/ha.d

[root@master ha.d]# chmod 600 authkeys    # otherwise heartbeat will not start; only the owner (root) may read or write this file

6. Edit the resource file (haresources)

[root@master ha.d]# vim haresources

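192.168.0.110 is the floating IP address that the master and the slave expose to clients (also called the VIP or virtual IP). Any unused IP in the same subnet as the two nodes will do.

nginx is the resource. The resource name is the xxx in "service xxx start", which is nginx here; for MySQL it would be mysqld.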
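
The heartbeat log further down reports the resource group as "master 192.168.0.110/24/eth0:0 nginx", so that is presumably the line that was added to haresources. The sketch below splits such a line into its parts to show what each field means; describe_haresources is a hypothetical helper, not part of heartbeat.

```shell
describe_haresources() {
  # $1: one haresources line of the form
  #     <preferred node> <VIP>/<prefix length>/<interface label> <init script>
  read -r node ipspec service <<EOF
$1
EOF
  echo "node=$node vip=${ipspec%%/*} label=${ipspec##*/} service=$service"
}

describe_haresources 'master 192.168.0.110/24/eth0:0 nginx'
```

The first field names the node that normally owns the resources, and the last field is the init script heartbeat starts and stops.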

7. Edit ha.cf

On the master:

[root@master ha.d]# >ha.cf
[root@master ha.d]# vim !$
vim ha.cf

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
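keepalive 2   # send a heartbeat every 2 seconds
deadtime 30 # declare the peer dead after 30 seconds without a heartbeat
warntime 10 # log a warning after 10 seconds without a heartbeat
initdead 60
udpport 694 # UDP port used for heartbeat traffic
ucast eth0 192.168.0.108 # the peer's IP, i.e. the slave's IP
auto_failback on  # when the master comes back after a failover, the slave releases the resources and they move back to the master
node    master
node    slave
ping 192.168.0.1
respawn hacluster /usr/lib/heartbeat/ipfail  # monitors network connectivity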
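
The timing values only make sense in a certain order, so a quick sanity check can catch typos before heartbeat is started. The sketch below is a rule-of-thumb check, not something heartbeat itself requires in this exact form: keepalive below warntime, warntime below deadtime, and initdead at least twice deadtime. The function name check_hacf is hypothetical.

```shell
check_hacf() {
  # $1: path to ha.cf; prints a message and returns 1 on a suspicious setting
  awk '
    { sub(/#.*/, "") }                 # strip trailing comments
    $1 == "keepalive" { k = $2 }
    $1 == "warntime"  { w = $2 }
    $1 == "deadtime"  { d = $2 }
    $1 == "initdead"  { i = $2 }
    END {
      ok = 1
      if (!(k + 0 < w + 0))      { print "keepalive should be below warntime"; ok = 0 }
      if (!(w + 0 < d + 0))      { print "warntime should be below deadtime"; ok = 0 }
      if (!(i + 0 >= 2 * d))     { print "initdead should be at least 2x deadtime"; ok = 0 }
      exit (ok ? 0 : 1)
    }' "$1"
}

# Usage:
#   check_hacf /etc/ha.d/ha.cf && echo "timings look consistent"
```

With the values above (2, 10, 30, 60) all three checks pass.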


[root@master ha.d]# scp authkeys haresources ha.cf slave:/etc/ha.d
The authenticity of host 'slave (192.168.0.108)' can't be established.
RSA key fingerprint is 6b:a1:11:e9:1b:9d:69:dc:e4:a2:0c:5f:83:7a:78:70.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave,192.168.0.108' (RSA) to the list of known hosts.
root@slave's password: 
authkeys                                                                         100%  643     0.6KB/s   00:00    
haresources                                                                      100% 5945     5.8KB/s   00:00    
ha.cf

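This copies the three files from the master to the slave.

Then, on the slave, edit ha.cf so that the ucast line points at the master's IP (192.168.0.107) instead of the slave's own.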
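
The only line that has to differ between the two copies of ha.cf is ucast, which must name the other node. A minimal sketch of that edit with sed follows; set_ucast_peer is a made-up helper name, and note that it rewrites the whole ucast line, so any trailing comment on it is dropped.

```shell
set_ucast_peer() {
  # $1: path to ha.cf, $2: interface, $3: IP of the other node
  sed -i "s/^ucast[[:space:]].*/ucast $2 $3/" "$1"
}

# Usage on the slave (the peer is the master):
#   set_ucast_peer /etc/ha.d/ha.cf eth0 192.168.0.107
```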

8. Start heartbeat

Start the master first, then the slave:
service heartbeat start


Master:

[root@master ha.d]# /etc/init.d/heartbeat  start
Starting High-Availability services: INFO:  Resource is stopped
Done.

[root@master ha.d]# pa aux|grep niginx
-bash: pa: command not found
[root@master ha.d]# ps aux|grep niginx
root      5055  0.0  0.0   4416   772 pts/2    S+   04:31   0:00 grep niginx
[root@master ha.d]# ps aux|grep niginx
root      5057  0.0  0.0   4416   776 pts/2    S+   04:31   0:00 grep niginx
[root@master ha.d]# ps aux|grep niginx
root      5059  0.0  0.0   4416   776 pts/2    S+   04:31   0:00 grep niginx
[root@master ha.d]# 

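Nothing shows up yet. On first start heartbeat waits a while before acquiring resources (initdead is 60 seconds here), and the commands above also misspell nginx as "niginx", so they would match nothing either way. Check again a little later.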

[root@master ha.d]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.107  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe3e:eccf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:381719 errors:1 dropped:0 overruns:0 frame:0
          TX packets:23992 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:66229125 (63.1 MiB)  TX bytes:1957012 (1.8 MiB)
          Interrupt:19 Base address:0x2000 

eth0:0    Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF   # the VIP (the floating, public-facing IP) is now up
          inet addr:192.168.0.110  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

eth0:1    Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.109  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1320 (1.2 KiB)  TX bytes:1320 (1.2 KiB)

[root@master ha.d]# ps aux|grep nginx
root      5049  0.0  0.1  15288  1468 ?        Ss   04:31   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx     5051  0.0  0.1  15444  1920 ?        S    04:31   0:00 nginx: worker process                   
root     12443  0.0  0.0   4420   756 pts/1    S+   04:36   0:00 grep nginx
nginx is now running as well.

Slave:

[root@slave ha.d]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO:  Resource is stopped
Done.

[root@slave ha.d]# ps aux|grep nginx
root     11598  0.0  0.0   4420   756 pts/1    S+   04:38   0:00 grep nginx
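nginx is not running on the slave, which is correct: the slave should start nginx only after the master goes down. If it were running now, the HA setup would not be working.

Now change the default nginx page on the master so that the two nodes can be told apart.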


[root@master ha.d]# cd /usr/share/nginx/html/
[root@master html]# ls
404.html  50x.html  index.html  nginx-logo.png  poweredby.png
[root@master html]# cp index.html index.html.bak
[root@master html]# echo "11111master">index.html


Write a marker into the default nginx page on the slave as well.

[root@slave ha.d]# cd /usr/share/nginx/html/
[root@slave html]# ls
404.html  50x.html  index.html  nginx-logo.png  poweredby.png
[root@slave html]# cp index.html index.html.bak
[root@slave html]# echo "2222222slave">index.html

nginx is still not running on the slave.
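
With a different marker on each node, the failover tests that follow can also be watched from a third machine instead of reloading a browser. This is a rough sketch assuming curl is installed on that machine; which_node and watch_vip are hypothetical helper names, and the matching relies only on the "master" and "slave" substrings written into the two index pages above.

```shell
which_node() {
  # Classify a page body by the marker strings written into index.html.
  case "$1" in
    *master*) echo master ;;
    *slave*)  echo slave ;;
    *)        echo unknown ;;
  esac
}

watch_vip() {
  # Poll the VIP once per second; -m 2 caps each request at 2 seconds.
  while :; do
    body=$(curl -s -m 2 "http://${1:-192.168.0.110}/") || body=""
    printf '%s %s\n' "$(date +%T)" "$(which_node "$body")"
    sleep 1
  done
}

# Usage: watch_vip 192.168.0.110    (stop with Ctrl-C)
```

During a failover the output should switch from master to slave, possibly with a few unknown lines in between while the VIP moves.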

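Test 1
Deliberately block ping on the master. With inbound ICMP dropped, the master no longer sees replies from the ping node 192.168.0.1, so ipfail concludes that its network is down and hands the resources to the slave.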

[root@master html]# iptables -I INPUT -p icmp -j DROP

Check the log:

[root@master html]# cat /var/log/ha-log
Nov 15 04:30:10 master heartbeat: [4545]: info: Pacemaker support: false
Nov 15 04:30:10 master heartbeat: [4545]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Nov 15 04:30:10 master heartbeat: [4545]: info: **************************
Nov 15 04:30:10 master heartbeat: [4545]: info: Configuration validated. Starting heartbeat 3.0.4
Nov 15 04:30:10 master heartbeat: [4546]: info: heartbeat: version 3.0.4
Nov 15 04:30:10 master heartbeat: [4546]: WARN: No Previous generation - starting at 1447590611
Nov 15 04:30:10 master heartbeat: [4546]: info: Heartbeat generation: 1447590611
Nov 15 04:30:10 master heartbeat: [4546]: info: No uuid found for current node - generating a new uuid.
Nov 15 04:30:10 master heartbeat: [4546]: info: Creating FIFO /var/lib/heartbeat/fifo.
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: bound send socket to device: eth0
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: set SO_REUSEPORT(w)
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: bound receive socket to device: eth0
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: set SO_REUSEPORT(w)
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ucast: started on port 694 interface eth0 to 192.168.0.108
Nov 15 04:30:10 master heartbeat: [4546]: info: glib: ping heartbeat started.
Nov 15 04:30:10 master heartbeat: [4546]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 15 04:30:10 master heartbeat: [4546]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 15 04:30:10 master heartbeat: [4546]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Nov 15 04:30:10 master heartbeat: [4546]: info: Local status now set to: 'up'
Nov 15 04:30:10 master heartbeat: [4546]: info: Link 192.168.0.1:192.168.0.1 up.
Nov 15 04:30:10 master heartbeat: [4546]: info: Status update for node 192.168.0.1: status ping
Nov 15 04:31:11 master heartbeat: [4546]: WARN: node slave: is dead
Nov 15 04:31:11 master heartbeat: [4546]: info: Comm_now_up(): updating status to active
Nov 15 04:31:11 master heartbeat: [4546]: info: Local status now set to: 'active'
Nov 15 04:31:11 master heartbeat: [4546]: info: Starting child client "/usr/lib/heartbeat/ipfail" (494,491)
Nov 15 04:31:11 master heartbeat: [4546]: WARN: No STONITH device configured.
Nov 15 04:31:11 master heartbeat: [4546]: WARN: Shared disks are not protected.
Nov 15 04:31:11 master heartbeat: [4546]: info: Resources being acquired from slave.
Nov 15 04:31:11 master heartbeat: [4556]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 494  gid 491 (pid 4556)
harc(default)[4557]:     2015/11/15_04:31:11 info: Running /etc/ha.d//rc.d/status status
mach_down(default)[4591]:     2015/11/15_04:31:11 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down(default)[4591]:     2015/11/15_04:31:11 info: mach_down takeover complete for node slave.
Nov 15 04:31:11 master heartbeat: [4546]: info: mach_down takeover complete.
Nov 15 04:31:11 master heartbeat: [4546]: info: Initial resource acquisition complete (mach_down)
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[4632]:     2015/11/15_04:31:11 INFO:  Resource is stopped
Nov 15 04:31:11 master heartbeat: [4558]: info: Local Resource acquisition completed.
harc(default)[4735]:     2015/11/15_04:31:11 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
ip-request-resp(default)[4735]:     2015/11/15_04:31:11 received ip-request-resp 192.168.0.110/24/eth0:0 OK yes
ResourceManager(default)[4758]:     2015/11/15_04:31:11 info: Acquiring resource group: master 192.168.0.110/24/eth0:0 nginx
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[4786]:     2015/11/15_04:31:11 INFO:  Resource is stopped
ResourceManager(default)[4758]:     2015/11/15_04:31:11 info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.110/24/eth0:0 start
IPaddr(IPaddr_192.168.0.110)[4919]:     2015/11/15_04:31:11 INFO: Adding inet address 192.168.0.110/24 with broadcast address 192.168.0.255 to device eth0 (with label eth0:0)
IPaddr(IPaddr_192.168.0.110)[4919]:     2015/11/15_04:31:11 INFO: Bringing device eth0 up
IPaddr(IPaddr_192.168.0.110)[4919]:     2015/11/15_04:31:11 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.0.110 eth0 192.168.0.110 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[4893]:     2015/11/15_04:31:11 INFO:  Success
ResourceManager(default)[4758]:     2015/11/15_04:31:11 info: Running /etc/init.d/nginx  start
Nov 15 04:31:21 master heartbeat: [4546]: info: Local Resource acquisition completed. (none)
Nov 15 04:31:21 master heartbeat: [4546]: info: local resource transition completed.
Nov 15 04:34:06 master heartbeat: [4546]: info: Link slave:eth0 up.
Nov 15 04:34:06 master heartbeat: [4546]: info: Status update for node slave: status init
Nov 15 04:34:06 master heartbeat: [4546]: info: Status update for node slave: status up
Nov 15 04:34:06 master ipfail: [4556]: info: Link Status update: Link slave/eth0 now has status up
Nov 15 04:34:06 master ipfail: [4556]: info: Status update: Node slave now has status init
Nov 15 04:34:06 master ipfail: [4556]: info: Status update: Node slave now has status up
harc(default)[5865]:     2015/11/15_04:34:06 info: Running /etc/ha.d//rc.d/status status
harc(default)[5886]:     2015/11/15_04:34:06 info: Running /etc/ha.d//rc.d/status status
Nov 15 04:34:07 master heartbeat: [4546]: info: Status update for node slave: status active
Nov 15 04:34:07 master ipfail: [4556]: info: Status update: Node slave now has status active
harc(default)[6033]:     2015/11/15_04:34:07 info: Running /etc/ha.d//rc.d/status status
Nov 15 04:34:07 master heartbeat: [4546]: info: remote resource transition completed.
Nov 15 04:34:07 master heartbeat: [4546]: info: master wants to go standby [foreign]
Nov 15 04:34:08 master heartbeat: [4546]: info: standby: slave can take our foreign resources
Nov 15 04:34:08 master heartbeat: [6319]: info: give up foreign HA resources (standby).
Nov 15 04:34:08 master heartbeat: [6319]: info: foreign HA resource release completed (standby).
Nov 15 04:34:08 master heartbeat: [4546]: info: Local standby process completed [foreign].
Nov 15 04:34:08 master ipfail: [4556]: info: Asking other side for ping node count.
Nov 15 04:34:08 master heartbeat: [4546]: WARN: 1 lost packet(s) for [slave] [11:13]
Nov 15 04:34:08 master heartbeat: [4546]: info: remote resource transition completed.
Nov 15 04:34:08 master heartbeat: [4546]: info: No pkts missing from slave!
Nov 15 04:34:08 master heartbeat: [4546]: info: Other node completed standby takeover of foreign resources.
Nov 15 04:34:19 master ipfail: [4556]: info: No giveup timer to abort.
Nov 15 04:34:24 master heartbeat: [4546]: info: slave wants to go standby [foreign]
Nov 15 04:34:24 master heartbeat: [4546]: info: standby: acquire [foreign] resources from slave
Nov 15 04:34:24 master heartbeat: [8384]: info: acquire local HA resources (standby).
ResourceManager(default)[8397]:     2015/11/15_04:34:24 info: Acquiring resource group: master 192.168.0.110/24/eth0:0 nginx
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[8425]:     2015/11/15_04:34:24 INFO:  Running OK
Nov 15 04:34:24 master heartbeat: [8384]: info: local HA resource acquisition completed (standby).
Nov 15 04:34:24 master heartbeat: [4546]: info: Standby resource acquisition done [foreign].
Nov 15 04:34:25 master heartbeat: [4546]: info: remote resource transition completed.
Nov 15 05:03:46 master heartbeat: [4546]: WARN: node 192.168.0.1: is dead
Nov 15 05:03:46 master heartbeat: [4546]: info: Link 192.168.0.1:192.168.0.1 dead.
Nov 15 05:03:46 master ipfail: [4556]: info: Status update: Node 192.168.0.1 now has status dead
harc(default)[12490]:     2015/11/15_05:03:46 info: Running /etc/ha.d//rc.d/status status
Nov 15 05:03:47 master ipfail: [4556]: info: NS: We are dead. :<
Nov 15 05:03:47 master ipfail: [4556]: info: Link Status update: Link 192.168.0.1/192.168.0.1 now has status dead
Nov 15 05:03:48 master ipfail: [4556]: info: We are dead. :<
Nov 15 05:03:48 master ipfail: [4556]: info: Asking other side for ping node count.
Nov 15 05:03:50 master ipfail: [4556]: info: Giving up because we were told that we have less ping nodes.
Nov 15 05:03:50 master ipfail: [4556]: info: Delayed giveup in 4 seconds.
Nov 15 05:03:54 master ipfail: [4556]: info: giveup() called (timeout worked)
Nov 15 05:03:55 master heartbeat: [4546]: info: master wants to go standby [all]
Nov 15 05:03:55 master heartbeat: [4546]: info: standby: slave can take our all resources
Nov 15 05:03:55 master heartbeat: [12516]: info: give up all HA resources (standby).
ResourceManager(default)[12529]:     2015/11/15_05:03:55 info: Releasing resource group: master 192.168.0.110/24/eth0:0 nginx
ResourceManager(default)[12529]:     2015/11/15_05:03:55 info: Running /etc/init.d/nginx  stop
ResourceManager(default)[12529]:     2015/11/15_05:03:55 info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.110/24/eth0:0 stop
IPaddr(IPaddr_192.168.0.110)[12619]:     2015/11/15_05:03:55 INFO: IP status = ok, IP_CIP=
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[12593]:     2015/11/15_05:03:55 INFO:  Success
Nov 15 05:03:55 master heartbeat: [12516]: info: all HA resource release completed (standby).
Nov 15 05:03:55 master heartbeat: [4546]: info: Local standby process completed [all].
Nov 15 05:03:57 master heartbeat: [4546]: WARN: 1 lost packet(s) for [slave] [916:918]
Nov 15 05:03:57 master heartbeat: [4546]: info: remote resource transition completed.
Nov 15 05:03:57 master heartbeat: [4546]: info: No pkts missing from slave!
Nov 15 05:03:57 master heartbeat: [4546]: info: Other node completed standby takeover of all resources.
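
The full log is long, and only a handful of lines describe the actual failover. As a sketch, the filter below keeps the lines about dead nodes, standby requests, and resource groups being acquired or released; the patterns are taken from the log text shown above, and ha_events is a made-up helper name.

```shell
ha_events() {
  # $1: path to a heartbeat log such as /var/log/ha-log
  grep -E 'is dead|wants to go standby|(Acquiring|Releasing) resource group' "$1"
}

# Usage:
#   ha_events /var/log/ha-log
```

On the master log above this would surface, among others, the "node 192.168.0.1: is dead" warning at 05:03:46 and the "Releasing resource group" line at 05:03:55.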

By now nginx has been started on the slave:

[root@slave html]# ps aux|grep nginx
root     11932  0.0  0.1  15300  1456 ?        Ss   05:03   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    11934  0.0  0.1  15456  1908 ?        S    05:03   0:00 nginx: worker process                   
root     11938  0.0  0.0   4420   752 pts/1    S+   05:06   0:00 grep nginx

Check the HA log on the slave:

[root@slave html]# cat /var/log/ha-log

Nov 15 04:34:06 slave heartbeat: [11382]: info: Pacemaker support: false
Nov 15 04:34:06 slave heartbeat: [11382]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Nov 15 04:34:06 slave heartbeat: [11382]: info: **************************
Nov 15 04:34:06 slave heartbeat: [11382]: info: Configuration validated. Starting heartbeat 3.0.4
Nov 15 04:34:06 slave heartbeat: [11383]: info: heartbeat: version 3.0.4
Nov 15 04:34:06 slave heartbeat: [11383]: WARN: No Previous generation - starting at 1447590847
Nov 15 04:34:06 slave heartbeat: [11383]: info: Heartbeat generation: 1447590847
Nov 15 04:34:06 slave heartbeat: [11383]: info: No uuid found for current node - generating a new uuid.
Nov 15 04:34:06 slave heartbeat: [11383]: info: Creating FIFO /var/lib/heartbeat/fifo.
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: bound send socket to device: eth0
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: set SO_REUSEPORT(w)
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: bound receive socket to device: eth0
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: set SO_REUSEPORT(w)
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ucast: started on port 694 interface eth0 to 192.168.0.107
Nov 15 04:34:06 slave heartbeat: [11383]: info: glib: ping heartbeat started.
Nov 15 04:34:06 slave heartbeat: [11383]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 15 04:34:06 slave heartbeat: [11383]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 15 04:34:06 slave heartbeat: [11383]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Nov 15 04:34:06 slave heartbeat: [11383]: info: Local status now set to: 'up'
Nov 15 04:34:06 slave heartbeat: [11383]: info: Link master:eth0 up.
Nov 15 04:34:06 slave heartbeat: [11383]: info: Link 192.168.0.1:192.168.0.1 up.
Nov 15 04:34:06 slave heartbeat: [11383]: info: Status update for node 192.168.0.1: status ping
Nov 15 04:34:06 slave heartbeat: [11383]: info: Status update for node master: status active
harc(default)[11393]:     2015/11/15_04:34:06 info: Running /etc/ha.d//rc.d/status status
Nov 15 04:34:07 slave heartbeat: [11383]: info: Comm_now_up(): updating status to active
Nov 15 04:34:07 slave heartbeat: [11383]: info: Local status now set to: 'active'
Nov 15 04:34:07 slave heartbeat: [11383]: info: Starting child client "/usr/lib/heartbeat/ipfail" (494,491)
Nov 15 04:34:07 slave heartbeat: [11411]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 494  gid 491 (pid 11411)
Nov 15 04:34:07 slave heartbeat: [11383]: info: remote resource transition completed.
Nov 15 04:34:07 slave heartbeat: [11383]: info: remote resource transition completed.
Nov 15 04:34:07 slave heartbeat: [11383]: info: Local Resource acquisition completed. (none)
Nov 15 04:34:08 slave heartbeat: [11383]: info: master wants to go standby [foreign]
Nov 15 04:34:08 slave heartbeat: [11383]: info: standby: acquire [foreign] resources from master
Nov 15 04:34:08 slave heartbeat: [11414]: info: acquire local HA resources (standby).
Nov 15 04:34:08 slave heartbeat: [11414]: info: local HA resource acquisition completed (standby).
Nov 15 04:34:08 slave heartbeat: [11383]: info: Standby resource acquisition done [foreign].
Nov 15 04:34:08 slave heartbeat: [11383]: info: Initial resource acquisition complete (auto_failback)
Nov 15 04:34:09 slave heartbeat: [11383]: info: remote resource transition completed.
Nov 15 04:34:18 slave ipfail: [11411]: info: Ping node count is balanced.
Nov 15 04:34:18 slave ipfail: [11411]: info: Giving up foreign resources (auto_failback).
Nov 15 04:34:18 slave ipfail: [11411]: info: Delayed giveup in 4 seconds.
Nov 15 04:34:22 slave ipfail: [11411]: info: giveup() called (timeout worked)
Nov 15 04:34:23 slave heartbeat: [11383]: info: slave wants to go standby [foreign]
Nov 15 04:34:24 slave heartbeat: [11383]: info: standby: master can take our foreign resources
Nov 15 04:34:24 slave heartbeat: [11428]: info: give up foreign HA resources (standby).
ResourceManager(default)[11441]:     2015/11/15_04:34:24 info: Releasing resource group: master 192.168.0.110/24/eth0:0 nginx
ResourceManager(default)[11441]:     2015/11/15_04:34:24 info: Running /etc/init.d/nginx  stop
ResourceManager(default)[11441]:     2015/11/15_04:34:24 info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.110/24/eth0:0 stop
IPaddr(IPaddr_192.168.0.110)[11530]:     2015/11/15_04:34:24 INFO: IP status = no, IP_CIP=
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[11504]:     2015/11/15_04:34:24 INFO:  Success
Nov 15 04:34:24 slave heartbeat: [11428]: info: foreign HA resource release completed (standby).
Nov 15 04:34:24 slave heartbeat: [11383]: info: Local standby process completed [foreign].
Nov 15 04:34:25 slave heartbeat: [11383]: WARN: 1 lost packet(s) for [master] [151:153]
Nov 15 04:34:25 slave heartbeat: [11383]: info: remote resource transition completed.
Nov 15 04:34:25 slave heartbeat: [11383]: info: No pkts missing from master!
Nov 15 04:34:25 slave heartbeat: [11383]: info: Other node completed standby takeover of foreign resources.
Nov 15 05:03:50 slave ipfail: [11411]: info: Telling other node that we have more visible ping nodes.
Nov 15 05:03:55 slave heartbeat: [11383]: info: master wants to go standby [all]
Nov 15 05:03:56 slave heartbeat: [11383]: info: standby: acquire [all] resources from master
Nov 15 05:03:56 slave heartbeat: [11628]: info: acquire all HA resources (standby).
ResourceManager(default)[11641]:     2015/11/15_05:03:56 info: Acquiring resource group: master 192.168.0.110/24/eth0:0 nginx
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[11669]:     2015/11/15_05:03:56 INFO:  Resource is stopped
ResourceManager(default)[11641]:     2015/11/15_05:03:56 info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.110/24/eth0:0 start
IPaddr(IPaddr_192.168.0.110)[11802]:     2015/11/15_05:03:56 INFO: Adding inet address 192.168.0.110/24 with broadcast address 192.168.0.255 to device eth0 (with label eth0:0)
IPaddr(IPaddr_192.168.0.110)[11802]:     2015/11/15_05:03:56 INFO: Bringing device eth0 up
IPaddr(IPaddr_192.168.0.110)[11802]:     2015/11/15_05:03:57 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.0.110 eth0 192.168.0.110 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.110)[11776]:     2015/11/15_05:03:57 INFO:  Success
ResourceManager(default)[11641]:     2015/11/15_05:03:57 info: Running /etc/init.d/nginx  start
Nov 15 05:03:57 slave heartbeat: [11628]: info: all HA resource acquisition completed (standby).
Nov 15 05:03:57 slave heartbeat: [11383]: info: Standby resource acquisition done [all].
Nov 15 05:03:58 slave heartbeat: [11383]: info: remote resource transition completed.

Reload the page in the browser: it now shows the page served by nginx on the slave.

The floating IP has also been brought up on the slave.

[root@slave html]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:23:54:84  
          inet addr:192.168.0.108  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe23:5484/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:412763 errors:0 dropped:0 overruns:0 frame:0
          TX packets:23054 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:58656442 (55.9 MiB)  TX bytes:2146344 (2.0 MiB)
          Interrupt:19 Base address:0x2000 

eth0:0    Link encap:Ethernet  HWaddr 00:50:56:23:54:84  
          inet addr:192.168.0.110  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1320 (1.2 KiB)  TX bytes:1320 (1.2 KiB)
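
Scanning ifconfig output by eye for the VIP gets tedious across repeated tests. The sketch below checks for it mechanically; it assumes the "inet addr:" output format shown in this post (newer net-tools versions print addresses differently), and has_vip is a hypothetical helper name.

```shell
has_vip() {
  # Reads `ifconfig` output on stdin; succeeds if the given address is bound.
  # The trailing space keeps 192.168.0.11 from matching 192.168.0.110.
  grep -qF "inet addr:$1 "
}

# Usage on either node:
#   ifconfig | has_vip 192.168.0.110 && echo "this node holds the VIP"
```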

Test 2
Re-enable ping on the master (it was blocked in test 1):

[root@master html]# iptables -D INPUT -p icmp -j DROP

After a short while the slave releases the floating IP:

[root@slave html]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:23:54:84  
          inet addr:192.168.0.108  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe23:5484/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:415798 errors:0 dropped:0 overruns:0 frame:0
          TX packets:23326 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:58868335 (56.1 MiB)  TX bytes:2191930 (2.0 MiB)
          Interrupt:19 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1320 (1.2 KiB)  TX bytes:1320 (1.2 KiB)

[root@slave html]# ps aux|grep nginx
root     12120  0.0  0.0   4420   756 pts/1    S+   05:15   0:00 grep nginx

nginx has been stopped on the slave as well.

Meanwhile the master has started nginx again and taken back the floating IP:

[root@master html]# ps aux|grep nginx
root     13023  0.0  0.1  15300  1448 ?        Ss   05:14   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    13025  0.0  0.1  15456  1900 ?        S    05:14   0:00 nginx: worker process                   
root     13029  0.0  0.0   4420   752 pts/1    S+   05:15   0:00 grep nginx

[root@master html]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.107  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe3e:eccf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:439124 errors:1 dropped:0 overruns:0 frame:0
          TX packets:28841 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:70228235 (66.9 MiB)  TX bytes:2727948 (2.6 MiB)
          Interrupt:19 Base address:0x2000 

eth0:0    Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.110  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

eth0:1    Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.109  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1320 (1.2 KiB)  TX bytes:1320 (1.2 KiB)

Reload the page in the browser: it is served by nginx on the master again.

Test 3
Stop the heartbeat service on the master:
service heartbeat stop

Master:

[root@master html]# service heartbeat stop
Stopping High-Availability services: Done.

[root@master html]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.107  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe3e:eccf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:442173 errors:1 dropped:0 overruns:0 frame:0
          TX packets:29100 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:70445466 (67.1 MiB)  TX bytes:2768155 (2.6 MiB)
          Interrupt:19 Base address:0x2000 

eth0:1    Link encap:Ethernet  HWaddr 00:50:56:3E:EC:CF  
          inet addr:192.168.0.109  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:19 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1320 (1.2 KiB)  TX bytes:1320 (1.2 KiB)

The slave reacts quickly and starts nginx:

[root@slave html]# ps aux|grep nginx
root     12482  0.0  0.1  15300  1460 ?        Ss   05:19   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    12484  0.0  0.1  15456  1912 ?        S    05:19   0:00 nginx: worker process                   
root     12501  0.0  0.0   4420   756 pts/1    S+   05:19   0:00 grep nginx

Reload the page in the browser once more: it is served by the slave again.