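High availability for nginx with keepalived
1. Environment
IP | Service | Role |
---|---|---|
192.168.1.101 | nginx + keepalived | master |
192.168.1.102 | nginx + keepalived | backup |
192.168.1.103 | - | virtual IP (VIP) |
- Notes: the operating system is CentOS 6.10. There is one master, and any number of backups can be configured. The virtual IP (VIP), 192.168.1.103, is the address that serves external clients; it is also called a floating IP.
[Figure: relationship between the components.] Installing tomcat is outside the scope of this post.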
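2. Installing and configuring nginx
2.1 Install nginx
Install nginx on every node, master and backup alike. First configure the official nginx repository:
vim /etc/yum.repos.d/nginx.repo
Add the following content: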
[nginx]
name=nginx repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=0
enabled=1
Install:
yum install nginx -y
2.2 Master node configuration
2.2.1 Remove unneeded configuration (optional)
vim /etc/nginx/conf.d/default.conf
Change the file to the following:
server {
    listen 80;
    server_name localhost;
    access_log /var/log/nginx/host.access.log main;
    location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
    }
    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
2.2.2 Change the default nginx page
vim /usr/share/nginx/html/index.html
Change only line 14, as shown:
1 <!DOCTYPE html>
2 <html>
3 <head>
4 <title>Welcome to nginx!</title>
5 <style>
6 body {
7 width: 35em;
8 margin: 0 auto;
9 font-family: Tahoma, Verdana, Arial, sans-serif;
10 }
11 </style>
12 </head>
13 <body>
14 Welcome to nginx! test keepalived master!
15 <p>If you see this page, the nginx web server is successfully installed and
16 working. Further configuration is required.</p>
17
18 <p>For online documentation and support please refer to
19 <a rel="nofollow" href="http://nginx.org/">nginx.org</a>.<br/>
20 Commercial support is available at
21 <a rel="nofollow" href="http://nginx.com/">nginx.com</a>.</p>
22
23 <p><em>Thank you for using nginx.</em></p>
24 </body>
25 </html>
2.3 Backup node configuration
Change only line 14 of /usr/share/nginx/html/index.html to the following; everything else is the same as on the master.
Welcome to nginx! test keepalived backup!
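3. The keepalived service
3.1 What is keepalived?
Keepalived can configure and manage LVS and health-check the nodes behind it. It can also provide high availability for network services in general, which prevents single points of failure.
3.2 How keepalived works
keepalived is built on VRRP, the Virtual Router Redundancy Protocol. VRRP is a protocol for making routers highly available: N routers that provide the same function form a group with one master and one or more backups. The master holds a VIP that serves external traffic (the other machines on the router's LAN use this VIP as their default route) and sends multicast VRRP advertisements to tell the backups it is still alive. When the backups stop receiving advertisements, they consider the master down and elect a new master among themselves according to VRRP priority. This keeps the router available and the service uninterrupted; in the best case, takeover takes less than one second.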
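The election rule described above ("the highest priority wins") can be sketched in a few lines of shell. This is only an illustration of the rule, not how keepalived implements VRRP: the function name `elect_master` and the `name:priority` argument format are made up for this example, and tie-breaking between equal priorities is ignored.

```shell
#!/bin/bash
# Hypothetical helper: print the name of the node with the highest priority.
# Arguments use an assumed "name:priority" format, e.g. master:100.
elect_master() {
    local best_name="" best_prio=-1 node name prio
    for node in "$@"; do
        name=${node%%:*}   # text before the first colon
        prio=${node##*:}   # text after the last colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

# Both nodes are advertising: the node with priority 100 wins.
elect_master master:100 backup:90
# The master has gone silent: only the backup takes part in the election.
elect_master backup:90
```

The second call models the failover case: once the master's advertisements stop, the remaining backup with the highest priority becomes the new master.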
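3.3 keepalived has three main modules: core, check, and vrrp
The core module is the heart of keepalived. It starts and maintains the main process and loads and parses the global configuration file.
The check module performs health checks and supports the common check methods.
The vrrp module implements the VRRP protocol.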
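3.4 High availability with keepalived versus zookeeper
- Keepalived:
  - Pros: simple. High availability and master/backup failover work with almost no changes at the application level, and downtime during failover is short.
  - Cons: also its simplicity. VRRP and master/backup switching involve no sophisticated logic, so some special cases are not handled; for example, a broken link between master and backup leads to split-brain. keepalived is also poorly suited to load balancing.
- Zookeeper:
  - Pros: supports both high availability and load balancing, and is itself a distributed service.
  - Cons: tightly coupled to the application. The application code has to implement the ZK logic, such as registering a name and looking up the service address registered under a name.
4. keepalived configuration
4.1 Install keepalived
Install keepalived on every node, master and backup alike.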
[root@node1 ~]# yum install keepalived -y
[root@node1 ~]# rpm -ql keepalived
/etc/keepalived
/etc/keepalived/keepalived.conf # main keepalived configuration file
/etc/rc.d/init.d/keepalived # service start script (releases before CentOS 7 use init.d scripts; CentOS 7 and later use systemd)
/etc/sysconfig/keepalived
/usr/bin/genhash
/usr/libexec/keepalived
/usr/sbin/keepalived
/usr/share/doc/keepalived-1.2.13
... ...
/usr/share/man/man1/genhash.1.gz
/usr/share/man/man5/keepalived.conf.5.gz
/usr/share/man/man8/keepalived.8.gz
/usr/share/snmp/mibs/KEEPALIVED-MIB.txt
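4.2 Default configuration explained
[root@node1 keepalived]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived # global definitions
global_defs {
notification_email { # addresses that receive notification mail when an event (such as a failover) occurs
acassen@firewall.loc # alert recipients, one per line; requires the local sendmail service to be running
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc # sender address used for notification mail
smtp_server 192.168.200.1 # SMTP server used to send mail
smtp_connect_timeout 30 # timeout, in seconds, for connecting to the SMTP server
router_id LVS_DEVEL # identifier of the machine running keepalived, usually the hostname; shown in the subject of notification mail
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
# virtual IP configuration (VRRP)
vrrp_instance VI_1 { # among nodes with the same virtual_router_id, the one with the highest priority becomes master and takes over the VIP; if it fails, the node with the next-highest priority takes over
state MASTER
# Role of this node: MASTER for the primary server, BACKUP for a standby server.
# Note that state only sets the initial state of the instance. Once keepalived is running, the actual role is decided by the priority-based election, not by this setting.
# If this node is configured as MASTER but has a lower priority than another node, that node sees the lower priority in this node's advertisements and preempts it to become MASTER.
interface eth1 # network interface the VIP is bound to; the same interface that carries this machine's own IP address (eth1 here)
virtual_router_id 51 # virtual router ID, a number that identifies the VRRP instance; MASTER and BACKUP of the same vrrp_instance must use the same value
priority 100 # priority; the higher the number, the higher the priority. Within one vrrp_instance the MASTER must have a higher priority than the BACKUPs. Use values from 1 to 254 (VRRP reserves 0 and 255)
advert_int 1 # interval, in seconds, between the VRRP advertisements the MASTER sends to the BACKUPs
authentication { # authentication type and password; must be identical on master and backups
auth_type PASS # VRRP authentication type, PASS or AH
auth_pass 1111 # VRRP authentication password; MASTER and BACKUP of the same vrrp_instance must use the same password to communicate
}
## add the track_script block inside the instance block
track_script {
chk_nginx ## the nginx monitoring script to run
}
virtual_ipaddress { # VRRP HA virtual addresses; list additional VIPs one per line
192.168.200.16
192.168.200.17
192.168.200.18
}
}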
virtual_server 192.168.200.100 443 {
delay_loop 6
lb_algo rr
lb_kind NAT
nat_mask 255.255.255.0
persistence_timeout 50
protocol TCP
real_server 192.168.201.100 443 {
weight 1
SSL_GET {
url {
path /
digest ff20ad2481f97b1754ef3e12ecd3a9cc
}
url {
path /mrtg/
digest 9b3a0c85a887a256d6939da88aabd8cd
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
virtual_server 10.10.10.2 1358 {
delay_loop 6
lb_algo rr
lb_kind NAT
persistence_timeout 50
protocol TCP
sorry_server 192.168.200.200 1358
real_server 192.168.200.2 1358 {
weight 1
HTTP_GET {
url {
path /testurl/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl2/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl3/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
real_server 192.168.200.3 1358 {
weight 1
HTTP_GET {
url {
path /testurl/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334c
}
url {
path /testurl2/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334c
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
virtual_server 10.10.10.3 1358 {
delay_loop 3
lb_algo rr
lb_kind NAT
nat_mask 255.255.255.0
persistence_timeout 50
protocol TCP
real_server 192.168.200.4 1358 {
weight 1
HTTP_GET {
url {
path /testurl/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl2/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl3/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
real_server 192.168.200.5 1358 {
weight 1
HTTP_GET {
url {
path /testurl/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl2/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
url {
path /testurl3/test.jsp
digest 640205b7b0fc66c1ea91c463fac6334d
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
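4.3 Master load balancer configuration
[root@master ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_01
}
# keepalived runs the script periodically and adjusts the priority of the vrrp_instance according to the result.
# If the script exits with 0 and weight is greater than 0, the priority is raised accordingly.
# If the script exits with a non-zero status and weight is less than 0, the priority is lowered accordingly.
# Otherwise the priority stays at the value of priority in the configuration file.
vrrp_script chk_nginx {
    script "/etc/keepalived/nginx_check.sh" # path of the script that checks nginx
    interval 2 # run the script every 2 seconds
    weight -5 # priority change driven by the script result: on failure (non-zero exit) the priority drops by 5
    fall 2 # two consecutive failures are required before the check counts as failed and weight lowers the priority
    rise 1 # one successful check counts as success; the priority is not changed
}
vrrp_instance VI_1 {
    state MASTER
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # add the track_script block inside the instance block
    track_script { # the checks to run. Note: do not place this block directly after the vrrp_script block (a pitfall I hit while testing), or nginx monitoring stops working!
        chk_nginx # reference to the VRRP script by the name given in the vrrp_script section; it runs periodically to change the priority and eventually trigger a master/backup switch
    }
    virtual_ipaddress {
        192.168.1.103
    }
}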
... ...
[root@master ~]#
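The weight rules stated in the comments of the configuration above amount to simple arithmetic, sketched below. The function name `effective_priority` and its argument order are invented for this illustration; it covers a single tracked script and is not keepalived's actual code.

```shell
#!/bin/bash
# Hypothetical helper: priority after one tracked script result.
# $1 = configured priority, $2 = weight, $3 = exit status of the check script
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$status" -eq 0 ] && [ "$weight" -gt 0 ]; then
        echo $((base + weight))   # success with a positive weight raises the priority
    elif [ "$status" -ne 0 ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))   # failure with a negative weight lowers the priority
    else
        echo "$base"              # every other case keeps the configured priority
    fi
}

effective_priority 100 -5 0   # check passes: 100
effective_priority 100 -5 1   # check fails: 95
```

With the values used in this post, a failing check leaves the master at 95, which is still above the backup's configured priority of 90. The priority change alone would therefore not move the VIP; in this setup the handover relies on the check script stopping keepalived when nginx cannot be restarted. If failover by priority is wanted, the weight would need to exceed the gap between the two priorities.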
4.4 Backup load balancer configuration
[root@slave ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_02
}
vrrp_script chk_nginx {
    script "/etc/keepalived/nginx_check.sh"
    interval 3
    weight -20
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # add the track_script block inside the instance block
    track_script {
        chk_nginx # the nginx monitoring script to run
    }
    virtual_ipaddress {
        192.168.1.103
    }
}
... ...
[root@slave ~]#
5. Testing
5.1 Write the nginx check script
On every node, create the nginx status check script /etc/keepalived/nginx_check.sh (already referenced in keepalived.conf).
What the script must do: if nginx has stopped, try to start it; if it cannot be started, kill the local keepalived process so that the virtual IP moves to the BACKUP machine.
The script:
[root@master ~]# vim /etc/keepalived/nginx_check.sh
#!/bin/bash
set -x
# count the running nginx processes
nginx_status=$(ps -C nginx --no-headers | wc -l)
if [ "${nginx_status}" -eq 0 ]; then
    service nginx start
    sleep 1
    if [ "$(ps -C nginx --no-headers | wc -l)" -eq 0 ]; then # nginx could not be restarted
        echo "$(date): nginx is not healthy, try to killall keepalived!" >> /etc/keepalived/keepalived.log
        killall keepalived
    fi
fi
echo $?
[root@master ~]# chmod +x /etc/keepalived/nginx_check.sh
[root@master ~]# ll /etc/keepalived/nginx_check.sh
-rwxr-xr-x 1 root root 338 2019-02-15 14:11 /etc/keepalived/nginx_check.sh
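5.2 Start nginx and keepalived on all nodes
- Start nginx:
service nginx start
- Start keepalived. The related commands are:
chkconfig keepalived on # start keepalived at boot
service keepalived start # start the service
service keepalived stop # stop the service
service keepalived restart # restart the service
Once keepalived is running normally it has three processes: a parent process that monitors its children, a vrrp child process, and a checkers child process.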
[root@master ~]# ps -ef | grep keepalived
root 3653 1 0 14:18 ? 00:00:00 /usr/sbin/keepalived -D
root 3654 3653 0 14:18 ? 00:00:02 /usr/sbin/keepalived -D
root 3655 3653 0 14:18 ? 00:00:03 /usr/sbin/keepalived -D
root 7481 3655 0 15:19 ? 00:00:00 /usr/sbin/keepalived -D
root 7483 1323 0 15:19 pts/0 00:00:00 grep --color=auto keepalived
[root@master ~]#
5.3 IP information on the master load balancer (192.168.1.101)
[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether b0:51:8e:01:9b:b0 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:20:ae:75 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.101/24 brd 192.168.1.255 scope global eth1
inet 192.168.1.103/32 scope global eth1
inet6 fe80::20c:29ff:fe20:ae75/64 scope link
valid_lft forever preferred_lft forever
[root@master ~]#
5.4 IP information on the backup load balancer (192.168.1.102)
[root@slave ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether b0:51:8e:01:9b:b0 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:7d:6a:24 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.102/24 brd 192.168.1.255 scope global eth1
inet6 fe80::20c:29ff:fe7d:6a24/64 scope link
valid_lft forever preferred_lft forever
[root@slave ~]#
The output above shows that the virtual IP (VIP) is currently active on server 192.168.1.101.
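Reading the `ip a` output by eye works, but the same check can be scripted. The sketch below assumes the `inet <address>/<prefix>` line format shown in the listings above; the helper name `holds_vip` is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given IPv4 address appears as an
# "inet" entry in `ip addr` style output read from stdin.
holds_vip() {
    local vip=$1
    # -F matches the text literally, so the dots in the address are not wildcards
    grep -Fq "inet ${vip}/"
}

# Sample lines copied from the master's output above.
sample='    inet 192.168.1.101/24 brd 192.168.1.255 scope global eth1
    inet 192.168.1.103/32 scope global eth1'

if printf '%s\n' "$sample" | holds_vip 192.168.1.103; then
    echo "VIP is bound on this node"
else
    echo "VIP is not bound on this node"
fi
```

On a real node the input would come from the interface itself, for example `ip -4 addr show dev eth1 | holds_vip 192.168.1.103`.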
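5.5 Test
Access nginx through the VIP (192.168.1.103). The response comes from 192.168.1.101, so that machine is currently the active nginx proxy. Now stop keepalived on 192.168.1.101:
[root@master ~]# service keepalived stop
Stopping keepalived: [ OK ]
Access nginx through the VIP (192.168.1.103) again. The response now comes from 192.168.1.102, which has become the active nginx proxy. Start keepalived on 192.168.1.101 again:
[root@master ~]# service keepalived start
Starting keepalived: [ OK ]
Then access nginx through the VIP (192.168.1.103) once more and check which node answers.
Next, stop nginx to check that the nginx check script works:
[root@master ~]# service nginx status
nginx (pid 19617) is running...
[root@master ~]# service nginx stop
Stopping nginx: [ OK ]
[root@master ~]# service nginx status
nginx (pid 23595) is running...
[root@master ~]#
nginx is running again under a new PID, which shows that the check script restarted it.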
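The manual checks in this section can also be scripted, since the two nodes serve pages that differ only in the marker text set in section 2. The sketch below classifies a page read from stdin; the helper name `serving_node` is invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: report which node produced the page read from stdin,
# based on the marker text written into index.html on each node.
serving_node() {
    local page
    page=$(cat)
    case $page in
        *"test keepalived master"*) echo master ;;
        *"test keepalived backup"*) echo backup ;;
        *) echo unknown ;;
    esac
}

# Sample input: the modified line 14 from the master's index.html.
printf '%s\n' 'Welcome to nginx! test keepalived master!' | serving_node
```

Against the live setup the page would be fetched through the VIP, for example `curl -s http://192.168.1.103/ | serving_node`, and repeated after each stop or start of keepalived.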
At this point the Keepalived + Nginx high-availability web load-balancing setup is complete!
5.6 keepalived monitoring script
keepalived itself can also stop, so write a script that checks the keepalived service and add it to cron. Do this on every server.
[root@master ~]# vim /opt/scripts/keepalived_monitor.sh
#!/bin/bash
set -x
# count the running keepalived processes
keepalived_status=$(ps -C keepalived --no-headers | wc -l)
if [ "${keepalived_status}" -eq 0 ]; then
    echo -e "$(date): keepalived is not healthy!\n" >> /etc/keepalived/keepalived.log
    service keepalived start
    sleep 1
    if [ "$(ps -C keepalived --no-headers | wc -l)" -eq 0 ]; then # keepalived could not be restarted
        echo -e "$(date): try to restart keepalived failure!\n" >> /etc/keepalived/keepalived.log
    fi
fi
echo $?
[root@master ~]# chmod +x /opt/scripts/keepalived_monitor.sh
[root@master ~]# echo "* * * * * /opt/scripts/keepalived_monitor.sh > /dev/null 2>&1" >> /var/spool/cron/root
[root@master ~]# crontab -l
*/30 * * * * /usr/sbin/ntpdate ntp1.aliyun.com > /dev/null 2>&1;/sbin/hwclock -w
* * * * * /opt/scripts/keepalived_monitor.sh > /dev/null 2>&1
[root@master ~]#
5.7 keepalived logs
By default keepalived logs to the system log, /var/log/messages. If master/backup switching does not work, check the log to find out why:
tail -100f /var/log/messages
6. Error summary
6.1 What happens when every server on the same network segment uses the same virtual_router_id
$ tail -1000f /var/log/messages |grep VRRP
Jun 12 17:06:31 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: bogus VRRP packet received on eth0 !!!
Jun 12 17:06:31 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: VRRP_Instance(VIP_W_G2) ignoring received advertisment...
Jun 12 17:06:32 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: bogus VRRP packet received on eth0 !!!
Jun 12 17:06:32 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: VRRP_Instance(VIP_W_G2) ignoring received advertisment...
Jun 12 17:06:33 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: bogus VRRP packet received on eth0 !!!
Jun 12 17:06:33 gj-dev-192-168-145-112 Keepalived_vrrp[14755]: VRRP_Instance(VIP_W_G2) ignoring received advertisment...
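To see which virtual router IDs are already in use on a segment before choosing one, the VRRP advertisements can be captured and summarized. The sketch below is an assumption-laden illustration: the helper name `list_vrids` is made up, and the sample lines follow the format tcpdump typically prints for VRRP packets, which may differ between versions.

```shell
#!/bin/bash
# Hypothetical helper: list the distinct "vrid N" values found in
# tcpdump-style VRRP output read from stdin.
list_vrids() {
    grep -o 'vrid [0-9]*' | sort -u
}

# Assumed sample of tcpdump output (addresses and values are illustrative).
sample='IP 192.168.1.101 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
IP 192.168.1.201 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
IP 192.168.1.150 > 224.0.0.18: VRRPv2, Advertisement, vrid 60, prio 100, authtype simple, intvl 1s, length 20'

printf '%s\n' "$sample" | list_vrids
```

On a live network the input could come from something like `tcpdump -nn -i eth0 -c 20 vrrp 2>/dev/null | list_vrids`. Two unrelated sources advertising the same vrid, as in the first two sample lines, is the kind of overlap this section's heading describes.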
References
https://www.cnblogs.com/kevingrace/p/6138185.html, the official nginx documentation, the official keepalived documentation