OpenStack Load Balancing and High Availability Configuration
1. RabbitMQ Cluster Configuration
1.1. RabbitMQ Overview
RabbitMQ is written in Erlang, which makes clustering very easy, because Erlang is a distributed language by design.
RabbitMQ cluster nodes come in two flavors: RAM nodes and disc nodes. As the names suggest, a RAM node keeps all of its data in memory, while a disc node keeps it on disk. Note, however, that if messages are published as persistent, the data is safely written to disk even on a RAM node.
A sound architecture could look like this: a cluster of three machines, one running in disc mode and the other two in RAM mode. The two RAM nodes are naturally faster, so the clients (consumers and producers) connect to them, while the disc node, whose disk IO is comparatively slow, serves only as a data backup.
1.2. RabbitMQ Cluster Configuration
Setting up a RabbitMQ cluster is very simple and takes only a few commands. With two test machines whose hostnames are controller1 and controller2, the steps are as follows:
In /etc/hosts on both machines, add entries for controller1 and controller2, for example:
192.168.1.126 controller1
192.168.1.127 controller2
The /etc/hostname file on each machine must also be correct: controller1 and controller2 respectively.
Note that RabbitMQ cluster nodes must be on the same network segment; clustering across a WAN performs poorly.
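Before continuing, verify that each node can resolve and reach the other by hostname (a quick sanity check using the hostnames above):
On controller1:
# ping -c 2 controller2
On controller2:
# ping -c 2 controller1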
Install and start RabbitMQ on both machines.
A newer version of RabbitMQ Server is required for clustering to work properly with OpenStack services, so download rabbitmq-server 2.8.7:
# wget -O /tmp/rabbitmq-server_2.8.7-1_all.deb http://www.rabbitmq.com/releases/rabbitmq-server/v2.8.7/rabbitmq-server_2.8.7-1_all.deb --no-check-certificate
Install the dependency:
# apt-get install -y erlang-nox
Install RabbitMQ:
# dpkg -i /tmp/rabbitmq-server_2.8.7-1_all.deb
Configure the RabbitMQ cluster:
First, stop the rabbitmq service on all nodes:
# service rabbitmq-server stop
Make sure all nodes share the same Erlang cookie by copying it from controller1 to controller2 (keep the file ownership and permissions consistent):
# scp /var/lib/rabbitmq/.erlang.cookie controller2:/var/lib/rabbitmq/.erlang.cookie
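After copying, make sure the cookie on controller2 is owned by the rabbitmq user and not readable by others (a small sketch; the Ubuntu package creates the rabbitmq user):
# chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
# chmod 400 /var/lib/rabbitmq/.erlang.cookie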
Then start the service on all nodes:
# service rabbitmq-server start
On the controller2 node, run:
# rabbitmqctl stop_app
# rabbitmqctl reset
# rabbitmqctl cluster rabbit@controller1
# rabbitmqctl start_app
The commands above first stop the rabbitmq application and reset the cluster state, then call the cluster command to join controller2 to controller1 so that the two form a cluster, and finally restart the rabbitmq application. With this cluster command, controller2 becomes a RAM node, while controller1 is a disc node (a RabbitMQ node is a disc node by default after startup).
To make controller2 a disc node in the cluster as well, change the third command above to:
# rabbitmqctl cluster rabbit@controller1 rabbit@controller2
A node becomes a disc node as long as it is included in the node list. A RabbitMQ cluster must contain at least one disc node.
On controller2 and controller1, run the cluster_status command to check the cluster state:
root@controller1:~# rabbitmqctl cluster_status
Cluster status of node rabbit@controller1 ...
[{nodes,[{disc,[rabbit@controller1]},{ram,[rabbit@controller2]}]},
{running_nodes,[rabbit@controller2,rabbit@controller1]}]
root@controller2:~# rabbitmqctl cluster_status
Cluster status of node rabbit@controller2 ...
[{nodes,[{disc,[rabbit@controller1]},{ram,[rabbit@controller2]}]},
{running_nodes,[rabbit@controller1,rabbit@controller2]}]
We can see that controller1 is a disc node and controller2 is a RAM node, and both appear under running_nodes. (With the disc-node variant of the cluster command, controller2 would be listed under disc as well.)
Queues declared on one cluster node are visible from the other; list_queues on both nodes reports the same queues:
root@controller2:~# rabbitmqctl list_queues -p /pyhtest
root@controller1:~# rabbitmqctl list_queues -p /pyhtest
With that, the RabbitMQ cluster is working.
For a three-node cluster, say host1, host2 and host3, it is enough to join host2 to host1 and host3 to host1; host2 and host3 will then see each other automatically (a sketch for host3 follows below). The configuration steps are the same as for two nodes.
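A minimal sketch of the commands to run on host3, assuming the same 2.8.x rabbitmqctl cluster syntax used above:
# rabbitmqctl stop_app
# rabbitmqctl reset
# rabbitmqctl cluster rabbit@host1
# rabbitmqctl start_app
As before, include rabbit@host3 in the node list if host3 should be a disc node.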
Create the OpenStack RabbitMQ user:
Run the following on one of the nodes:
# rabbitmqctl delete_user guest
# rabbitmqctl add_user openstack_rabbit_user openstack_rabbit_password
# rabbitmqctl set_permissions -p / openstack_rabbit_user ".*" ".*" ".*"
Confirm the user:
# rabbitmqctl list_user_permissions openstack_rabbit_user
2. MySQL Replication Configuration
2.1. Modify the MySQL Configuration File
First dump the databases already created on each MySQL server, then drop them;
Both MySQL servers need binary logging enabled, and the two server-ids must differ: one server uses server-id=1 and the other server-id=2. Edit /etc/mysql/my.cnf and add the following to the [mysqld] section.
Server 192.168.1.126:
log-bin=mysql-bin
server-id=1
binlog-ignore-db=mysql
# additional settings required for master-master replication
log-slave-updates
sync_binlog=1
auto_increment_offset=1
auto_increment_increment=2
# one database per replicate-ignore-db line (comma-separated lists are not supported)
replicate-ignore-db = mysql
replicate-ignore-db = information_schema
Server 192.168.1.127:
log-bin=mysql-bin
server-id=2
replicate-ignore-db = mysql
replicate-ignore-db = information_schema
# additional settings required for master-master replication
binlog-ignore-db=mysql
log-slave-updates
sync_binlog=1
auto_increment_offset=2
auto_increment_increment=2
Restart the service:
# service mysql restart
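After the restart, you can confirm on each server that binary logging is on and the server-id matches the configuration (standard MySQL status queries):
mysql> show variables like 'server_id';
mysql> show variables like 'log_bin';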
2.2. Make 192.168.1.126 the Master of 192.168.1.127
Create a replication user on 192.168.1.126:
mysql> grant replication slave on *.* to 'replication'@'%' identified by 'replication';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
mysql> flush tables with read lock;
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000002
Position: 4049
Binlog_Do_DB:
Binlog_Ignore_DB: mysql
1 row in set (0.00 sec)
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
On 192.168.1.127, configure 192.168.1.126 as its master:
mysql> change master to master_host='192.168.1.126',master_user='replication',master_password='replication',master_log_file='mysql-bin.000002',master_log_pos=4049;
Query OK, 0 rows affected (0.23 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
If both of the above are Yes, replication is healthy;
In the same way, create the replication user on 192.168.1.127 and, on 192.168.1.126, configure 192.168.1.127 as its master, as sketched below.
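The commands mirror those in section 2.2; the binlog file name and position must be taken from the show master status output on 192.168.1.127 (the values below are placeholders):
On 192.168.1.127:
mysql> grant replication slave on *.* to 'replication'@'%' identified by 'replication';
mysql> flush privileges;
mysql> show master status\G
On 192.168.1.126:
mysql> change master to master_host='192.168.1.127',master_user='replication',master_password='replication',master_log_file='mysql-bin.000001',master_log_pos=107;
mysql> start slave;
mysql> show slave status\G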
MySQL replication check script (the script verifies that the replication SQL and IO threads are both running and that the process list stays below 20 entries):
#!/bin/bash
#
# /usr/local/bin/mysqlchk_status.sh
# This script checks if a mysql server is healthy running on localhost. It will
# return:
# "HTTP/1.x 200 OK\r" (if mysql is running smoothly)
# -- OR --
# "HTTP/1.x 503 Internal Server Error\r" (else)
#
MYSQL_HOST="localhost"
MYSQL_PORT="3306"
MYSQL_USERNAME="root"
MYSQL_PASSWORD="123123"
#
# We perform two simple queries that should return a few results
mysql -h$MYSQL_HOST -P$MYSQL_PORT -u$MYSQL_USERNAME -p$MYSQL_PASSWORD -e "show full processlist;" >/tmp/processlist.txt
mysql -h$MYSQL_HOST -P$MYSQL_PORT -u$MYSQL_USERNAME -p$MYSQL_PASSWORD -e "show slave status\G" >/tmp/rep.txt
iostat=`grep "Slave_IO_Running" /tmp/rep.txt |awk '{print $2}'`
sqlstat=`grep "Slave_SQL_Running" /tmp/rep.txt |awk '{print $2}'`
result=$(cat /tmp/processlist.txt|wc -l)
#echo iostat:$iostat and sqlstat:$sqlstat
# if Slave_IO_Running and Slave_SQL_Running are both Yes and the process list is short, return 200
if [ "$result" -lt "20" ] && [ "$iostat" = "Yes" ] && [ "$sqlstat" = "Yes" ];
then
# mysql is fine, return http 200
/bin/echo -e "HTTP/1.1 200 OK\r\n"
else
# mysql is down, return http 503
/bin/echo -e "HTTP/1.1 503 Service Unavailable\r\n"
fi
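The script returns HTTP status lines so that a load balancer can poll it with an HTTP check. One common way to expose it (a sketch, assuming xinetd is installed and TCP port 9200 is free) is an xinetd service on both MySQL nodes. Contents of /etc/xinetd.d/mysqlchk:
service mysqlchk
{
disable = no
flags = REUSE
socket_type = stream
port = 9200
wait = no
user = nobody
server = /usr/local/bin/mysqlchk_status.sh
log_on_failure += USERID
only_from = 0.0.0.0/0
per_source = UNLIMITED
}
Add "mysqlchk 9200/tcp" to /etc/services and restart xinetd; HAProxy can then check MySQL health against port 9200, as sketched for the mysql listener in section 3.2.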
3. OpenStack Load Balancing Configuration
Here we use HAProxy + Keepalived to provide HA and load balancing for the services; other stacks such as HAProxy + Pacemaker + Corosync would work as well.
HA and load balancing are applied mainly to the OpenStack REST API services: nova-api, keystone, glance-api, glance-registry, quantum-server, cinder-api and memcached, plus the rabbitmq and mysql services. If you need other services as well, just follow the same pattern.
Again two machines are used, A: 192.168.1.126 and B: 192.168.1.127, with the virtual floating IP 192.168.1.130; all of these IPs sit on eth0.
3.1. Install the Required Packages
# sudo apt-get install haproxy keepalived
3.2. Configure HAProxy
The HAProxy configuration file is /etc/haproxy/haproxy.cfg.
On 192.168.1.126:
# mkdir /var/lib/haproxy
global
chroot /var/lib/haproxy
daemon
nbproc 8
group haproxy
log 192.168.1.126 local0
maxconn 32768
pidfile /var/run/haproxy.pid
stats socket /var/lib/haproxy/stats
user haproxy
defaults
log global
maxconn 32768
mode http
option redispatch
retries 3
stats enable
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
listen keystone-1 192.168.1.130:5000
balance source
option tcplog
server controller-1 192.168.1.126:5000 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:5000 check inter 2000 rise 2 fall 5
listen keystone-2 192.168.1.130:35357
balance source
option tcplog
server controller-1 192.168.1.126:35357 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:35357 check inter 2000 rise 2 fall 5
listen nova-api-1 192.168.1.130:8773
balance source
option tcplog
server controller-1 192.168.1.126:8773 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:8773 check inter 2000 rise 2 fall 5
listen nova-api-2 192.168.1.130:8774
balance source
option tcplog
server controller-1 192.168.1.126:8774 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:8774 check inter 2000 rise 2 fall 5
listen nova-api-3 192.168.1.130:8775
balance source
option tcplog
server controller-1 192.168.1.126:8775 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:8775 check inter 2000 rise 2 fall 5
listen cinder-api 192.168.1.130:8776
balance source
option tcplog
server controller-1 192.168.1.126:8776 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:8776 check inter 2000 rise 2 fall 5
listen glance-api 192.168.1.130:9292
balance source
option tcplog
server controller-1 192.168.1.126:9292 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:9292 check inter 2000 rise 2 fall 5
listen glance-registry 192.168.1.130:9191
balance source
option tcplog
server controller-1 192.168.1.126:9191 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:9191 check inter 2000 rise 2 fall 5
listen quantum-server 192.168.1.130:9696
balance source
option tcplog
server controller-1 192.168.1.126:9696 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:9696 check inter 2000 rise 2 fall 5
listen swift-proxy 192.168.1.130:8081
balance source
option tcplog
server controller-1 192.168.1.126:8081 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:8081 check inter 2000 rise 2 fall 5
listen memcache 192.168.1.130:11211
balance source
option tcplog
server controller-1 192.168.1.126:11211 check inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:11211 check inter 2000 rise 2 fall 5
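The listeners above cover the REST APIs and memcached. If MySQL is also to be reached through the VIP, a TCP listener in the same style can be added, using the health-check script from section 2 (a sketch, assuming the script is exposed on port 9200 as described there; the backup keyword keeps writes on a single master at a time):
listen mysql 192.168.1.130:3306
mode tcp
balance source
option tcpka
option httpchk
server controller-1 192.168.1.126:3306 check port 9200 inter 2000 rise 2 fall 5
server controller-2 192.168.1.127:3306 check port 9200 inter 2000 rise 2 fall 5 backup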
Copy /etc/haproxy/haproxy.cfg to the same directory on 192.168.1.127 and change:
log 192.168.1.127 local0
Edit /etc/default/haproxy:
ENABLE=1
Restart the service:
# service haproxy restart
3.3. Configure Keepalived
Create the /etc/keepalived/keepalived.conf configuration file on both nodes:
global_defs {
router_id controller-1
}
vrrp_script haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance 50 {
virtual_router_id 50
# for electing MASTER, highest priority wins.
priority 101
state MASTER
interface eth0
virtual_ipaddress {
192.168.1.130
}
track_script {
haproxy
}
}
Copy /etc/keepalived/keepalived.conf to the same directory on 192.168.1.127 and change:
router_id controller-2
priority 100
Restart the service:
# service keepalived restart
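To confirm which node currently holds the virtual IP (keepalived adds it as a secondary address on eth0):
# ip addr show eth0 | grep 192.168.1.130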
3.4. Configure the OpenStack Services
The changes below are made on the 192.168.1.126 node; perform the same changes on 192.168.1.127 (using its own listen addresses).
3.4.1. Nova
Edit /etc/nova/nova.conf:
glance_api_servers=192.168.1.130:9292
quantum_url=http://192.168.1.130:9696
quantum_admin_auth_url=http://192.168.1.130:35357/v2.0
quantum_connection_host=192.168.1.130
metadata_listen=192.168.1.126
ec2_listen=192.168.1.126
osapi_compute_listen=192.168.1.126
sql_connection=mysql://nova:123123@192.168.1.130/nova
rabbit_userid=openstack_rabbit_user
rabbit_password=openstack_rabbit_password
rabbit_ha_queues=True
rabbit_hosts=192.168.1.126:5672,192.168.1.127:5672
rabbit_virtual_host=/
memcached_servers=192.168.1.126:11211,192.168.1.127:11211
enabled_apis=ec2,osapi_compute,metadata
# if cinder is used, this port must be set to something other than 8776, because cinder listens on 8776 by default
osapi_volume_listen_port=5800
# if cinder is used, the following option is optional
osapi_volume_listen=192.168.1.126
Edit /etc/nova/api-paste.ini:
auth_host = 192.168.1.130
3.4.2. Memcache
Edit /etc/memcached.conf:
-l 192.168.1.126    # on controller2 use 192.168.1.127 instead
Due to bug 1158958, Nova API must be patched to support memcached instead of the in-process cache. First, see if Nova needs to be patched by grep'ing the file that needs to be patched. You will receive no output if the file needs to be patched; you will receive host = str(instance.get('host')) if the file does not need patching:
# grep "str(instance.get('host'))" /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/extended_availability_zone.py
If the extended_availability_zone.py file needs patching, download the patched file:
# wget https://raw.github.com/dflorea/nova/grizzly/nova/api/openstack/compute/contrib/extended_availability_zone.py
Copy the patched extended_availability_zone.py to the /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/ directory:
# cp extended_availability_zone.py /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/extended_availability_zone.py
Make sure the file is owned by root:root.
# ls -l /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/extended_availability_zone.py
If extended_availability_zone.py is not owned by root, then change the file ownership:
# chown root:root /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/extended_availability_zone.py
Due to a bug similar to bug 1158958, Nova API must be patched to support memcached instead of the in-process cache. Edit /usr/lib/python2.7/dist-packages/nova/api/ec2/ec2utils.py by adding the following line after key = "%s:%s" % (func.__name__, reqid):
key = str(key)
3.4.3. Keystone
Edit /etc/keystone/keystone.conf:
bind_host = 192.168.1.126
connection = mysql://keystone:123123@192.168.1.130/keystone
Edit /etc/keystone/default_catalog.templates:
catalog.RegionOne.identity.publicURL = http://192.168.1.130:$(public_port)s/v2.0
catalog.RegionOne.identity.adminURL = http://192.168.1.130:$(admin_port)s/v2.0
catalog.RegionOne.identity.internalURL = http://192.168.1.130:$(public_port)s/v2.0
catalog.RegionOne.identity.name = 'Identity Service'
catalog.RegionOne.compute.publicURL = http://192.168.1.130:8774/v2/$(tenant_id)s
catalog.RegionOne.compute.adminURL = http://192.168.1.130:8774/v2/$(tenant_id)s
catalog.RegionOne.compute.internalURL = http://192.168.1.130:8774/v2/$(tenant_id)s
catalog.RegionOne.compute.name = 'Compute Service'
catalog.RegionOne.volume.publicURL = http://192.168.1.130:8776/v1/$(tenant_id)s
catalog.RegionOne.volume.adminURL = http://192.168.1.130:8776/v1/$(tenant_id)s
catalog.RegionOne.volume.internalURL = http://192.168.1.130:8776/v1/$(tenant_id)s
catalog.RegionOne.volume.name = 'Volume Service'
catalog.RegionOne.ec2.publicURL = http://192.168.1.130:8773/services/Cloud
catalog.RegionOne.ec2.adminURL = http://192.168.1.130:8773/services/Admin
catalog.RegionOne.ec2.internalURL = http://192.168.1.130:8773/services/Cloud
catalog.RegionOne.ec2.name = 'EC2 Service'
catalog.RegionOne.image.publicURL = http://192.168.1.130:9292/v1
catalog.RegionOne.image.adminURL = http://192.168.1.130:9292/v1
catalog.RegionOne.image.internalURL = http://192.168.1.130:9292/v1
catalog.RegionOne.image.name = 'Image Service'
catalog.RegionOne.network.publicURL = http://192.168.1.130:9696/
catalog.RegionOne.network.adminURL = http://192.168.1.130:9696/
catalog.RegionOne.network.internalURL = http://192.168.1.130:9696/
catalog.RegionOne.network.name = 'Quantum Service'
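For this template file to take effect, keystone must be using the templated catalog backend instead of the SQL catalog. On Grizzly this is typically set in the [catalog] section of /etc/keystone/keystone.conf (a sketch; skip it if your installation already uses the templated driver):
[catalog]
driver = keystone.catalog.backends.templated.TemplatedCatalog
template_file = /etc/keystone/default_catalog.templates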
3.4.4. Glance
If images are stored on local disk, the image stores of the two Glance nodes must be kept identical (see the sync sketch at the end of this subsection);
Edit /etc/glance/glance-api.conf:
bind_host = 192.168.1.126
sql_connection = mysql://glance:123123@192.168.1.130/glance
registry_host = 192.168.1.130
auth_host = 192.168.1.130
Edit /etc/glance/glance-registry.conf:
bind_host = 192.168.1.126
sql_connection = mysql://glance:123123@192.168.1.130/glance
auth_host = 192.168.1.130
Edit /etc/glance/glance-cache.conf:
registry_host = 192.168.1.130
Edit /etc/glance/glance-scrubber.conf:
registry_host = 192.168.1.130
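One simple way to keep the local image stores in sync is a periodic rsync from the node where images are uploaded (a rough sketch, assuming the default /var/lib/glance/images/ path; a shared backend such as NFS or Swift is a cleaner solution). Crontab entry on controller1, pushing images to controller2 every 5 minutes:
*/5 * * * * rsync -az /var/lib/glance/images/ controller2:/var/lib/glance/images/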
3.4.5. Horizon
Edit /etc/openstack-dashboard/local_settings.py:
OPENSTACK_HOST = "192.168.1.130"
3.4.6. Quantum
Edit /etc/quantum/quantum.conf:
bind_host = 192.168.1.126
rabbit_userid=openstack_rabbit_user
rabbit_password=openstack_rabbit_password
rabbit_ha_queues=True
rabbit_hosts=192.168.1.126:5672,192.168.1.127:5672
#rabbit_host = 192.168.1.130
Edit /etc/quantum/api-paste.ini:
auth_host = 192.168.1.130
Edit /etc/quantum/plugins/openvswitch/ovs_quantum_plugin.ini:
sql_connection = mysql://quantum:123123@192.168.1.130/quantum
3.4.7. Cinder
Edit /etc/cinder/cinder.conf:
rabbit_ha_queues=True
rabbit_hosts=192.168.1.126:5672,192.168.1.127:5672
sql_connection=mysql://cinder:123123@192.168.1.130/cinder
bind_host=192.168.1.126
osapi_volume_listen=192.168.1.126
Edit /etc/cinder/api-paste.ini:
service_host = 192.168.1.130
auth_host = 192.168.1.130
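After editing the configuration files, restart the affected services on both nodes so they pick up the new listen addresses and endpoints (Ubuntu service names; adjust to your installation):
# service keystone restart
# service glance-api restart
# service glance-registry restart
# service nova-api restart
# service quantum-server restart
# service cinder-api restart
# service memcached restart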
3.4.8. Testing
Edit the openrc file:
export OS_AUTH_URL="http://192.168.1.130:5000/v2.0/"
export SERVICE_ENDPOINT="http://192.168.1.130:35357/v2.0/"
If all services are running normally and OpenStack keeps working after you take down any single service instance, the HA setup is successful. A minimal failover test is sketched below.
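A minimal failover test, assuming controller1 currently holds the VIP:
On controller1, simulate an HAProxy failure:
# service haproxy stop
On controller2, the VIP should move over within a few seconds:
# ip addr show eth0 | grep 192.168.1.130
With the openrc above sourced, the APIs should still answer through the VIP, for example:
# nova list
Finally restart HAProxy on controller1:
# service haproxy start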