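III. Implementing HA (high availability) with the corosync component of RHCS.
1. Running pacemaker as a corosync plugin
Environment:
ms.dtedu.com: management host for the HA cluster (ansible)
node5.dtedu.com: HA node 1
node6.dtedu.com: HA node 2
Resources: vip + web + filesystem
Prerequisites:
1. Time synchronization
2. DNS resolution
3. SSH mutual trust (key-based login)
4. iptables stopped
5. SELinux disabled
Note: a node that is running the NetworkManager service cannot be added to the cluster.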
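The prerequisites above can be checked with a small script before any packages are installed. This is only a sketch: the helper names check and preflight are made up for illustration, ntpstat assumes ntpd is the time source, and the service/getenforce calls assume an EL6-style node.

```shell
#!/bin/sh
# check DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND quietly and prints a single "ok"/"FAIL" line for it.
check() {
    desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $desc"
    else
        echo "FAIL: $desc"
    fi
}

# preflight PEER: run on one HA node; PEER is the other node's hostname.
# Hypothetical helper; every command below is an assumption about the node.
preflight() {
    peer=$1
    check "clock synchronized (ntpstat)" ntpstat
    check "name $peer resolves" getent hosts "$peer"
    check "key-based ssh to $peer" ssh -o BatchMode=yes -o ConnectTimeout=5 "$peer" true
    # "service X status" exits 0 while X is running, so negate it.
    check "iptables stopped" sh -c '! service iptables status'
    check "SELinux not enforcing" sh -c '[ "$(getenforce)" != Enforcing ]'
    check "NetworkManager stopped" sh -c '! service NetworkManager status'
}
```

On node5 this would be called as preflight node6.dtedu.com, and the other way round on node6. Any FAIL line points at the prerequisite to fix first.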
1.1 Install ansible, then install corosync and pacemaker. Do not install heartbeat alongside pacemaker. If the installation fails, check which repositories are enabled.
[root@ms.dtedu.com~]$ansible all -a "yum install -y corosync pacemaker"
node5.dtedu.com | SUCCESS | rc=0 >>
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
* base: mirror.bit.edu.cn
* epel: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.tuna.tsinghua.edu.cn
* updates: mirrors.tuna.tsinghua.edu.cn
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
pacemaker x86_64 1.1.15-5.el6 base 443 k
Installing for dependencies:
cifs-utils x86_64 4.8.1-20.el6 base 65 k
clusterlib x86_64 3.0.12.1-84.el6 base 109 k
cman x86_64 3.0.12.1-84.el6 base 454 k
cyrus-sasl-md5 x86_64 2.1.23-15.el6_6.2 base 47 k
fence-agents x86_64 4.0.15-13.el6 base 193 k
fence-virt x86_64 0.2.3-24.el6 base 39 k
gnutls-utils x86_64 2.12.23-21.el6 base 109 k
ipmitool x86_64 1.8.15-2.el6 base 465 k
libtasn1-devel x86_64 2.3-6.el6_5 base 61 k
libvirt-client x86_64 0.10.2-62.el6 base 4.1 M
modcluster x86_64 0.16.2-35.el6 base 210 k
nc x86_64 1.84-24.el6 base 57 k
net-snmp-utils x86_64 1:5.5-60.el6 base 177 k
numactl x86_64 2.0.9-2.el6 base 74 k
oddjob x86_64 0.30-6.el6 base 60 k
openais x86_64 1.1.1-7.el6 base 192 k
openaislib x86_64 1.1.1-7.el6 base 82 k
pacemaker-cli x86_64 1.1.15-5.el6 base 291 k
pacemaker-cluster-libs x86_64 1.1.15-5.el6 base 85 k
pacemaker-libs x86_64 1.1.15-5.el6 base 483 k
perl-Net-Telnet noarch 3.03-11.el6 base 56 k
pexpect noarch 2.3-6.el6 base 147 k
pyOpenSSL x86_64 0.13.1-2.el6 base 263 k
python-suds noarch 0.4.1-3.el6 base 218 k
quota x86_64 1:3.17-23.el6 base 202 k
resource-agents x86_64 3.9.5-46.el6 base 389 k
ricci x86_64 0.16.2-87.el6 base 633 k
sg3_utils x86_64 1.28-12.el6 base 498 k
tcp_wrappers x86_64 7.6-58.el6 base 70 k
yajl x86_64 1.0.7-3.el6 base 27 k
Updating for dependencies:
gnutls x86_64 2.12.23-21.el6 base 389 k
gnutls-devel x86_64 2.12.23-21.el6 base 1.2 M
net-snmp-devel x86_64 1:5.5-60.el6 base 307 k
net-snmp-libs x86_64 1:5.5-60.el6 base 1.5 M
nspr x86_64 4.13.1-1.el6 base 114 k
nss x86_64 3.27.1-13.el6 base 873 k
nss-sysinit x86_64 3.27.1-13.el6 base 50 k
nss-tools x86_64 3.27.1-13.el6 base 443 k
nss-util x86_64 3.27.1-3.el6 base 68 k
Package Arch Version Repository Size
================================================================================
Installing:
corosync x86_64 1.4.7-5.el6 base 216 k
Installing for dependencies:
corosynclib x86_64 1.4.7-5.el6 base 194 k
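After the install it is worth confirming that both packages really landed on every node. A minimal sketch: missing_pkgs is a hypothetical helper that filters rpm -q output, and it assumes the English (LC_ALL=C) rpm message, since the nodes here print localized output.

```shell
#!/bin/sh
# missing_pkgs: read "rpm -q" output on stdin and print the names of
# packages that rpm reports as absent (English-locale message assumed).
missing_pkgs() {
    sed -n 's/^package \(.*\) is not installed$/\1/p'
}

# Possible use from the management host (not executed here):
#   ansible all -m shell -a "LC_ALL=C rpm -q corosync pacemaker" | missing_pkgs
```

Empty output means nothing is missing. Any name printed needs another look at the repository configuration.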
1.2 Install the crmsh and pssh packages. crmsh depends on pssh.
[root@ms.dtedu.com~]$ansible all -a "chdir=/etc/yum.repos.d wget http://download.opensuse.org/repositories/network:ha-clustering:Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo"
node6.dtedu.com | SUCCESS | rc=0 >>
--2017-04-10 06:31:50-- http://download.opensuse.org/repositories/network:ha-clustering:Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo
Resolving download.opensuse.org... 195.135.221.134, 2001:67c:2178:8::13
Connecting to download.opensuse.org|195.135.221.134|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://download.opensuse.org/repositories/network:ha-clustering:/Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo [following]
--2017-04-10 06:31:51-- http://download.opensuse.org/repositories/network:ha-clustering:/Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo
Reusing existing connection to download.opensuse.org:80.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo [following]
--2017-04-10 06:31:51-- http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo
Reusing existing connection to download.opensuse.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 345 [text/plain]
Saving to: “network:ha-clustering:Stable.repo”
0K 100% 28.4M=0s
[root@ms.dtedu.com~]$ansible all -a "yum -y install crmsh"
1.3 Configuration file walkthrough (copy /etc/corosync/corosync.conf.example to /etc/corosync/corosync.conf and edit it)
sync]# cat corosync.conf |grep -v ^# |grep -v ^$
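compatibility: whitetank // whether to stay compatible with whitetank (the 0.8 release); when enabled, newer features are unavailable
totem { // heartbeat (cluster messaging) settings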
version: 2
# secauth: Enable mutual node authentication. If you choose to
# enable this ("on"), then do remember to create a shared
# secret with "corosync-keygen".
secauth: off // whether to enable mutual node authentication
threads: 0 // number of threads to start
# interface: define at least one interface to communicate
# over. If you define more than one interface stanza, you must
# also set rrp_mode.
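interface { // interface used to carry heartbeat messages
# Rings must be consecutively numbered, starting at 0.
ringnumber: 0 // ring number of this interface (not a loop count); rings are numbered consecutively from 0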
# This is normally the *network* address of the
# interface to bind to. This ensures that you can use
# identical instances of this configuration file
# across all your cluster nodes, without having to
# modify this option.
bindnetaddr: 192.168.1.0 // network address to bind to: the address of the subnet the heartbeat NIC sits on, not the NIC's own IP
# However, if you have multiple physical network
# interfaces configured for the same subnet, then the
# network address alone is not sufficient to identify
# the interface Corosync should bind to. In that case,
# configure the *host* address of the interface
# instead:
# bindnetaddr: 192.168.1.1
# When selecting a multicast address, consider RFC
# 2365 (which, among other things, specifies that
# 239.255.x.x addresses are left to the discretion of
# the network administrator). Do not reuse multicast
# addresses across multiple Corosync clusters sharing
# the same network.
mcastaddr: 224.5.5.5 // multicast address
# Corosync uses the port you specify here for UDP
# messaging, and also the immediately preceding
# port. Thus if you set this to 5405, Corosync sends
# messages over UDP ports 5405 and 5404.
mcastport: 5405 // multicast port
# Time-to-live for cluster communication packets. The
# number of hops (routers) that this ring will allow
# itself to pass. Note that multicast routing must be
# specifically enabled on most network routers.
ttl: 1
}
}
logging { // logging settings
# Log the source file and line where messages are being
# generated. When in doubt, leave off. Potentially useful for
# debugging.
fileline: off
# Log to standard error. When in doubt, set to no. Useful when
# running in the foreground (when invoking "corosync -f")
to_stderr: no
# Log to a log file. When set to "no", the "logfile" option
# must not be set.
to_logfile: yes
logfile: /var/log/cluster/corosync.log
# Log to the system log daemon. When in doubt, set to yes.
to_syslog: yes // whether to also write log messages to /var/log/messages; no is recommended
# Log debug messages (very verbose). When in doubt, leave off.
debug: off
# Log messages with time stamps. When in doubt, set to on
# (unless you are only logging to syslog, where double
# timestamps can be annoying).
timestamp: on // whether to timestamp log messages; can be turned off
logger_subsys {
subsys: AMF
debug: off
}
}
service { // run pacemaker as a corosync plugin
ver: 0
name: pacemaker
}
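The bindnetaddr value in the listing is the network address, that is, the node's IP ANDed with its netmask octet by octet. A small sketch of that arithmetic follows. netaddr is a hypothetical helper, and the 255.255.255.0 netmask is an assumption, since the notes do not state it.

```shell
#!/bin/sh
# netaddr IP NETMASK: print the network address (IP AND NETMASK per octet).
netaddr() {
    ip=$1
    mask=$2
    oldifs=$IFS
    IFS=.
    # Split both dotted quads on "." into $1..$4 (IP) and $5..$8 (mask).
    set -- $ip $mask
    IFS=$oldifs
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

# Example: node5's heartbeat interface with an assumed /24 netmask.
echo "bindnetaddr: $(netaddr 192.168.1.23 255.255.255.0)"
```

Because every node on the same subnet gets the same result, the same corosync.conf can be copied to all nodes unchanged, which is what the comment in the example file recommends.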
1.4 Generate the shared authentication key for corosync communication, then copy authkey and corosync.conf to the other node.
[root@node5.dtedu.com /etc/corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
[root@node5.dtedu.com /etc/corosync]# scp authkey corosync.conf node6:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 2663 2.6KB/s 00:00
[root@node5.dtedu.com /etc/corosync]#
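A quick sanity check of the copied key can catch a truncated or world-readable authkey. The sketch below assumes the key is 128 bytes (1024 bits, matching the scp output above) and that mode 400 is wanted. authkey_ok is a hypothetical helper, and stat -c is the GNU coreutils form.

```shell
#!/bin/sh
# authkey_ok FILE: succeed only if FILE exists, is 128 bytes long
# and has permission mode 400.
authkey_ok() {
    [ -f "$1" ] || return 1
    size=$(($(wc -c < "$1")))   # arithmetic expansion strips any padding
    mode=$(stat -c %a "$1")
    [ "$size" -eq 128 ] && [ "$mode" = 400 ]
}

# Possible use on each node (not executed here):
#   authkey_ok /etc/corosync/authkey && echo "authkey looks fine"
# Comparing the key across nodes from the management host:
#   ansible all -a "md5sum /etc/corosync/authkey"
```

Both nodes should report the same checksum. If they differ, repeat the scp step.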
1.5 Stop the NetworkManager service on the nodes
[root@ms.dtedu.com~]$ansible all -a "chkconfig NetworkManager off"
node5.dtedu.com | SUCCESS | rc=0 >>
node6.dtedu.com | SUCCESS | rc=0 >>
[root@ms.dtedu.com~]$ansible all -a "service NetworkManager stop"
node5.dtedu.com | SUCCESS | rc=0 >>
Stopping NetworkManager daemon: [FAILED]
node6.dtedu.com | SUCCESS | rc=0 >>
Stopping NetworkManager daemon: [ OK ]
1.6 Start the corosync service
[root@ms.dtedu.com~]$ansible all -a "service corosync start"
node6.dtedu.com | SUCCESS | rc=0 >>
Starting Corosync Cluster Engine (corosync): [ OK ]
node5.dtedu.com | SUCCESS | rc=0 >>
Starting Corosync Cluster Engine (corosync): [ OK ]
1.7 Check that the services started correctly.
Check that the corosync engine started normally
[root@node5.dtedu.com /etc/corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Apr 10 10:14:23 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Apr 10 10:14:23 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications were sent normally
[root@node5.dtedu.com /etc/corosync]# grep TOTEM /var/log/cluster/corosync.log
Apr 10 10:14:23 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Apr 10 10:14:23 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Apr 10 10:14:24 corosync [TOTEM ] The network interface [192.168.1.23] is now up.
Apr 10 10:14:24 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 10 10:14:24 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup, ignoring the unpack_resources (resource) errors
[root@node5.dtedu.com /etc/corosync]# grep ERROR: /var/log/cluster/corosync.log |grep -v unpack_resources
Check that pacemaker started normally
[root@node5.dtedu.com /etc/yum.repos.d]# grep pcmk_startup /var/log/cluster/corosync.log
Apr 10 13:17:19 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Apr 10 13:17:19 corosync [pcmk ] Logging: Initialized pcmk_startup
Apr 10 13:17:19 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Apr 10 13:17:19 corosync [pcmk ] info: pcmk_startup: Service: 9
Apr 10 13:17:19 corosync [pcmk ] info: pcmk_startup: Local hostname: node5.dtedu.com
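The four log checks above can be bundled into one function so they are easy to repeat on every node. This is a sketch: check_corosync_log is a hypothetical helper, and the patterns are taken from the log lines shown in these notes, so other corosync versions may word them differently.

```shell
#!/bin/sh
# check_corosync_log FILE: run the four greps from this step against FILE
# and print one "name: ok" or "name: FAIL" line per check.
check_corosync_log() {
    log=$1
    grep -q "Corosync Cluster Engine.*started and ready" "$log" \
        && echo "engine: ok" || echo "engine: FAIL"
    grep -q "TOTEM.*new membership was formed" "$log" \
        && echo "membership: ok" || echo "membership: FAIL"
    # Any ERROR line other than the unpack_resources ones counts as a failure.
    if grep "ERROR:" "$log" | grep -qv unpack_resources; then
        echo "errors: FAIL"
    else
        echo "errors: ok"
    fi
    grep -q "pcmk_startup: CRM: Initialized" "$log" \
        && echo "pacemaker: ok" || echo "pacemaker: FAIL"
}

# Possible use on a node (not executed here):
#   check_corosync_log /var/log/cluster/corosync.log
```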
Check the status of the HA nodes
[root@node6.dtedu.com /etc/yum.repos.d]# service corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@node6.dtedu.com /etc/yum.repos.d]# crm status
Stack: classic openais (with plugin)
Current DC: node5.dtedu.com (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Mon Apr 10 13:47:46 2017
Last change: Mon Apr 10 13:47:38 2017 by hacluster via crmd on node5.dtedu.com
, 2 expected votes
2 nodes and 0 resources configured
Online: [ node5.dtedu.com node6.dtedu.com ]
No resources
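For scripting, the Online line of crm status can be reduced to a node count, for example to wait until both nodes have joined. A sketch only: online_nodes is a hypothetical helper and assumes the output format shown above.

```shell
#!/bin/sh
# online_nodes: read "crm status" output on stdin and print how many
# node names appear inside the "Online: [ ... ]" brackets.
online_nodes() {
    echo $(( $(sed -n 's/^Online: \[\(.*\)\].*/\1/p' | wc -w) ))
}

# Possible use on a node (not executed here):
#   [ "$(crm status | online_nodes)" -eq 2 ] && echo "both nodes online"
```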
Check the cluster configuration for errors (crm_verify validates the pacemaker configuration, not corosync.conf)
[root@node5.dtedu.com /etc/yum.repos.d]# crm_verify -LV
error: unpack_resources:Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources:Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources:NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
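The errors above say that STONITH is enabled but no STONITH resource exists. In a lab without fencing hardware, one common way past this is to turn the stonith-enabled property off with crmsh and verify again. This is a sketch, not a recommendation for clusters with shared data, where the message itself says STONITH is needed. The run wrapper, DRY_RUN variable and function name are made up for illustration.

```shell
#!/bin/sh
# run CMD [ARGS...]: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Lab-only assumption: no fencing device is available, so disable STONITH
# and then re-check the live cluster configuration.
disable_stonith_for_lab() {
    run crm configure property stonith-enabled=false
    run crm_verify -LV
}
```

Setting DRY_RUN=1 before calling disable_stonith_for_lab prints the two commands without touching the cluster. Without it, a clean second crm_verify -LV produces no output.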