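Approach: use crm_mon, one of Pacemaker's native CLI management tools, to monitor the cluster status.
Outline: wrap crm_mon as an LSB-compliant service, add it to the cluster resources with crm configure, then set the start order and colocation between the Apache resource and the crm_mon resource.
Steps:
1. Build an LSB-compliant crm_mon service so that Pacemaker can manage it. Copy the following script to /etc/init.d/ on each cluster node and make it executable.
Note: adjust the crm_mon options to suit your environment. For example, you can also use
crm_mon --daemonize --snmp-traps snmptrapd.example.com
to send SNMP traps, which can then be collected and processed by an SNMP receiver such as Nagios.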
#!/bin/bash
#
# crm_mon      Startup script for the Pacemaker management tool crm_mon
#
# chkconfig: - 86 16
# description: LSB-style init script for Pacemaker's crm_mon
# processname: crm_mon
# pidfile: /var/run/crm_mon.pid
#
### BEGIN INIT INFO
### END INIT INFO
# wangxiaoyu#live.com
# 2011-03-20 22:53:59
# Version 0.1

# Source function library.
. /etc/rc.d/init.d/functions

crm_mon=/usr/sbin/crm_mon
prog=crm_mon
pidfile=${PIDFILE-/var/run/crm_mon.pid}
lockfile=${LOCKFILE-/var/lock/subsys/crm_mon}
RETVAL=0
# Adjust the crm_mon start-up options to suit your environment.
OPTIONS='--daemonize --as-html /var/www/html/crm_mon.html'

start() {
    echo -n $"Starting $prog: "
    $crm_mon $OPTIONS
    RETVAL=$?
    if [ $RETVAL -ne 0 ]; then
        echo_failure
        echo
        return $RETVAL
    fi
    # The daemon function from the functions library did not work here,
    # so the pidfile and lockfile are written by hand.
    pidNum=$(pidof ${crm_mon})
    # Only record a PID if crm_mon is actually running.
    [ -n "$pidNum" ] && echo -n "${pidNum}" > ${pidfile} \
        && touch ${lockfile}
    RETVAL=$?
    [ $RETVAL = 0 ] && echo_success || echo_failure
    echo
    return $RETVAL
}

# When stopping crm_mon, a delay of more than 10 seconds is required before
# SIGKILLing the crm_mon parent; this gives the parent enough time to SIGKILL
# any errant children.
stop() {
    echo -n $"Stopping $prog: "
    killproc -p ${pidfile} -d 10 $crm_mon
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && rm -f ${lockfile} ${pidfile}
}

reload() {
    echo -n $"Reloading $prog: "
    # Force LSB behaviour from killproc
    LSB=1 killproc -p ${pidfile} $crm_mon -HUP
    RETVAL=$?
    if [ $RETVAL -eq 7 ]; then
        failure "$crm_mon shutdown"
    fi
    echo
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status -p ${pidfile} $crm_mon
        RETVAL=$?
        ;;
    restart)
        stop
        start
        ;;
    condrestart|try-restart)
        if status -p ${pidfile} $crm_mon >&/dev/null; then
            stop
            start
        fi
        ;;
    force-reload|reload)
        reload
        ;;
    *)
        echo $"Usage: $prog {start|stop|restart|condrestart|try-restart|force-reload|reload|status|help}"
        RETVAL=2
        ;;
esac
exit $RETVAL
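Pacemaker decides whether an lsb: resource is running purely from the init script's exit codes, so it is worth checking them by hand before adding the resource to the cluster. The following is a minimal sketch of such a check, not part of the original script: expect_rc and check_lsb are hypothetical helper names, and the expected values (0 for start, stop, and status while running; 3 for status while stopped) are the ones the LSB specification defines.

```shell
#!/bin/bash
# Sketch only: expect_rc and check_lsb are illustrative helpers.

# expect_rc <wanted-exit-code> <script> <action>
# Runs "<script> <action>" and reports a mismatch in the exit code.
expect_rc() {
    local want=$1 script=$2 action=$3 rc
    "$script" "$action" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -ne "$want" ]; then
        echo "$action returned $rc, expected $want"
        return 1
    fi
}

# check_lsb <script>
# Walks through the start/status/stop sequence Pacemaker depends on.
check_lsb() {
    local script=$1
    expect_rc 0 "$script" start  || return 1   # start a stopped service
    expect_rc 0 "$script" status || return 1   # status while running
    expect_rc 0 "$script" start  || return 1   # start while already running
    expect_rc 0 "$script" stop   || return 1   # stop a running service
    expect_rc 3 "$script" status || return 1   # status while stopped
    expect_rc 0 "$script" stop   || return 1   # stop while already stopped
    echo "LSB exit codes look OK"
}

# Usage, as root on a cluster node after installing the script:
#   chmod 755 /etc/init.d/crm_mon
#   check_lsb /etc/init.d/crm_mon
```

Because the cluster will start and stop crm_mon itself, the service should not also be enabled at boot; the "chkconfig: - 86 16" header already leaves it off in every runlevel by default.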
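2. Add the crm_mon resource to the cluster, and set its colocation and start order relative to the HTTPD resource (the resource that manages the Apache service).
Run the following on any one cluster node: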
crm configure primitive CRM_MON lsb:crm_mon
crm configure colocation CRM_MON_with_HTTPD inf: CRM_MON HTTPD
crm configure order HTTPD_before_CRM_MON inf: HTTPD CRM_MON
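After running the three crm configure commands, it can be useful to confirm that the cluster accepted them. The sketch below is one possible check under stated assumptions: check_crm_mon_config is a hypothetical helper name, and it relies only on crm_verify -L -V (validate the live configuration) and crm configure show (print the configuration), looking for the three object IDs used above.

```shell
#!/bin/bash
# Sketch only: check_crm_mon_config is an illustrative helper.
check_crm_mon_config() {
    local cfg id
    # Validate the live cluster configuration; -V prints warnings and errors.
    crm_verify -L -V || return 1
    # Dump the configuration once and look for each object by its ID.
    cfg=$(crm configure show) || return 1
    for id in CRM_MON CRM_MON_with_HTTPD HTTPD_before_CRM_MON; do
        # -w matches whole words only, so CRM_MON does not match
        # inside CRM_MON_with_HTTPD.
        if ! printf '%s\n' "$cfg" | grep -q -w "$id"; then
            echo "missing from configuration: $id"
            return 1
        fi
    done
    echo "configuration OK"
}

# Usage, on any cluster node:
#   check_crm_mon_config
```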
3. The cluster status can now be viewed in real time at http://VIP/crm_mon.html
# elinks http://VIP/crm_mon.html # the output looks like this
Cluster summary
Last updated: Sun Mar 20 22:32:38 2011
Current DC: pcmk-2 (pcmk-2)
3 Nodes configured.
3 Resources configured.
Config Options
STONITH of failed nodes : disabled
Cluster is              : symmetric
No Quorum Policy        : Ignore
Node List
- Node: pcmk-2 (pcmk-2): online
- Node: pcmk-1 (pcmk-1): standby
- Node: pcmk-3 (pcmk-3): online
Resource List
VIP (ocf::heartbeat:IPaddr2): Started pcmk-2
HTTPD (ocf::heartbeat:apache): Started pcmk-2
CRM_MON (lsb:crm_mon): Started pcmk-2
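Since the page is plain HTML, its text rendering can also be checked from a script, for example from cron or a monitoring plugin. The sketch below assumes the layout shown in the sample output: a "Resource List" heading followed by one line per resource in the form "NAME (agent): STATE node". list_not_started is a hypothetical helper name, and the exact wording of non-started states is an assumption.

```shell
#!/bin/bash
# Sketch only: list_not_started is an illustrative helper.
# Reads the text rendering of crm_mon.html on stdin and prints the name of
# every resource whose line does not report the state "Started".
list_not_started() {
    awk '/Resource List/ { in_res = 1; next }
         in_res && NF && $0 !~ /\): Started / { print $1 }'
}

# Usage:
#   elinks -dump http://VIP/crm_mon.html | list_not_started
```

An empty result means every listed resource was reported as Started at the time the page was last written.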