线上采用DRBD+Heartbeat+MySQL的方式部署MySQL高可用架构,所以对DRBD的监控也很重要。
一 监控原理
1.使用drbd-overview
$ drbd-overview 0:??not-found?? Connected Primary/Secondary UpToDate/UpToDate C r----- /database ext4 50G 3.7G 44G 8%
如果不是root权限,将不会看到resource 名称。
$ sudo drbd-overview 0:r0 Connected Primary/Secondary UpToDate/UpToDate C r----- /database ext4 50G 3.7G 44G 8%
2.查看/proc/drbd
$ cat /proc/drbd version: 8.3.16 (api:88/proto:86-97) GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----- ns:1200291012 nr:7644 dw:1200298728 dr:1575405 al:19036 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
C 这个位置表示同步协议是协议C , 可以是B 或 A
I/O 状态标志,共有6个标志位,表示关于这个资源的I/O操作状态信息
r-----
1.I/O suspension 。要么是r表示正在运行,要么是s表示暂停
2.Serial resynchronization。通常情况下是-
3.Peer-initiated sync suspension. 通常情况下是-
4.Peer-initiated sync suspension. 通常情况下是-
5.Locally blocked I/O。通常情况下是-
6.Activity Log update suspension. 通常情况下是-
cs Connetction State 显示定义resource的连接状态
可以有以下几种连接状态:
StandAlone
Disconnecting
Unconnected
BrokenPipe
NetworkFailure
Connected 正常状态
等等
ds disk states 显示磁盘状态
先显示本地磁盘状态,然后再显示远程主机磁盘状态,它们都可能是以下几种状态:
Diskless
Attaching
Failed
Negotiating
Inconsistent
Outdated
DUnknown
Consistent
UpToDate 这个状态表示数据同步一致,是正常状态
ro 资源角色类型
Primary 可读可写
Secondary 不可读不可写
Unknown 这个状态只发生在远端主机
ns network send 发送的数据量,以KBytes表示
nr network received 接收的数据量,以KBytes表示
dw disk write 写入到本地磁盘的数据量,以KBytes表示
dr disk read 从本地读取的数据量,以KBytes表示
al activity log DRBD元数据中活动日志位置更新次数
bm bitmap DRBD元数据中bitmap位置更新次数
lo local count 本地I/O子系统有关DRBD的请求数量
pe pending 已经发送到对端但是还没有得到响应的请求数量
ua unacknowledged 对端通过网络接收到的请求数量,但是它们还没有被答复
ap application pending Number of block I/O requests forwarded to DRBD, but not yet answered by DRBD.
ep
(epochs). Number of epoch objects. Usually 1. Might increase under I/O load when using either the barrier
or the none
write ordering method.
wo
(write order). Currently used write ordering method: b
(barrier),f
(flush), d
(drain) or n
(none).
oos
(out of sync). Amount of storage currently out of sync; in Kibibytes.
3.使用service drbd status查看
$ sudo service drbd status drbd driver loaded OK; device status: version: 8.3.16 (api:88/proto:86-97) GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43 m:res cs ro ds p mounted fstype 0:r0 Connected Primary/Secondary UpToDate/UpToDate C /database ext4
二 监控脚本编写
一般情况下,在生产服务器上只需定义一个resource,便于维护。所以,这里讨论只有一个DRBD resource的监控方法,如果有多个resource可以通过Zabbix的自动发现功能。
drbd_status.sh
#!/bin/bash #gather drbd status via /proc/drbd #$ cat /proc/drbd #version: 8.3.16 (api:88/proto:86-97) #GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43 # 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----- # ns:1202588332 nr:7644 dw:1202596048 dr:1575405 al:19216 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 #assume only one resource defined,default is r0 status_file=/proc/drbd metric=$1 case $metric in version) cat $status_file|grep "version"|awk '{print $2}' ;; name) cat $status_file|grep "cs"|awk '{print $1}'|tr -d ":" ;; cs) cat $status_file|grep "cs"|awk '{print $2}'|awk -F":" '{print $2}' ;; ro) cat $status_file|grep -v "version"|grep "ro"|awk '{print $3}'|awk -F":" '{print $2}' ;; ds) cat $status_file|grep -v "version"|grep "ds"|awk '{print $4}'|awk -F":" '{print $2}' ;; protocol) cat $status_file|grep -v "version"|grep "cs"|awk '{print $5}' ;; ns) cat $status_file|grep "ns"|awk '{print $1}'|awk -F":" '{print $2}' ;; nr) cat $status_file|grep "nr"|awk '{print $2}'|awk -F":" '{print $2}' ;; dw) cat $status_file|grep "dw"|awk '{print $3}'|awk -F":" '{print $2}' ;; dr) cat $status_file|grep "dr"|awk '{print $4}'|awk -F":" '{print $2}' ;; al) cat $status_file|grep "al"|awk '{print $5}'|awk -F":" '{print $2}' ;; bm) cat $status_file|grep "bm"|awk '{print $6}'|awk -F":" '{print $2}' ;; lo) cat $status_file|grep "lo"|awk '{print $7}'|awk -F":" '{print $2}' ;; pe) cat $status_file|grep "pe"|awk '{print $8}'|awk -F":" '{print $2}' ;; ua) cat $status_file|grep "ua"|awk '{print $9}'|awk -F":" '{print $2}' ;; ap) cat $status_file|grep -v "version"|grep "ap"|awk '{print $10}'|awk -F":" '{print $2}' ;; ep) cat $status_file|grep "ep"|awk '{print $11}'|awk -F":" '{print $2}' ;; wo) cat $status_file|grep "wo"|awk '{print $12}'|awk -F":" '{print $2}' ;; oos) cat $status_file|grep "oos"|awk '{print $13}'|awk -F":" '{print $2}' ;; *) echo "unknown parameters" esac
添加zabbix子配置文件drbd_status.conf
UserParameter=drbd.status[*],/usr/local/zabbix/bin/drbd_status.sh $1
三 添加监控模板
这里注意一下触发表达式
{Template DRBD:drbd.status[ro].str(Secondary/Primary)}#1 & {Template DRBD:drbd.status[ro].str(Primary/Secondary)}#1
参考文章:
http://blog.pandorafms.org/?p=1944
http://drbd.linbit.com/docs/about/