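Foreword

I recently set up interconnection between our data-center intranets and urgently needed to monitor network quality. On a colleague's recommendation I tried smokeping, an open-source tool. Its biggest highlight, in my view, is distributed monitoring (master/slaves): data collected by several nodes can be shown on the same graph. Without further ado, let's get to the deployment.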

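Deployment

1. Deploying the master

The master serves the web interface and the CGI endpoint. There are a few pitfalls here, which are covered later.

a. First add the epel and rpmforge repositories so that packages can be installed with yum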

rpm -Uvh http://mirrors.neusoft.edu.cn/epel/6/i386/epel-release-6-8.noarch.rpm

rpm -Uvh http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.2-1.el6.rf.x86_64.rpm

b. Install the dependencies

yum -y install httpd fping echoping curl  rrdtool perl perl-Net-Telnet perl-Net-DNS perl-LDAP perl-libwww-perl perl-RadiusPerl perl-IO-Socket-SSL perl-Socket6 perl-CGI-SpeedyCGI perl-devel perl-FCGI.x86_64 perl-CGI.x86_64 rrdtool-perl.x86_64

c. One Perl module has no rpm available, so install it from its CPAN tarball

wget http://search.cpan.org/CPAN/authors/id/D/DS/DSCHWEI/Config-Grammar-1.10.tar.gz

tar -zxvf Config-Grammar-1.10.tar.gz

cd Config-Grammar-1.10 

perl Makefile.PL 

make && make install
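Before moving on, it may be worth confirming that Perl can actually load the module that was just built. The helper below is only a sketch; the function name perl_module_ok is made up for this article and is not part of smokeping or CPAN.

```shell
# Hypothetical helper: succeed only if Perl can load the named module.
# "perl -MModule -e 1" loads the module and then runs an empty program.
perl_module_ok() {
    perl -M"$1" -e 1 2>/dev/null
}
```

For example, `perl_module_ok Config::Grammar && echo "Config::Grammar is installed"` should print the message once step c has succeeded.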

d. Install smokeping

wget http://oss.oetiker.ch/smokeping/pub/smokeping-2.6.9.tar.gz

tar -zxvf smokeping-2.6.9.tar.gz

cd smokeping-2.6.9

./configure

make && make install

e. Initialize the environment

cp /opt/smokeping-2.6.9/etc/config.dist  /opt/smokeping-2.6.9/etc/config

mkdir -p /opt/smokeping-2.6.9/cache

mkdir -p /opt/smokeping-2.6.9/data

mkdir -p /opt/smokeping-2.6.9/var

chmod 400 /opt/smokeping-2.6.9/etc/smokeping_secrets.dist  # this secrets file is used for authentication between master and slaves; more on it later

chown apache:apache /opt/smokeping-2.6.9/etc/smokeping_secrets.dist

chown -R apache:apache /opt/smokeping-2.6.9/
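A quick way to double-check the layout created in this step is a small loop over the three runtime directories. This is a minimal sketch; check_layout is a hypothetical helper name, and the directory list simply mirrors the mkdir commands above.

```shell
# Hypothetical helper: report which runtime directories are missing
# under a SmokePing prefix. Returns 0 when cache, data and var all exist.
check_layout() {
    prefix="$1"
    missing=0
    for d in cache data var; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_layout /opt/smokeping-2.6.9 && echo "layout ok"` prints nothing but the final message when everything is in place.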

f. Configure apache

vim /etc/httpd/conf.d/smokeping.conf

Alias /smokeping/cache/ /opt/smokeping-2.6.9/cache/
Alias /smokeping/ /opt/smokeping-2.6.9/htdocs/
<Directory /opt/smokeping-2.6.9/htdocs/>
    Allow from all
    Options ExecCGI
    AddHandler cgi-script .fcgi
    <IfModule dir_module>
        DirectoryIndex smokeping.fcgi
    </IfModule>
</Directory>

g. The single-node deployment is done; give it a test run

/etc/init.d/httpd start

sudo -u apache /opt/smokeping-2.6.9/bin/smokeping
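If the daemon refuses to start, the configuration file is the usual suspect. smokeping has a --check option that only validates the config syntax without starting the daemon. The wrapper below is a sketch: the function name check_config is invented here, and the default path assumes the /opt/smokeping-2.6.9 prefix used throughout this article.

```shell
# Hypothetical wrapper around "smokeping --check" (syntax check only).
# Returns 127 when the binary is not found at the assumed location.
check_config() {
    bin="${1:-/opt/smokeping-2.6.9/bin/smokeping}"
    if [ ! -x "$bin" ]; then
        echo "smokeping not found: $bin" >&2
        return 127
    fi
    # Run as apache so file permissions match the real daemon.
    sudo -u apache "$bin" --check
}
```

Calling `check_config` with no argument checks the default installation; pass another path if smokeping lives elsewhere.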

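2. Deploying master/slaves

Now for the distributed monitoring setup.

Keep the following points in mind:

1) A smokeping slave needs neither httpd nor a configuration file; it fetches its configuration from the master.

2) Master and slaves authenticate each other through a shared-secrets file.

3) On the master, the httpd process and the smokeping process must run as the same user (apache in this article). Otherwise the data collected by the slaves never gets written to the rrd files, and the graphs show NaN forever.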
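Point 3) is the one that bit me, so here is a rough way to verify it from a process listing. This is only a sketch: owners_of and same_owner are hypothetical helper names, and the process names httpd and smokeping are assumptions; use whatever names ps shows on your host.

```shell
# Print the distinct owners of processes named "$1".
# Expects "USER COMMAND" lines on stdin, e.g. from: ps -eo user=,comm=
owners_of() {
    awk -v name="$1" '$2 == name { print $1 }' | sort -u
}

# Succeed only if both process names exist in the snapshot on stdin
# and are owned by exactly the same set of users.
same_owner() {
    snapshot=$(cat)
    a=$(printf '%s\n' "$snapshot" | owners_of "$1")
    b=$(printf '%s\n' "$snapshot" | owners_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

On the master, something like `ps -eo user=,comm= | same_owner httpd smokeping || echo "owner mismatch"` should stay silent when both run as apache.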

a. On the master

1) Add the following to the master's configuration file

/opt/smokeping-2.6.9/etc/config

*** Slaves ***

# this file is the shared-secrets file
secrets=/opt/smokeping-2.6.9/etc/smokeping_secrets.dist

+ <slave-hostname>
display_name=<slave-hostname>

location=China
color=0000ff

2) Create the secrets file on the master

vim /opt/smokeping-2.6.9/etc/smokeping_secrets.dist

<slave-hostname>:<secret>

3) Restart the smokeping service
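The hostname:secret line from step 2) can be typed by hand, but a random secret is easy to generate. The function below is a sketch under my own assumptions: make_secret_entry is a made-up name, and the 16 random bytes (32 hex characters) are an arbitrary choice, not something smokeping requires.

```shell
# Hypothetical helper: print one "hostname:secret" line with a random
# 32-character hexadecimal secret read from /dev/urandom.
make_secret_entry() {
    secret=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    printf '%s:%s\n' "$1" "$secret"
}
```

For example, `make_secret_entry dg.vpn1` prints a line that can be appended to smokeping_secrets.dist by a user allowed to write that file; the part after the colon is what goes into the slave's own secret file.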

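b. On the slaves

Install the packages and initialize the environment exactly as on the master; I won't repeat the steps here.

1) Create the slave's secret file

echo "the secret defined in the master's secrets file above" > /opt/smokeping-2.6.9/etc/slave_secret.txt

chmod 400 /opt/smokeping-2.6.9/etc/slave_secret.txt

2) Start the slave

Note:

The slave's hostname must match the name used in the master's secrets file and configuration file.

/opt/smokeping-2.6.9/bin/smokeping --master-url=http://<your-master-IP>/smokeping/smokeping.fcgi --shared-secret=/opt/smokeping-2.6.9/etc/slave_secret.txt --cache-dir=/opt/smokeping-2.6.9/cache/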
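Since the names have to line up in two places, a small consistency check on the master can save some head-scratching. The sketch below assumes the config layout shown in this article (slaves declared as "+ name" under *** Slaves ***, secrets stored one "name:secret" per line); slave_names and missing_secrets are hypothetical helper names.

```shell
# Print slave names declared as "+ name" inside the *** Slaves *** section.
# Lines starting with "++" (such as ++override) are skipped.
slave_names() {
    awk '
        /^\*\*\*/ { in_slaves = ($0 ~ /^\*\*\* *Slaves *\*\*\*/); next }
        in_slaves && /^\+[^+]/ {
            sub(/^\+[ \t]*/, ""); sub(/[ \t\r]+$/, ""); print
        }
    ' "$1"
}

# Report every declared slave that has no "name:secret" line.
# Arguments: config file, secrets file. Returns 1 if any entry is missing.
missing_secrets() {
    config="$1"; secrets="$2"; rc=0
    for name in $(slave_names "$config"); do
        if ! cut -d: -f1 "$secrets" | grep -qxF -- "$name"; then
            echo "no secret for slave: $name"
            rc=1
        fi
    done
    return "$rc"
}
```

Run as a user who can read the secrets file, `missing_secrets /opt/smokeping-2.6.9/etc/config /opt/smokeping-2.6.9/etc/smokeping_secrets.dist` lists any slave that would fail authentication. If a slave's hostname cannot be made to match, smokeping's --slave-name option can be used on the slave to override the name it reports.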

 

Master configuration file for reference

cat /opt/smokeping-2.6.9/etc/config

# This Smokeping example configuration file was automatically generated.
#
# Everything up to the Probes section is derived from a common template file.
# See the Probes and Targets sections for the actual example.
#
# This example is included in the smokeping_examples document. 

*** General *** 

owner    = lxcong
contact  = lxcong@osc.net
mailhost = 127.0.0.1
sendmail = /usr/sbin/sendmail
# NOTE: do not put the Image Cache below cgi-bin
# since all files under cgi-bin will be executed ... this is not
# good for images.
imgcache = /opt/smokeping-2.6.9/cache
imgurl   = cache
datadir  = /opt/smokeping-2.6.9/data
piddir  = /opt/smokeping-2.6.9/var
cgiurl   = http://yoururl/smokeping.cgi
smokemail = /opt/smokeping-2.6.9/etc/smokemail.dist
tmail = /opt/smokeping-2.6.9/etc/tmail.dist
# specify this to get syslog logging
syslogfacility = local0
# each probe is now run in its own process
# disable this to revert to the old behaviour
# concurrentprobes = no
*** Alerts ***
to = lxcong@osc.net
from = ×××@smokeping.cn 

+someloss
type = loss
# in percent
pattern = >0%,*12*,>0%,*12*,>0%
comment = loss 3 times  in a row 

+rttbad
type = rtt
# in milliseconds
pattern = ==S,>50,>50
comment = route 

+rttdetect
type = rtt
# in milliseconds
pattern = <10,<10,<10,<10,<10,<100,>100,>100,>100
comment = routing messed up again ? 

*** Database *** 

step     = 60
pings    = 20 

# consfn mrhb steps total 

AVERAGE  0.5   1  1008
AVERAGE  0.5  12  4320
    MIN  0.5  12  4320
    MAX  0.5  12  4320
AVERAGE  0.5 144   720
    MAX  0.5 144   720
    MIN  0.5 144   720 

*** Presentation *** 

template = /opt/smokeping-2.6.9/etc/basepage.html.dist
charset = utf-8
+ charts 

menu = Charts
title = The most interesting destinations 

++ stddev
sorter = StdDev(entries=>4)
title = Top Standard Deviation
menu = Std Deviation
format = Standard Deviation %f 

++ max
sorter = Max(entries=>5)
title = Top Max Roundtrip Time
menu = by Max
format = Max Roundtrip Time %f seconds 

++ loss
sorter = Loss(entries=>5)
title = Top Packet Loss
menu = Loss
format = Packets Lost %f 

++ median
sorter = Median(entries=>5)
title = Top Median Roundtrip Time
menu = by Median
format = Median RTT %f seconds 

+ overview 

width = 600
height = 50
range = 10h 

+ detail 

width = 600
height = 200
unison_tolerance = 2 

"Last 3 Hours"    3h
"Last 30 Hours"   30h
"Last 10 Days"    10d
"Last 400 Days"   400d 

#+ hierarchies
#++ owner
#title = Host Owner
#++ location
#title = Location 

# (The actual example starts here.) 

*** Probes *** 

# Here we have just one probe, fping, pinging four hosts. 
# 
# The fping probe is using the default parameters, some of them supplied
# from the Database section ("step" and "pings"), and some of them by
# the probe module. 

+FPing
binary = /usr/sbin/fping 

*** Targets *** 

# The hosts are located in two sites of two hosts each, and the
# configuration has been divided to site sections ('+') and host subsections
# ('++') accordingly.
probe = FPing 

menu = Top
title = Network Latency Grapher
remark = Welcome to this SmokePing website. 

alerts = rttbad,someloss 

################### Dongguan ××× configuration #########################
+ DG_×××
menu = 东莞×××
title = 东莞×××
host = /DG_×××/DG_×××1/To_ZW_1~dg.vpn1 /DG_×××/DG_×××1/To_ZW_2~dg.vpn1 /DG_×××/DG_×××1/To_SD~dg.vpn1 /DG_×××/DG_×××2/To_ZW_1~dg.vpn2 /DG_×××/DG_×××2/To_ZW_2~dg.vpn2 /DG_×××/DG_×××2/To_SD~dg.vpn2 

++ DG_×××1
nomasterpoll=yes
menu = 东莞×××1
title = 东莞×××1
# this is where the slave dg.vpn1 is assigned to do the monitoring
slaves = dg.vpn1
host = /DG_×××/DG_×××1/To_ZW_1~dg.vpn1 /DG_×××/DG_×××1/To_ZW_2~dg.vpn1 /DG_×××/DG_×××1/To_SD~dg.vpn1
+++ To_ZW_1
menu = 去往兆维线路1
title = 去往兆维线路1
host = 192.168.200.1 
+++ To_ZW_2
menu = 去往兆维线路2
title = 去往兆维线路2
host = 192.168.201.1
+++ To_SD
menu = 去往上地线路
title = 去往上地线路
host = 192.168.202.1 

++ DG_×××2
nomasterpoll=yes
menu = 东莞×××2
title = 东莞×××2
slaves = dg.vpn2
host = /DG_×××/DG_×××2/To_ZW_1~dg.vpn2 /DG_×××/DG_×××2/To_ZW_2~dg.vpn2 /DG_×××/DG_×××2/To_SD~dg.vpn2
+++ To_ZW_1
menu = 去往兆维线路1
title = 去往兆维线路1
host = 192.168.200.1 
+++ To_ZW_2
menu = 去往兆维线路2
title = 去往兆维线路2
host = 192.168.201.1
+++ To_SD
menu = 去往上地线路
title = 去往上地线路
host = 192.168.202.1 

*** Slaves ***
secrets=/opt/smokeping-2.6.9/etc/smokeping_secrets.dist 

+ dg.vpn1
display_name=dg.vpn1
location=China
color=000fff 

+ dg.vpn2
display_name=dg.vpn2
location=China
color=000ff1 

++override
Probes.FPing.binary = /usr/sbin/fping

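Chinese support in smokeping pages and rrd graphs

Chinese is not supported by default, but the change is simple.

1. Chinese in web pages

Add the following under *** Presentation *** in the configuration file

charset = utf-8

2. Chinese in rrd graphs

yum -y install wqy-zenhei-fonts.noarch

Edit the module /opt/smokeping-2.6.9/lib/Smokeping/Graphs.pm and insert the line '--font TITLE:20:"WenQuanYi Zen Hei Mono"', as follows: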

            my ($graphret,$xs,$ys) = RRDs::graph
            ("dummy",
            '--start', $tasks[0][1],
            '--end', $tasks[0][2],
            '--font TITLE:20:"WenQuanYi Zen Hei Mono"',
            "DEF:maxping=$cfg->{General}{datadir}${host}.rrd:median:AVERAGE",
            'PRINT:maxping:MAX:%le' );
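If the graph titles still come out as boxes, first make sure fontconfig can see the font at all. This is a minimal sketch that assumes the fc-list tool from fontconfig is installed; has_font is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a font family containing "$1"
# (case-insensitive) is known to fontconfig.
# Returns 2 when fc-list itself is not available.
has_font() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not available" >&2
        return 2
    fi
    fc-list : family | grep -qiF -- "$1"
}
```

For example, `has_font "WenQuanYi Zen Hei" && echo "font found"` should print the message after the yum install above.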