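HTTP services are among the most widely used applications on the internet today, so monitoring their health is critical for system operations.
Zabbix has built-in web monitoring that tracks metrics such as Download Speed, Response Time and Response Code, but it is tedious and complicated to configure. The approach below instead uses Python's pycurl module to collect the HTTP response time, download speed, status code and document size, and then reports them to Zabbix through trapper items.
With Zabbix trapper monitoring, the client collects the data itself and pushes it to the Zabbix server or proxy with zabbix_sender. Each record consists of the host name (as configured in Zabbix), the item key and the value. zabbix_sender is used as follows: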
[root@monitor]# /usr/local/zabbix/bin/zabbix_sender -help
Zabbix Sender v2.2.3 (revision 44105) (7 April 2014)

usage: zabbix_sender [-Vhv] {[-zpsI] -ko | [-zpI] -T -i <file> -r} [-c <file>]

Options:
  -c --config <file>                 Absolute path to the configuration file

  -z --zabbix-server <server>        Hostname or IP address of Zabbix server
  -p --port <server port>            Specify port number of server trapper running on the server. Default is 10051
  -s --host <hostname>               Specify host name. Host IP address and DNS name will not work
  -I --source-address <IP address>   Specify source IP address

  -k --key <key>                     Specify item key
  -o --value <key value>             Specify value

  -i --input-file <input file>       Load values from input file. Specify - for standard input
                                     Each line of file contains whitespace delimited: <hostname> <key> <value>
                                     Specify - in <hostname> to use hostname from configuration file or --host argument
  -T --with-timestamps               Each line of file contains whitespace delimited: <hostname> <key> <timestamp> <value>
                                     This can be used with --input-file option
                                     Timestamp should be specified in Unix timestamp format
  -r --real-time                     Send metrics one by one as soon as they are received
                                     This can be used when reading from standard input

  -v --verbose                       Verbose mode, -vv for more details

Other options:
  -h --help                          Give this help
  -V --version                       Display version number
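As a rough illustration of the input-file mode described in the help text, the sketch below formats one `<hostname> <key> <value>` record and assembles the `-z ... -i ...` argument list. The function names are hypothetical and only serve this example; the host name, server address and paths are the ones used elsewhere in this post.

```python
def format_sender_line(host, key, value):
    """Return one whitespace-delimited '<hostname> <key> <value>' record."""
    return "%s %s %s" % (host, key, value)


def build_sender_command(sender, server, input_file):
    """Return the argument list for: zabbix_sender -z <server> -i <file>."""
    return [sender, "-z", server, "-i", input_file]


if __name__ == "__main__":
    # One record for the response time (in ms) of a single site.
    print(format_sender_line("monitor", "HTTP_ResTime[www.zmzblog.com]", 123.45))
    # The command that would push a file full of such records.
    print(" ".join(build_sender_command("/usr/local/zabbix/bin/zabbix_sender",
                                        "192.168.100.200",
                                        "/tmp/HTTP_Response.log")))
```

Returning the command as a list (rather than one string) makes it easy to hand to `subprocess` without shell quoting problems, which is a design choice of this sketch and not something the original script does.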
Below is the monitoring script I wrote in Python. To monitor more websites, just add their URLs to the list near the top of the script.
[root@monitor cron]# cat Check_HTTP_Response_Time.py
#!/usr/bin/env python
#coding=utf-8
#Auth:david

import os
import logging

import pycurl

# Host name of this machine as configured in Zabbix.
hostname = "monitor"
# IP of the Zabbix server or proxy the data is sent to.
zabbix_server = "192.168.100.200"
zabbix_sender = "/usr/local/zabbix/bin/zabbix_sender"
# To monitor another website, add its URL to this list.
urls = ['www.zmzblog.com', 'img.zmzblog.com']
# Zabbix item keys, in the order: size, time, status code, speed.
key = ['HTTP_ResSize', 'HTTP_ResTime', 'HTTP_ResCode', 'HTTP_ResSpeed']
# Data file holding "<hostname> <key> <value>" lines for zabbix_sender -i.
log_file = "/tmp/HTTP_Response.log"
# format='%(message)s' keeps the 'INFO:root:' prefix out of the data file.
logging.basicConfig(filename=log_file, level=logging.INFO, filemode='w',
                    format='%(message)s')
run_cmd = "%s -z %s -i %s > /tmp/HTTP_Response.temp" % (zabbix_sender, zabbix_server, log_file)


class Test():
    """Collects the response body that pycurl hands to the write callback."""
    def __init__(self):
        self.contents = b''

    def body_callback(self, buf):
        self.contents = self.contents + buf


def site_name(url):
    """Return the host part of the URL; it is used as the item key parameter."""
    if '://' in url:
        url = url.split('://', 1)[1]
    return url.split('/')[0]


def Check_Http(URL):
    k = site_name(URL)
    t = Test()
    c = pycurl.Curl()
    c.setopt(pycurl.WRITEFUNCTION, t.body_callback)
    # Request gzip transfer encoding.
    #c.setopt(pycurl.ENCODING, 'gzip')
    try:
        c.setopt(pycurl.CONNECTTIMEOUT, 60)
        c.setopt(pycurl.URL, URL)
        c.perform()
    except pycurl.error:
        # Report the failing URL but still record whatever curl measured,
        # so that Zabbix receives a value for this check.
        print("URL %s" % URL)
    Http_Document_size = c.getinfo(c.SIZE_DOWNLOAD)                                # bytes
    Http_Download_speed = round((c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024), 2)      # KB/s
    Http_Total_time = round((c.getinfo(pycurl.TOTAL_TIME) * 1000), 2)              # ms
    Http_Response_code = c.getinfo(pycurl.HTTP_CODE)
    c.close()
    values = (Http_Document_size, Http_Total_time, Http_Response_code, Http_Download_speed)
    for name, value in zip(key, values):
        logging.info('%s %s[%s] %s', hostname, name, k, value)


def runCmd(command):
    for u in urls:
        Check_Http(u)
    # Flush and close the data file before zabbix_sender reads it.
    logging.shutdown()
    return os.system(command)


runCmd(run_cmd)
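The two pieces of logic in the script that are easiest to get wrong are deriving the item key parameter from a URL and converting curl's raw numbers into the reported units. The following is a minimal sketch of both as pure functions, assuming Python 3; `site_from_url` and `to_metrics` are hypothetical names introduced only for this example, and no network access or pycurl is needed to run it.

```python
from urllib.parse import urlsplit


def site_from_url(url):
    """Return the host part of a URL, whether or not it carries a scheme."""
    if "://" not in url:
        # Prefix '//' so urlsplit treats the leading part as the host.
        url = "//" + url
    return urlsplit(url).netloc


def to_metrics(size_bytes, speed_bytes_per_s, total_seconds, status):
    """Map raw curl measurements to the four item keys and their units."""
    return {
        "HTTP_ResSize": size_bytes,                           # bytes
        "HTTP_ResTime": round(total_seconds * 1000, 2),       # milliseconds
        "HTTP_ResCode": status,                               # HTTP status code
        "HTTP_ResSpeed": round(speed_bytes_per_s / 1024, 2),  # KB/s
    }


if __name__ == "__main__":
    site = site_from_url("https://img.zmzblog.com/a/b.png")
    for key, value in to_metrics(2048, 10240.0, 0.25, 200).items():
        print("monitor %s[%s] %s" % (key, site, value))
```

Note that `netloc` keeps an explicit port (for example `host:8080`), so a URL with a port would produce a key parameter that includes it.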
Add a crontab entry so the data is collected and sent to the Zabbix server at a regular interval (every five minutes here).
*/5 * * * * /zabbix/python/cron/Check_HTTP_Response_Time.py
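Next, configure the items in the Zabbix frontend; the Zabbix API can be used to add them in bulk. The following uses www.zmzblog.com as the example. All items are of type Zabbix trapper. The item keys HTTP_ResTime[www.zmzblog.com], HTTP_ResCode[www.zmzblog.com], HTTP_ResSize[www.zmzblog.com] and HTTP_ResSpeed[www.zmzblog.com] hold the HTTP response time (ms), status code, document size (bytes) and download speed (KB/s) respectively.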
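For the bulk route, a request body for the Zabbix API `item.create` method could be built roughly as follows. This is only a sketch under several assumptions: the host (or template) ID and the auth token are placeholders, the item names are made up, all four items are stored as numeric float (`value_type` 0), and the token is passed in the `auth` field of the JSON-RPC request. Required properties and the authentication mechanism differ between Zabbix versions, so check the API documentation for the version in use. The sketch only builds the payload and does not send it.

```python
import json

# Item key prefix -> human-readable label (labels are illustrative).
METRICS = [
    ("HTTP_ResSize", "HTTP document size"),
    ("HTTP_ResTime", "HTTP response time"),
    ("HTTP_ResCode", "HTTP response code"),
    ("HTTP_ResSpeed", "HTTP download speed"),
]

ITEM_TYPE_TRAPPER = 2   # Zabbix trapper
VALUE_TYPE_FLOAT = 0    # numeric (float); an assumption made for all four items


def item_create_request(hostid, site, auth_token, request_id=1):
    """Build a JSON-RPC payload that creates the four trapper items for one site."""
    items = [
        {
            "name": "%s %s" % (label, site),
            "key_": "%s[%s]" % (key, site),
            "hostid": hostid,
            "type": ITEM_TYPE_TRAPPER,
            "value_type": VALUE_TYPE_FLOAT,
        }
        for key, label in METRICS
    ]
    return {
        "jsonrpc": "2.0",
        "method": "item.create",
        "params": items,
        "auth": auth_token,
        "id": request_id,
    }


if __name__ == "__main__":
    # "10105" and "<auth-token>" are placeholders, not real values.
    payload = item_create_request("10105", "www.zmzblog.com", "<auth-token>")
    print(json.dumps(payload, indent=2))
```

Calling the function once per monitored site yields one request per site, which keeps a failure for one site from affecting the others.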
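With the items in place, configure the triggers. Website response times are normally well under a second, so raise an alert when the response time exceeds 1000 ms.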
The graphs below show the HTTP response time and the status code; download speed and document size are not shown.
HTTP response status code.
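Summary: monitoring web application performance comes down to the following points.
1) HTTP response time. As the internet has matured, users expect more; pages must open quickly, with response times in the millisecond range.
2) HTTP status code. Watch in real time whether the response code is normal and whether errors such as 404 or 500 appear. Users will not tolerate these errors, so fix them as soon as they occur.
3) To reduce false alarms caused by transient network or other problems, use the triggers below, which fire only when the status code is not 200, or is greater than or equal to 400, on two consecutive checks.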
{Template HTTP Response:HTTP_ResCode[www.zmzblog.com].count(#2,200,"ne")}=2
{Template HTTP Response:HTTP_ResCode[www.zmzblog.com].count(#2,400,"ge")}=2
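To make the logic of these two expressions concrete, here is a small Python model of what `count(#2,<value>,<operator>)=2` checks: both of the last two collected values must satisfy the comparison. This is an illustration only, not Zabbix code; `count_last`, `alert_not_200` and `alert_ge_400` are hypothetical names.

```python
import operator

# Comparison operators as named in the trigger expressions.
OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def count_last(values, n, pattern, op):
    """Count how many of the last n values satisfy 'value <op> pattern'."""
    return sum(1 for v in values[-n:] if OPS[op](v, pattern))


def alert_not_200(codes):
    """Model of count(#2,200,"ne")=2: the last two codes are both not 200."""
    return count_last(codes, 2, 200, "ne") == 2


def alert_ge_400(codes):
    """Model of count(#2,400,"ge")=2: the last two codes are both >= 400."""
    return count_last(codes, 2, 400, "ge") == 2


if __name__ == "__main__":
    history = [200, 200, 500, 200]   # a single failed check: no alert
    print(alert_not_200(history), alert_ge_400(history))
    history = [200, 200, 404, 500]   # two failed checks in a row: alert
    print(alert_not_200(history), alert_ge_400(history))
```

A single bad sample between two good ones never reaches a count of 2, which is what suppresses one-off network glitches.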