Nagios
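1.1 Introduction to Nagios
Nagios is a monitoring system that tracks system status and network information. It can monitor specified local or remote hosts and services, and it sends notifications when something goes wrong. [1]
Nagios runs on Linux/Unix platforms and offers an optional browser-based web interface that lets system administrators view network status, system problems, logs, and so on.
Nagios provides the following features:
1. Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.);
2. Monitoring of host resources (processor load, disk usage, etc.);
3. A simple plugin design that lets users easily add their own service checks;
4. Parallelized service checks;
5. The ability to define a network hierarchy using "parent" hosts to express the relationships between hosts, which makes it possible to distinguish hosts that are down from hosts that are unreachable;
6. Notifications sent to contacts when service or host problems occur and when they are resolved (via e-mail, SMS, or a user-defined method);
7. Event handlers that can run when a service or host fails, so that problems can be handled proactively;
8. Automatic log rotation;
9. Support for redundant monitoring of hosts;
10. An optional web interface for viewing current network status, notification and problem history, log files, etc.; [1]
11. Monitoring information can be viewed from a mobile phone.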
System installation
1.2 Environment setup
Synchronize the time:
crontab -e
*/5 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1
Stop the firewall:
/etc/init.d/iptables stop
Disable SELinux (getenforce should report Disabled):
[root@olwang-2 etc]# getenforce
Disabled
1.3 Server-side installation:
Create the user and group:
# useradd -s /sbin/nologin nagios
# mkdir /usr/local/nagios
# chown -R nagios.nagios /usr/local/nagios
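Create the nagcmd group and add the nagios and apache users to it so that both have the required permissions. The -a flag appends the group; without it, -G replaces the user's existing supplementary groups. The apache user exists only once httpd is installed, so if usermod reports that the user does not exist, run that command again after the LAMP step:
# groupadd nagcmd
# usermod -a -G nagcmd nagios
# usermod -a -G nagcmd apache
Install the LAMP stack:
yum -y install httpd mysql-server perl-DBI perl-DBD-MySQL php php-devel php-mysql php-snmp php-pdo php-gd lm_sensors net-snmp net-snmp-libs net-snmp-utils net-snmp-devel
Install the dependency libraries:
yum install gcc glibc glibc-common -y
yum install gd gd-devel -y
yum install mysql-server -y
yum install httpd php php-gd -y
Install Nagios: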
tar xf nagios.tar.gz
cd nagios
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-commandmode
make install-config
make install-webconf
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Once the steps above are complete, Nagios can be viewed in the web interface.
1.3.1 Installing plugins
1.3.1.1 Installing nagios-plugins
cd nagios-plugins-1.4.16
./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules --with-mysql
make
make install
Check the plugins:
ls /usr/local/nagios/libexec/ | wc -l
59
1.3.1.2 Installing the NRPE plugin
NRPE is a client-side plugin; because the server itself is also monitored, install it on this machine as well.
tar xf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
ls /usr/local/nagios/libexec/check_nrpe
ls /usr/local/nagios/libexec/ | wc -l
60
1.3.1.3 Installing other plugins
tar xf Class-Accessor-0.31.tar.gz
cd Class-Accessor-0.31
perl Makefile.PL
make
make install
cd ..
#
tar xf Config-Tiny-2.12.tar.gz
cd Config-Tiny-2.12
perl Makefile.PL
make
make install
cd ..
###
tar xf Math-Calc-Units-1.07.tar.gz
cd Math-Calc-Units-1.07
perl Makefile.PL
make
make install
cd ..
#
tar xf Nagios-Plugin-0.34.tar.gz
cd Nagios-Plugin-0.34
perl Makefile.PL
make
make install
cd ..
#################
tar xf Params-Validate-0.91.tar.gz
cd Params-Validate-0.91
perl Makefile.PL
make
make install
cd ..
####
tar xf Regexp-Common-2010010201.tar.gz
cd Regexp-Common-2010010201
perl Makefile.PL
make
make install
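The six Perl module builds above all follow the same steps (unpack, perl Makefile.PL, make, make install), so they can be expressed as one loop. The following is only a sketch: it assumes every tarball sits in the current directory and unpacks into a directory named after the archive, and the names build_perl_modules, run, and DRY_RUN are illustrative helpers, not part of Nagios or the plugins.

```shell
#!/bin/sh
# Build and install each Perl module tarball in turn.
# Assumption: every archive NAME.tar.gz unpacks into a directory NAME.
# Set DRY_RUN=1 to print the commands instead of running them.
MODULES="Class-Accessor-0.31 Config-Tiny-2.12 Math-Calc-Units-1.07 Nagios-Plugin-0.34 Params-Validate-0.91 Regexp-Common-2010010201"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_perl_modules() {
    for m in $MODULES; do
        run tar xf "$m.tar.gz" || return 1
        # The subshell keeps the cd local, so no "cd .." is needed afterwards.
        (
            run cd "$m" &&
            run perl Makefile.PL &&
            run make &&
            run make install
        ) || return 1
    done
}
```

Running `DRY_RUN=1 build_perl_modules` first prints the command sequence for review; calling `build_perl_modules` without it performs the builds and stops at the first failure.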
1.3.2 Configuring and starting the Nagios service
chkconfig nagios on
/etc/init.d/nagios start
echo "/etc/init.d/nagios start" >> /etc/rc.local
Verify the configuration file:
[root@olwang-2 nrpe-2.12]# /etc/init.d/nagios checkconfig
Running configuration check... OK.
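Because a broken configuration prevents Nagios from starting, it can help to run the verification automatically before every restart. The sketch below combines the `nagios -v` check shown earlier with the init script; the function name safe_restart and the three overridable variables are assumptions made for illustration.

```shell
#!/bin/sh
# Restart Nagios only if the configuration passes verification.
# The paths default to the locations used in this guide.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

safe_restart() {
    # "nagios -v" exits non-zero when the configuration contains errors.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        "$NAGIOS_INIT" restart
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

When the check fails, rerunning `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` by hand shows the detailed error messages.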
1.4 Client-side installation:
1.4.1 Preparing the environment
Synchronize the time:
crontab -e
*/5 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1
Stop the firewall:
/etc/init.d/iptables stop
Disable SELinux (getenforce should report Disabled):
[root@olwang-2 etc]# getenforce
Disabled
Create the user:
useradd nagios -M -s /sbin/nologin
Install the dependency libraries:
yum install perl-devel perl-CPAN openssl-devel -y
1.4.2 Installing plugins
1.4.2.1 Installing nagios-plugins
tar xf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules --with-mysql
make
make install
cd ..
Check the plugins:
[root@olwang-2 ~]# ls /usr/local/nagios/libexec/ | wc -l
62
1.4.2.2 Installing NRPE
tar xf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure
make all
make install-daemon
make install-daemon-config
make install-plugin
1.4.2.3 Installing Class-Accessor and the other Perl modules
tar xf Class-Accessor-0.31.tar.gz
cd Class-Accessor-0.31
perl Makefile.PL
make
make install
cd ..
#
tar xf Config-Tiny-2.12.tar.gz
cd Config-Tiny-2.12
perl Makefile.PL
make
make install
cd ..
###
tar xf Math-Calc-Units-1.07.tar.gz
cd Math-Calc-Units-1.07
perl Makefile.PL
make
make install
cd ..
#
tar xf Nagios-Plugin-0.34.tar.gz
cd Nagios-Plugin-0.34
perl Makefile.PL
make
make install
cd ..
#################
tar xf Params-Validate-0.91.tar.gz
cd Params-Validate-0.91
perl Makefile.PL
make
make install
cd ..
####
tar xf Regexp-Common-2010010201.tar.gz
cd Regexp-Common-2010010201
perl Makefile.PL
make
make install
1.4.3 Creating the web login password
htpasswd -bc /usr/local/nagios/etc/htpasswd.users username password
1.4.4 Client NRPE init script:
chmod +x /etc/init.d/nrpe
[root@olwang-2 etc]# cat /etc/init.d/nrpe
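#!/bin/sh
# Minimal init script for the NRPE daemon.
NRPE_BIN=/usr/local/nagios/bin/nrpe
NRPE_CFG=/usr/local/nagios/etc/nrpe.cfg

Usage(){
    echo "pls input (start|stop|restart)"
}

start_nrpe(){
    # -d runs NRPE as a standalone daemon.
    "$NRPE_BIN" -c "$NRPE_CFG" -d
}

stop_nrpe(){
    # Match on the full binary path so this script (also named nrpe) is not killed.
    pkill -f "$NRPE_BIN"
}

case "$1" in
start)
    start_nrpe
    ;;
stop)
    stop_nrpe
    ;;
restart)
    stop_nrpe
    start_nrpe
    ;;
*)
    Usage
    exit 1
    ;;
esac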
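1.5 Server-side configuration files
Only the most important files in this directory are listed below; the rest are omitted.
[root@olwang ngios]# tree /usr/local/nagios/etc/
|--cgi.cfg user permission settings
|--htpasswd.users stores user names and passwords
|--nagios.cfg
|--nrpe.cfg defines the commands for the NRPE module and the server IP allowed to connect
|--objects object definitions directory
| |-- commands.cfg command templates
| |-- contacts.cfg contact (e-mail) settings
| |-- hosts.cfg client host definitions
| |-- services
| |-- templates.cfg templates
Edit /usr/local/nagios/etc/nagios.cfg
Add the following lines: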
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
cfg_dir=/usr/local/nagios/etc/objects/services
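Comment out the line that monitors the local machine (in a default install this is normally the cfg_file entry for objects/localhost.cfg).
Edit the configuration file cgi.cfg
This change fixes the problem of services not being displayed in the Nagios web interface.
[root@olwang-2 etc]# sed -i 's#nagiosadmin#oldboy#g' cgi.cfg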
[root@olwang-2 etc]# grep oldboy cgi.cfg
authorized_for_system_information=oldboy
authorized_for_configuration_information=oldboy
authorized_for_system_commands=oldboy
authorized_for_all_services=oldboy
authorized_for_all_hosts=oldboy
authorized_for_all_service_commands=oldboy
authorized_for_all_host_commands=oldboy
Create the following files and set their ownership:
cd objects/
head -51 localhost.cfg >hosts.cfg
chown nagios.nagios hosts.cfg
touch services.cfg
chown nagios.nagios services.cfg
mkdir services
chown nagios.nagios services
Configuration file hosts.cfg
# Define a host for each monitored machine
define host{
        use                     linux-server
        host_name               olwang-1
        alias                   nagios-client-1
        address                 192.168.5.130
        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
}
define host{
        use                     linux-server
        host_name               olwang
        alias                   nagios-server
        address                 192.168.5.129
        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
}
define host{
        use                     linux-server
        host_name               olwang-2
        alias                   nagios-client-2
        address                 192.168.5.131
        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
}
#
# HOST GROUP DEFINITION
define hostgroup{
        hostgroup_name          linux-servers ;The name of the hostgroup
        alias                   Linux Servers ;Long name of the group
        members                 olwang,olwang-1,olwang-2
}
Add the monitoring command templates
Configuration file commands.cfg
#'check_nrpe'
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c $ARG1$ -t 30
}
#'check_mem'
define command{
        command_name    check_mem
        command_line    $USER1$/check_mem -w $ARG1$ -c $ARG2$
}
#'check_iostat'
define command{
        command_name    check_iostat
        command_line    $USER1$/check_iostat -w $ARG1$ -c $ARG2$
}
#'check_weburl'
define command{
        command_name    check_weburl
        command_line    $USER1$/check_http $ARG1$ -w 10 -c 30
}
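The services.cfg file created earlier is left empty in these notes, although the error log in section 1.7 refers to services such as Load, Disk Partition, Swap Usage, and MEM Usage. As a rough sketch of how the check_nrpe command above could be tied to service definitions, the following shell function writes one service per NRPE command. The template name generic-service (taken from the stock templates.cfg), the function name write_services_cfg, and the pairing of descriptions with commands are assumptions, not the original configuration.

```shell
#!/bin/sh
# Write a minimal services.cfg that runs each NRPE command on every host
# in the linux-servers hostgroup.
# Assumptions: "generic-service" exists in templates.cfg, and the service
# descriptions below are illustrative.
write_services_cfg() {
    out=$1
    : > "$out"
    # Each input line is "<nrpe command> <service description>".
    while read -r nrpe_cmd desc; do
        cat >> "$out" <<EOF
define service{
        use                     generic-service
        hostgroup_name          linux-servers
        service_description     $desc
        check_command           check_nrpe!$nrpe_cmd
}
EOF
    done <<LIST
check_load Load
check_disk Disk Partition
check_swap Swap Usage
check_mem MEM Usage
check_iostat Disk Iostat
LIST
}
```

A typical call would be `write_services_cfg /usr/local/nagios/etc/objects/services.cfg`, followed by the configuration check before restarting Nagios. The part after the `!` in check_command becomes `$ARG1$` in the check_nrpe command definition.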
1.6 Client-side configuration changes
Configuration file nrpe.cfg
Set the server's IP address:
77 # NOTE: This option is ignored if NRPE is running under either inetd or xinetd
78
79 allowed_hosts=192.168.5.129
Comment out lines 199-203 and add lines 205-209 (these monitor host performance):
199 #command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
200 #command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
201 #command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
202 #command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
203 #command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
204
205 command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
206 command[check_disk]=/usr/local/nagios/libexec/check_disk -w 15% -c 7% -p /
207 command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
208 command[check_iostat]=/usr/local/nagios/libexec/check_iostat -w 6 -c 10
209 command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -w 10% -c 3%
Start the client daemon:
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
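Once the daemon is running, the commands defined in nrpe.cfg can be exercised from the server with check_nrpe before relying on the web interface. The sketch below loops over the two client addresses from hosts.cfg and the five commands above; the function name probe_nrpe and the CHECK_NRPE override variable are illustrative assumptions.

```shell
#!/bin/sh
# Query every NRPE command on every client and report the plugin exit status.
# Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}
CLIENTS="192.168.5.130 192.168.5.131"
COMMANDS="check_load check_disk check_swap check_iostat check_mem"

probe_nrpe() {
    failed=0
    for host in $CLIENTS; do
        for cmd in $COMMANDS; do
            output=$("$CHECK_NRPE" -H "$host" -c "$cmd" -t 30)
            status=$?
            echo "$host $cmd exit=$status $output"
            # Remember any non-OK result, but keep probing the rest.
            [ "$status" -eq 0 ] || failed=1
        done
    done
    return "$failed"
}
```

Run `probe_nrpe` on the Nagios server. A connection error for every command on one host usually points at allowed_hosts or the firewall on that client, while a failure of a single command points at that command's line in nrpe.cfg.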
1.7 Errors
Error log:
[1477273522] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Disk Partition;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273552] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Disk Iostat;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273602] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Iostat;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273652] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Disk Partition;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273702] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Load;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273752] SERVICE NOTIFICATION: nagiosadmin;olwang-1;MEM Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273802] SERVICE NOTIFICATION: nagiosadmin;olwang-2;MEM Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273852] SERVICE ALERT: olwang-1;Ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 1.95 ms
[1477273952] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Swap Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477274002] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Swap Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477274052] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Current Load;UNKNOWN;notify-service-by-email;Invalid host name -c
Solution:
This is a host name resolution problem.
Edit the file /etc/hosts.
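The notes do not record which entries were added to /etc/hosts. One plausible fix, sketched below, is to map each monitored host name to the address given for it in hosts.cfg. The function name add_host_entries is an assumption, and the function takes the file path as an argument so it can be tried on a copy first; names that are already present are skipped.

```shell
#!/bin/sh
# Append name-to-address mappings to a hosts file, skipping names already present.
# The names and addresses are the ones used in hosts.cfg in this guide.
add_host_entries() {
    hosts_file=$1
    while read -r ip name; do
        # Look for the name as a whole field on a non-comment line.
        if ! awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$hosts_file"; then
            printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
        fi
    done <<EOF
192.168.5.129 olwang
192.168.5.130 olwang-1
192.168.5.131 olwang-2
EOF
}
```

For example, `cp /etc/hosts /tmp/hosts.test && add_host_entries /tmp/hosts.test` shows the result without touching the live file; `add_host_entries /etc/hosts` applies it as root.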
Problem 2:
[root@olwang-2 ~]# /usr/local/nagios/libexec/check_memory
-bash: /usr/local/nagios/libexec/check_memory: /usr/bin/perl^M: bad interpreter: No such file or directory
Summary of the problem:
When running Perl scripts on *nix systems, you may occasionally hit the following error:
/usr/bin/perl^M: bad interpreter: No such file or directory
The most common cause is that the script was edited on Windows.
Windows uses \r\n as its line ending, while Unix uses only \n. Removing the \r fixes the problem.
Solution (remove the carriage returns with sed):
sed -i 's/\r$//' /usr/local/nagios/libexec/check_memory
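If several plugins were copied from a Windows machine, the same fix can be applied to a whole directory. The sketch below scans a directory for scripts whose shebang line ends in a carriage return and strips the \r from those files only. It assumes GNU sed (for -i and the \r escape), and the function name fix_crlf_scripts is illustrative.

```shell
#!/bin/sh
# Strip Windows line endings from every script in a directory whose
# shebang line ends in a carriage return. Requires GNU sed for -i and \r.
fix_crlf_scripts() {
    dir=$1
    cr=$(printf '\r')
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Only touch text scripts: the first line must start with "#!" and end in CR.
        if head -n 1 "$f" | grep -q "^#!.*$cr\$"; then
            sed -i 's/\r$//' "$f"
            echo "fixed: $f"
        fi
    done
}
```

Calling `fix_crlf_scripts /usr/local/nagios/libexec` prints one line per repaired script and leaves compiled plugins and clean scripts untouched.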
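Problem 3:
Solution:
When this problem occurs, first check that openssl and openssl-devel are installed.
If they are, then check the client configuration file /usr/local/nagios/etc/nrpe.cfg.
Once both of these have been addressed, the error is resolved.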