Installation
server 192.168.12.145
client 192.168.12.144
Server-side installation
nagios-3.3.1.tar.gz
nagios-plugins-1.4.15
Install the build prerequisites
yum install httpd gcc gcc-c++ openssl openssl-devel perl
Add the nagios user and groups
useradd nagios ## if -s is set to nologin, restarting nagios may report an error, but functionality is not affected
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagcmd apache
Install Nagios
tar -xf nagios-3.3.1.tar.gz -C /usr/local/src
cd /usr/local/src/nagios-3.3.1
./configure --with-command-group=nagcmd
make all
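make install ## install the main program
make install-init ## install the init script
make install-config ## create the sample configuration files
make install-commandmode ## set permissions on the external command directory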
make install-webconf
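Define contacts
vi /usr/local/nagios/etc/objects/contacts.cfg
Define the contacts and contact groups
Create a user for logging in to Nagios through the web interface
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache
chown -R nagios:nagios /usr/local/nagios/etc/htpasswd.users
If the ownership or permissions are wrong, the web server may be unable to open this file and login will fail
Edit the httpd.conf configuration file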
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
# Location of the user authentication file created earlier
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>
Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>
Restart Apache
Install nagios-plugins-1.4.15
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
make && make install
Verify that the configuration file has no errors
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
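Disable SELinux
Open http://ip/nagios in a browser and log in with the account and password created by the htpasswd command
Note: check that PHP is installed and that Apache has loaded the PHP module
Errors: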
apache access.log
10.13.115.1 - nagiosadmin [26/Jun/2011:13:01:10 +0800] "GET /nagios/side.php HTTP/1.1" 304
10.13.115.1 - nagiosadmin [26/Jun/2011:13:01:10 +0800] "GET /nagios/images/sblogo.png HTTP/1.1" 304 -
10.13.115.1 - nagiosadmin [26/Jun/2011:13:01:10 +0800] "GET /nagios/stylesheets/common.css HTTP/1.1" 304
10.13.115.1 - nagiosadmin [26/Jun/2011:13:01:10 +0800] "GET /nagios/images/greendot.gif HTTP/1.1" 304
apache error.log
[notice] Apache/2.2.19 (Unix) DAV/2 configured -- resuming normal operations
[error] [client xx.xx.115.1] Directory index forbidden by Options directive: /usr/local/nagios/share
[error] [client xx.xx.115.1] Directory index forbidden by Options directive: /usr/local/nagios/share
[error] [client xx.xx.115.1] Directory index forbidden by Options directive: /usr/local/nagios/share
[error] [client xx.xx.115.1] Directory index forbidden by Options directive: /usr/local/nagios/share
[error] [client xx.xx.115.1] Directory index forbidden by Options directive: /usr/local/nagios/share
Cause:
Apache cannot parse PHP pages
Solution:
Modify the Apache configuration file httpd.conf
Step 1:
<IfModule dir_module>
DirectoryIndex index.html
</IfModule>
Change it to
<IfModule dir_module>
DirectoryIndex index.php index.html
</IfModule>
Step 2:
Add the following lines, and remember to install PHP
LoadModule php5_module modules/libphp5.so
AddType application/x-httpd-php .php .phtml
AddType application/x-httpd-php-source .phps
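Problem 2
Login fails after entering the account and password
Check the user authentication section added to the Apache configuration file
Check the user password file htpasswd.users under /usr/local/nagios/etc/
The file name referenced in the Apache configuration did not match the name of the user file in the Nagios directory; changing either one so that they match fixes the login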
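A mismatch like this can be spotted with a small check. The following is only a sketch: the function name check_auth_files is made up for illustration, and it assumes the paths contain no spaces. It prints every AuthUserFile path referenced in a given Apache configuration file and reports whether that file exists.

```shell
# check_auth_files CONF
# Hypothetical helper: list each AuthUserFile path found in an Apache
# configuration file and report whether the file exists on disk.
check_auth_files() {
    conf=$1
    # Take the second field of every AuthUserFile line, strip quotes,
    # and drop duplicate paths.
    awk '$1 == "AuthUserFile" { gsub(/"/, "", $2); print $2 }' "$conf" | sort -u |
    while read -r path; do
        if [ -f "$path" ]; then
            echo "OK      $path"
        else
            echo "MISSING $path"
        fi
    done
}
```

Calling it with the path of the httpd.conf in use should show a MISSING line if the file created by htpasswd has a different name from the one Apache is told to read.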
Problem 3
[Tue Jan 27 01:51:51 2015] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
[Tue Jan 27 01:51:51 2015] [error] [client 192.168.12.27] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:51:51 2015] [error] [client 192.168.12.27] Premature end of script headers: status.cgi, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:51:55 2015] [error] [client 192.168.12.27] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:51:55 2015] [error] [client 192.168.12.27] Premature end of script headers: status.cgi, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:52:07 2015] [error] [client 192.168.12.27] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:52:07 2015] [error] [client 192.168.12.27] Premature end of script headers: status.cgi, referer: http://192.168.12.149/nagios/side.php
[Tue Jan 27 01:52:08 2015] [error] [client 192.168.12.27] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://192.168.12.149/nagios/side.php
This problem prevents the Nagios pages from being displayed; the cause is that SELinux was not disabled
Problem 4
Jan 27 02:30:42 localhost nagios: Warning: Return code of 127 for check of service 'LOAD' on host '192.168.12.145' was out of bounds. Make sure the plugin you're trying to run actually exists.
Jan 27 02:31:42 localhost nagios: Warning: Return code of 127 for check of service 'LOAD' on host '192.168.12.145' was out of bounds. Make sure the plugin you're trying to run actually exists.
Services that are monitored through check_nrpe need check_nrpe in front of the check command, as follows
check_command check_nrpe!check_nrpe_load
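Exit status 127 is what a shell reports when the command it was asked to run cannot be found, which is why the warning suggests checking that the plugin exists. Plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). A minimal sketch, with a made-up function name plugin_state, that runs a command line through sh and labels the resulting status:

```shell
# plugin_state COMMAND [ARGS...]
# Hypothetical helper: run a plugin command line through sh and translate
# its exit status into the state name used for plugin results.
plugin_state() {
    sh -c "$*" >/dev/null 2>&1
    case $? in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        127) echo "out of bounds (127: command not found)" ;;
        *) echo "out of bounds" ;;
    esac
}
```

Running it with the exact command_line of the failing service is a quick way to tell a missing plugin from a plugin that runs but reports a problem.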
Install NRPE on the monitoring server
openssl and openssl-devel must be installed first, otherwise errors will occur
tar -xf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure --enable-ssl --with-ssl-lib
make all
make install-plugin
make install-daemon
make install-daemon-config
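The next command is optional on the monitoring server (NRPE only needs to be installed there), but NRPE must be running on the monitored host
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
This starts NRPE as a daemon; alternatively, run NRPE under xinetd and let xinetd start it
The firewall must not block NRPE: view the rules with service iptables status and stop the firewall with service iptables stop, or add a rule that allows port 5666
Installation on the monitored host
openssl and openssl-devel must be installed, otherwise errors will occur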
tar -xf nagios-plugins-1.4.13.tar.gz
useradd nagios
cd nagios-plugins-1.4.13
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
chown -R nagios:nagios /usr/local/nagios
Install NRPE
tar -xf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure --enable-ssl --with-lib
make all
make install-plugin
make install-daemon
make install-daemon-config
make install-xinetd ## run NRPE under the xinetd service (the nrpe -d command shown earlier also works)
vim /usr/local/nagios/etc/nrpe.cfg
Change allowed_hosts=127.0.0.1 to allowed_hosts=127.0.0.1,192.168.12.145 (the IP of the monitoring server)
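The same change can be scripted. A minimal sketch, assuming the file contains a line that begins with allowed_hosts= and already lists at least one address; the function name add_allowed_host is made up for illustration, and it prints the result instead of editing the file in place:

```shell
# add_allowed_host FILE IP
# Hypothetical helper: print FILE with IP appended to the allowed_hosts
# line, unless that IP is already listed. FILE itself is not modified.
add_allowed_host() {
    awk -v ip="$2" '
        /^allowed_hosts=/ {
            # Split the value after "=" into the individual addresses.
            split(substr($0, index($0, "=") + 1), hosts, ",")
            present = 0
            for (i in hosts) if (hosts[i] == ip) present = 1
            if (!present) $0 = $0 "," ip
        }
        { print }
    ' "$1"
}
```

Redirecting the output to a new file and reviewing it before replacing nrpe.cfg keeps the original intact if something looks wrong.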
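/usr/local/nagios/libexec/check_nrpe -H localhost
If NRPE v2.12 is printed, the installation succeeded
If "connection refused by host" appears, either openssl is not installed or the firewall or SELinux is still enabled
Edit commands.cfg in the Nagios objects directory and add a definition for check_nrpe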
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
On the monitoring server, test that communication with the monitored host works
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
OK - load average: 0.05, 0.06, 0.00|load1=0.050;15.000;30.000;0; load5=0.060;10.000;25.000;0; load15=0.000;5.000;20.000;0;
Output like this means the check is working
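The part of the plugin output after the | character is performance data, written as label=value;warn;crit;min fields separated by spaces. As an illustration only, the sketch below (the function name perf_value is hypothetical) pulls the value for one label out of such a line read from standard input:

```shell
# perf_value LABEL
# Hypothetical helper: read plugin output on stdin and print the value
# recorded for LABEL in the performance data after the "|".
perf_value() {
    # Drop everything up to the "|", put each label=... item on its own
    # line, then print the field between "=" and the first ";".
    sed 's/^[^|]*|//' | tr ' ' '\n' |
    awk -F'[=;]' -v label="$1" '$1 == label { print $2 }'
}
```

For the output shown above, piping the line into perf_value load5 prints 0.060.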
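Nagios configuration files
After installation the Nagios directory contains these subdirectories: bin, etc, libexec, sbin, share, var
etc holds the configuration files
bin holds the Nagios executables
sbin holds the CGI programs that are run through the web interface
libexec holds all the plugins
var holds the log and pid files
share is the web page directory and holds the HTML files
To add a web login user other than the default nagiosadmin, append the user to authorized_for_system_commands=nagiosadmin in etc/cgi.cfg, for example: nagiosadmin,henshui. Look up which other entries need the same change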
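cgi.cfg normally contains several directives whose names begin with authorized_for_. As a rough aid for finding which of them still lack a new user, here is a sketch; the function name missing_authorizations is made up for illustration, and it assumes the values are plain comma-separated names with no spaces:

```shell
# missing_authorizations CGI_CFG USER
# Hypothetical helper: print the authorized_for_* directives in CGI_CFG
# whose comma-separated value does not include USER.
missing_authorizations() {
    awk -F= -v user="$2" '
        /^authorized_for_/ {
            n = split($2, names, ",")
            found = 0
            for (i = 1; i <= n; i++) if (names[i] == user) found = 1
            if (!found) print $1
        }
    ' "$1"
}
```

Whether the user should actually be added to each directive it prints depends on what that user is meant to see and do.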
In nagios.cfg, add entries for your own configuration files; find the cfg_file section
Set the paths according to where you store the configuration files, for example:
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
$USER1$ and the other $USERn$ entries in the configuration files are macros; they are defined in resource.cfg to stand in for long paths and arguments
$USER1$=/usr/local/nagios/libexec
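$USER7$=-C mypublic -2 (the snmp plugins need SNMP settings on the command line; a macro is defined here to save typing)
$HOSTADDRESS$ refers to the host defined below, that is, the object being checked; its value is the host's IP address
$ARG1$ is argument 1, $ARG2$ is argument 2
A host group can be named in a service definition; every host that belongs to the group is then monitored
Define the check command:
check_command check_snmp_storage!-m "^VirtualMemory$"!70!90
Here ! passes arguments: the number of ! characters is the number of arguments received, and the arguments are separated by !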
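To make the mapping from ! separators to $ARG1$, $ARG2$ and so on concrete, the sketch below splits a check_command value the same way. The function name show_args is hypothetical, and escaped \! characters are not handled.

```shell
# show_args CHECK_COMMAND
# Hypothetical helper: split a check_command value on "!" and show which
# piece is the command name and which pieces become $ARG1$, $ARG2$, ...
show_args() {
    printf '%s\n' "$1" | awk -F'!' '{
        print "command = " $1
        for (i = 2; i <= NF; i++) print "$ARG" (i - 1) "$ = " $i
    }'
}
```

Called with the check_snmp_storage example above, it lists -m "^VirtualMemory$" as $ARG1$, 70 as $ARG2$ and 90 as $ARG3$.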
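cfg_file=/usr/local/nagios/etc/objects/hosts.cfg ## new hosts.cfg file for host and host group definitions
cfg_file=/usr/local/nagios/etc/objects/services.cfg ## new services.cfg file for service and service group definitions
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg ## defines monitoring of the local machine
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg ## defines the time periods in which monitoring takes place
commands.cfg ## command definitions
contacts.cfg ## contact definitions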
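# Time period during which notifications are sent when a service has a problem; the period is defined in timeperiods.cfg
service_notification_period 24x7
# Time period during which notifications are sent when a host has a problem; same as above
host_notification_period 24x7
# Notify the contact when a service enters w (warning), u (unknown) or c (critical), or when it recovers from a problem state (r)
service_notification_options w,u,c,r
# Notify the contact when a host enters d (down) or u (unreachable), or when it recovers (r)
host_notification_options d,u,r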
#################################################################
This part can be ignored
#################################################################
define host{
use linux-server
host_name 192.168.12.145
alias 192.168.12.145
address 192.168.12.145
}
define service{
use generic-service
host_name 192.168.12.145
service_description check_ping
check_command check_ping!100.0,20%!200.0,50%
max_check_attempts 5
normal_check_interval 1
}
define service{
use generic-service
host_name 192.168.12.145
service_description LOAD
check_command check_nrpe!check_nrpe_load
max_check_attempts 5
normal_check_interval 1
}
define service{
use generic-service
host_name 192.168.12.145
service_description check_ssh
check_command check_ssh
max_check_attempts 5
normal_check_interval 1