NebulaSolarDash: a distributed server monitoring tool

Details: https://www.oschina.net/p/nebula-solar-dash

GitHub: https://github.com/toddlerya/NebulaSolarDash#readme

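The tool has two parts, a client and a server. The server uses bottle as its web framework and ECharts to render the charts; the client collects server resource data using only Python's built-in libraries.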
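To give a feel for what "built-in libraries only" collection can look like, here is a minimal sketch. It is not the code in ns_agent.py; the function name collect_snapshot and the returned field names are assumptions made for illustration, and os.getloadavg is only available on Unix-like systems.

```python
import os
import shutil
import time


def collect_snapshot(path="/"):
    """Return one sample of basic host metrics using only the standard library."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (in bytes)
    load1, _load5, _load15 = os.getloadavg()  # 1, 5 and 15 minute load averages
    return {
        # Same timestamp layout the dashboard shows, e.g. 20170901-18:35:42
        "timestamp": time.strftime("%Y%m%d-%H:%M:%S"),
        "load_1min": load1,
        "disk_total_bytes": usage.total,
        "disk_used_percent": round(usage.used / usage.total * 100, 2),
    }


if __name__ == "__main__":
    print(collect_snapshot())
```

A real agent would also read CPU and memory counters and send each sample to the server at the configured interval.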

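* With a client collection interval of 120 s, a single node writes roughly 4 MB to the database in 24 hours.
* Each collection from a single client sends about 5-6 KB to the server to be written to the database. Choose the monitoring interval yourself based on the number of servers, the monitoring duration, and the storage available on the server.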
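The two figures above are consistent with each other, and the same arithmetic can be used to size the database for other intervals. The sketch below assumes 5.5 KB per sample (the midpoint of the 5-6 KB range); the function name and parameters are illustrative and not part of NebulaSolarDash.

```python
def estimate_storage_mb(interval_s, nodes=1, hours=24, kb_per_sample=5.5):
    """Estimate how many MB are written to the database over the given period."""
    samples_per_node = hours * 3600 / interval_s
    return samples_per_node * nodes * kb_per_sample / 1024


if __name__ == "__main__":
    # One node, 120 s interval, 24 hours: 720 samples of about 5.5 KB each
    print(round(estimate_storage_mb(120), 2))           # 3.87
    # Two nodes at the 60 s interval used later in this post
    print(round(estimate_storage_mb(60, nodes=2), 2))   # 15.47
```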


1. Download the NebulaSolarDash package and unzip it:


[root@nginx1 ~]# unzip toddlerya-NebulaSolarDash-2.0.1-0-g58fe715.zip 

Archive:  toddlerya-NebulaSolarDash-2.0.1-0-g58fe715.zip

58fe71551f72441964ebcb7bb30fc0e436c9868c

   creating: toddlerya-NebulaSolarDash-58fe715/

  inflating: toddlerya-NebulaSolarDash-58fe715/LICENSE  

  inflating: toddlerya-NebulaSolarDash-58fe715/__init__.py  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/

   creating: toddlerya-NebulaSolarDash-58fe715/assets/css/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/css/bootstrap.min.css  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/css/ns_tb.css  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/js/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/bootstrap.min.js  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/dark.js  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/echarts.min.js  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/picture/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/picture/NebulaSolarDash.gif  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/picture/NebulaSolarDash2.0.gif  

   creating: toddlerya-NebulaSolarDash-58fe715/conf/

  inflating: toddlerya-NebulaSolarDash-58fe715/conf/__init__.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/conf/ns.ini  

  inflating: toddlerya-NebulaSolarDash-58fe715/init_db.py  

   creating: toddlerya-NebulaSolarDash-58fe715/lib/

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/__init__.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/bottle.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/common_lib.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/manager.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/ns_agent.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/ns_server.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/readme.md  

  inflating: toddlerya-NebulaSolarDash-58fe715/release-note.txt  

  inflating: toddlerya-NebulaSolarDash-58fe715/run.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/start_agent.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/start_insall_app.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/stop.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/stop_uninstall_app.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/uninstall_app.sh  

   creating: toddlerya-NebulaSolarDash-58fe715/views/

  inflating: toddlerya-NebulaSolarDash-58fe715/views/agent_info.tpl  

  inflating: toddlerya-NebulaSolarDash-58fe715/views/each_agent_detail.tpl  


2. Edit the configuration file to set the server and the clients:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# pwd

/root/toddlerya-NebulaSolarDash-58fe715

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# vim conf/ns.ini 

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# cat conf/ns.ini 

[server]

; Server IP

ip = 172.25.254.130

; Server port

port = 8081

debug = True

; Alert thresholds, in percent

; Example:

; cpu_yellow = 80 means CPU usage that reaches 80% is highlighted in yellow

; cpu_red = 95 means CPU usage that reaches 95% is highlighted in red

mem_yellow = 80

mem_red = 95

cpu_yellow = 80

cpu_red = 95


[agent]

; Client collection interval, in seconds

interval = 60

install_path = /home/RunTimeNSDash

; IPs of all nodes to monitor, separated by ASCII (half-width) commas

[all_agent_ip]

ips = 172.25.254.134,172.25.254.135
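As a rough illustration of how a file laid out like conf/ns.ini can be read, the sketch below parses the same sections with Python's configparser and applies the yellow/red thresholds. The helper names load_settings and classify are assumptions for this example; the project's own loader may work differently.

```python
import configparser

# A trimmed copy of the settings shown above, kept inline so the sketch is self-contained.
SAMPLE_INI = """
[server]
ip = 172.25.254.130
port = 8081
debug = True
mem_yellow = 80
mem_red = 95
cpu_yellow = 80
cpu_red = 95

[agent]
interval = 60
install_path = /home/RunTimeNSDash

[all_agent_ip]
ips = 172.25.254.134,172.25.254.135
"""


def load_settings(text):
    """Parse the ini text and return the values a server or agent would need."""
    parser = configparser.ConfigParser()  # lines starting with ';' are comments by default
    parser.read_string(text)
    server = parser["server"]
    raw_ips = parser["all_agent_ip"]["ips"]
    return {
        "address": (server["ip"], server.getint("port")),
        "cpu_levels": (server.getint("cpu_yellow"), server.getint("cpu_red")),
        "interval": parser.getint("agent", "interval"),
        "agent_ips": [ip.strip() for ip in raw_ips.split(",") if ip.strip()],
    }


def classify(usage_percent, yellow, red):
    """Map a usage percentage to the colour level described in the config comments."""
    if usage_percent >= red:
        return "red"
    if usage_percent >= yellow:
        return "yellow"
    return "normal"


if __name__ == "__main__":
    settings = load_settings(SAMPLE_INI)
    print(settings["agent_ips"])
    print(classify(85, *settings["cpu_levels"]))
```

With the values above, a CPU reading of 85% falls between cpu_yellow and cpu_red, so it is classified as yellow.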


3. The first install attempt fails SSH authentication, so set up passwordless (key-based) SSH login next:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -install

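[+] Installing the client on every node and automatically starting the clients and the server

[+] Installation directory set successfully: /home/RunTimeNSDash

[+] Historical data deleted successfully

[+] Server started successfully

[+] Nodes to install in this run: 2

[09/06/17 18:38:44] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:38:44] : INFO    : Starting deployment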

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

[09/06/17 18:38:44] : ERROR   : can not logon 172.25.254.134 without passwd.

[09/06/17 18:38:44] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:38:44] : INFO    : Starting deployment

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

[09/06/17 18:38:44] : ERROR   : can not logon 172.25.254.135 without passwd.

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-keygen 

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa): 

/root/.ssh/id_rsa already exists.

Overwrite (y/n)? n

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-copy-id root@172.25.254.134

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

root@172.25.254.134's password: 


Number of key(s) added: 1


Now try logging into the machine, with:   "ssh 'root@172.25.254.134'"

and check to make sure that only the key(s) you wanted were added.


[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-copy-id root@172.25.254.135

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

root@172.25.254.135's password: 


Number of key(s) added: 1


Now try logging into the machine, with:   "ssh 'root@172.25.254.135'"

and check to make sure that only the key(s) you wanted were added.
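Before rerunning the installer, it can help to confirm that key-based login really works on every node. The sketch below shells out to the standard ssh client with BatchMode=yes, which makes ssh fail instead of prompting for a password. The function names are illustrative, and the check assumes each host key is already in known_hosts (which is the case after the ssh-copy-id runs above).

```python
import subprocess


def build_ssh_check(host, user="root", timeout_s=5):
    """Build an ssh command that succeeds only if login works without a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # never prompt for a password
        "-o", f"ConnectTimeout={timeout_s}",   # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",                                # harmless remote command
    ]


def can_login_without_password(host, user="root"):
    """Return True when the remote command exits with status 0."""
    result = subprocess.run(build_ssh_check(host, user), capture_output=True, text=True)
    return result.returncode == 0


# Example call, using the node IPs from conf/ns.ini:
#   for host in ("172.25.254.134", "172.25.254.135"):
#       print(host, can_login_without_password(host))
```

If either call returns False, manager.py -install is likely to hit the same "Permission denied" error shown earlier.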


4. Install and deploy

Command-line options:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -h

usage: manager.py [-h] [-install] [-uninstall] [-startall] [-stopall]

                  [-start START_ONE] [-stop STOP_ONE]

Manager Tool

optional arguments:

  -h, --help        show this help message and exit

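  -install          Install the client on every node and automatically start

                    the clients and the server

  -uninstall        Stop the client on every node, stop the program and clean

                    up the installed files, and stop the server as well

  -startall         Start the client on every node and set up the crond watchdog

  -stopall          Stop the client on every node and remove the crond watchdog

  -start START_ONE  Start the client on one specified node and set up the crond watchdog

  -stop STOP_ONE    Stop the client on one specified node and remove the crond watchdog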
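For readers who want to see how an interface with this usage line can be declared, here is a small argparse sketch. It is reconstructed only from the help output above, not taken from manager.py, and the English help strings are paraphrases.

```python
import argparse


def build_parser():
    """Declare options matching the usage line printed by manager.py -h."""
    parser = argparse.ArgumentParser(prog="manager.py", description="Manager Tool")
    parser.add_argument("-install", action="store_true",
                        help="install the client on every node, then start clients and server")
    parser.add_argument("-uninstall", action="store_true",
                        help="stop clients and server and clean up the installed files")
    parser.add_argument("-startall", action="store_true",
                        help="start the client on every node and set up the crond watchdog")
    parser.add_argument("-stopall", action="store_true",
                        help="stop the client on every node and remove the crond watchdog")
    # dest="start_one" makes argparse display the argument as START_ONE in the help
    parser.add_argument("-start", dest="start_one",
                        help="start the client on one specified node")
    parser.add_argument("-stop", dest="stop_one",
                        help="stop the client on one specified node")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-start", "172.25.254.134"])
    print(args.start_one)
```

Because argparse looks for an exact option match first, -start and -startall can coexist without ambiguity.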



[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -install

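[+] Installing the client on every node and automatically starting the clients and the server

[+] Installation directory set successfully: /home/RunTimeNSDash

[+] Historical data deleted successfully

[+] Server started successfully

[+] Nodes to install in this run: 2

[09/06/17 18:39:25] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:39:25] : INFO    : Starting deployment

[09/06/17 18:39:27] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:39:27] : INFO    : Starting deployment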

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -startall

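[+] Starting the client on every node and setting up the crond watchdog

[+] Nodes to install in this run: 2

[09/06/17 18:40:18] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:40:18] : INFO    : Starting deployment

[09/06/17 18:40:20] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:40:20] : INFO    : Starting deployment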

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# lsof -i:8081

COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME

python  7588 root    4u  IPv4  42008      0t0  TCP *:tproxy (LISTEN)
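In the lsof output, port 8081 appears under its /etc/services name, tproxy. The same check can be done from any machine with a short socket probe. This is a generic sketch rather than part of NebulaSolarDash, and the function name is illustrative.

```python
import socket


def is_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


# Example call, using the server address from conf/ns.ini:
#   print(is_port_open("172.25.254.130", 8081))
```

A True result only shows that something is listening on the port; opening the dashboard in a browser confirms that it is the NebulaSolarDash server.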



5. Verifying the result


Nebula-Solar server resource monitoring: node list

No.  Host name  IP address      Memory  CPU
1    host3      172.25.254.134  -109%   0.08%
2    web        172.25.254.135  1%      0.45%

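Node basic information -- every chart can be dragged with the mouse and zoomed with the scroll wheel

Host name:         host3
IP address:        172.25.254.134
CPU:               2 x AMD Athlon(tm) X4 730 Quad Core Processor
Memory (MB):       977
SWAP (MB):         0
Operating system:  CentOS Linux 7.2.1511 Core
Kernel version:    3.10.0-327.el7.x86_64
Uptime:            27 days, 4:51:37
Current time:      20170901-18:35:42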

Node disk storage statistics

No.  Filesystem               Total  Used  Free  Use%  Mount point
1    /dev/mapper/centos-root  18G    2.2G  16G   13%   /
2    devtmpfs                 479M   0     479M  0%    /dev
3    tmpfs                    489M   0     489M  0%    /dev/shm
4    tmpfs                    489M   50M   440M  11%   /run
5    tmpfs                    489M   0     489M  0%    /sys/fs/cgroup
6    /dev/sda1                497M   126M  372M  26%   /boot
7    tmpfs                    98M    0     98M   0%    /run/user/0




CPU chart reading at 20170901-18:24:26: USAGE 0.08%, NICE 0%, USER 0.01%, SYSTEM 0.06%, IOWAIT 0.01%

Load average chart reading at 20170901-18:31:27: load average 0



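Node basic information -- every chart can be dragged with the mouse and zoomed with the scroll wheel

Host name:         web
IP address:        172.25.254.135
CPU:               2 x AMD Athlon(tm) X4 730 Quad Core Processor
Memory (MB):       1823
SWAP (MB):         0
Operating system:  CentOS Linux 7.2.1511 Core
Kernel version:    3.10.0-514.26.2.el7.x86_64
Uptime:            41 days, 4:19:10
Current time:      20170906-19:02:46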

Node disk storage statistics

No.  Filesystem               Total  Used  Free  Use%  Mount point
1    /dev/mapper/centos-root  18G    12G   6.1G  66%   /
2    devtmpfs                 897M   0     897M  0%    /dev
3    tmpfs                    912M   144K  912M  1%    /dev/shm
4    tmpfs                    912M   99M   814M  11%   /run
5    tmpfs                    912M   0     912M  0%    /sys/fs/cgroup
6    /dev/sda1                497M   190M  307M  39%   /boot
7    tmpfs                    183M   32K   183M  1%    /run/user/0
8    /dev/sr0                 4.1G   4.1G  0     100%  /run/media/root/CentOS




CPU chart reading at 20170906-18:57:32: USAGE 0.45%, NICE 0.01%, USER 0.12%, SYSTEM 0.31%, IOWAIT 0%

Load average chart reading at 20170906-18:55:31: load average 0