
Lab environment
Server (IP) | Deployed services | Service ports |
prometheus_server 192.168.10.42 | prometheus, node-exporter, cAdvisor, alertmanager, grafana | prometheus:9090 (Prometheus server), node-exporter:9100 (host metrics), cAdvisor:8080 (container metrics), alertmanager:9093 (alerting), grafana:3000 (web dashboards) |
prometheus_node 192.168.10.45 | node-exporter, cAdvisor | node-exporter:9100 (host metrics), cAdvisor:8080 (container metrics) |
cat /etc/redhat-release && uname -a
CentOS Linux release 7.8.2003 (Core)
Linux centos7-1 3.10.0-1127.el7.x86_64
systemctl stop firewalld && systemctl disable firewalld
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config && setenforce 0
Software installation
yum install -y yum-utils docker-ce-18.06.3.ce chrony
systemctl daemon-reload
systemctl enable chronyd docker --now && hwclock -w
docker --version
Docker version 18.06.3-ce, build 6d37f41
docker pull prom/prometheus
docker pull prom/node-exporter
docker pull prom/alertmanager
docker pull google/cadvisor:v0.33.0
docker pull grafana/grafana
Configure the Prometheus server
mkdir -p /usr/local/docker/prometheus && touch /usr/local/docker/prometheus/prometheus.yml
cat /usr/local/docker/prometheus/prometheus.yml
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alerting rules are evaluated
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["192.168.10.42:9093"]
rule_files:
  - "/etc/prometheus/*_rules.yml"  # rule file path inside the container
scrape_configs:
  - job_name: "promserver"  # host metrics from node-exporter on both machines
    static_configs:
      - targets: ["192.168.10.42:9100","192.168.10.45:9100"]
  - job_name: "cadvisor"  # container metrics from cAdvisor on both machines
    static_configs:
      - targets: ["192.168.10.42:8080","192.168.10.45:8080"]
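The targets lists above are easy to mistype by hand, since one missing quote makes the whole file invalid YAML. As a minimal sketch, the list could be generated instead; `make_targets` is a hypothetical helper written for this note, not part of Prometheus.

```shell
#!/bin/sh
# make_targets PORT HOST...   (hypothetical helper, not a Prometheus tool)
# Prints a YAML flow sequence such as ["h1:9100","h2:9100"],
# quoting every host:port pair so that no quote can go missing.
make_targets() {
    port=$1
    shift
    out=""
    for host in "$@"; do
        # Put a comma before every entry except the first.
        if [ -n "$out" ]; then
            out="$out,"
        fi
        out="$out\"${host}:${port}\""
    done
    printf '[%s]\n' "$out"
}
```

For example, `make_targets 9100 192.168.10.42 192.168.10.45` prints the list used for the node-exporter job, ready to paste after `targets:`.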
docker run -itd --name promserver \
--restart=always -p 9090:9090 \
-v /usr/local/docker/prometheus/:/etc/prometheus \
-v /etc/localtime:/etc/localtime --net=host prom/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--web.enable-lifecycle && docker logs -f promserver | grep 9090
caller=web.go:570 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
# the promserver log should show that it is listening on 9090
docker exec -it --user=root promserver /bin/sh -c 'promtool check config /etc/prometheus/prometheus.yml'
Checking /etc/prometheus/prometheus.yml
SUCCESS: 0 rule files found
# SUCCESS means the configuration syntax is valid; "0 rule files found" only means no file matched rule_files yet
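Because the container is started with --web.enable-lifecycle, Prometheus can re-read its configuration over HTTP without a restart. The following is a sketch only; `reload_prometheus` is a hypothetical wrapper name, and the default URL assumes the server address from the lab environment.

```shell
#!/bin/sh
# reload_prometheus [BASE_URL]   (hypothetical wrapper)
# Sends POST /-/reload, which asks Prometheus to re-read its
# configuration and rule files. The endpoint is only served when
# Prometheus runs with --web.enable-lifecycle.
reload_prometheus() {
    base_url=${1:-http://192.168.10.42:9090}
    # -f: fail on HTTP errors, -sS: quiet but still print errors
    curl -fsS -X POST "${base_url}/-/reload"
}
```

Running `reload_prometheus` after editing prometheus.yml or a rules file should apply the change; checking the file with promtool first is still advisable, since an invalid file makes the reload fail.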


Configure node-exporter on the Prometheus server and client
docker run -itd --restart=always -p 9100:9100 \
-v /etc/localtime:/etc/localtime \
--name=promnode --net=host \
prom/node-exporter && docker logs -f promnode | grep 9100
caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
# deploy on both prometheus_server and prometheus_node
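Besides the log line, the exporter can be checked by reading its /metrics page. The sketch below pulls one value out of the Prometheus text format; `metric_value` is a hypothetical helper, and it only handles samples without labels.

```shell
#!/bin/sh
# metric_value NAME   (hypothetical helper)
# Reads Prometheus text-format metrics on stdin and prints the value
# of the first sample whose metric name is exactly NAME.
# Comment lines (# HELP / # TYPE) never match because their first
# field is "#". Labelled samples such as name{...} are not handled.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}
```

A possible use is `curl -s http://192.168.10.45:9100/metrics | metric_value node_load1`, which should print the 1-minute load average reported by node-exporter on prometheus_node.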


Configure the container analysis tool cAdvisor
docker run -itd --name cadvisor \
--restart=always -p 8080:8080 \
-v /etc/localtime:/etc/localtime \
-v /:/rootfs:ro -v /var/run/:/var/run/:rw \
-v /sys/:/sys/:ro -v /var/lib/docker/:/var/lib/docker/:ro \
-v /dev/disk/:/dev/disk/:ro \
--net=host google/cadvisor:v0.33.0 && docker logs -f cadvisor
# view the cadvisor container log; deploy on both prometheus_server and prometheus_node

Configure email alerts
cat /usr/local/docker/alertmanager/alertmanager.yml
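global:
  resolve_timeout: 5m                     # an alert counts as resolved if it is not refreshed within this time
  smtp_from: 'xxx@'                       # sender address
  smtp_smarthost: 'smtp.:465'             # SMTP server address; the port is required
  smtp_auth_username: 'xxx@'              # sender account name
  smtp_auth_password: 'MXVTTTZNKCWKFEER'  # mail authorization code, not the login password
  smtp_require_tls: false
route:
  group_by: ['alert']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5s
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'xxxx@'          # recipient address
        send_resolved: true  # also send a notification when the alert is resolved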
touch /usr/local/docker/prometheus/node_rules.yml
cat /usr/local/docker/prometheus/node_rules.yml
groups:
  - name: node-up
    rules:
      - alert: node-up
        expr: up{job="promserver"} == 0
        for: 5s
        labels:
          severity: "1"
          team: node
        annotations:
          summary: "{{ $labels.instance }} has been down for more than 5s!"
# alerting rule reference (official docs): https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
docker run -itd --name alert \
--restart=always -p 9093:9093 \
-v /usr/local/docker/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
-v /etc/localtime:/etc/localtime --net=host prom/alertmanager \
--config.file=/etc/alertmanager/alertmanager.yml && docker logs -f alert | grep 9093
# check the Alertmanager configuration syntax
docker exec -it --user=root alert /bin/sh -c 'amtool check-config /etc/alertmanager/alertmanager.yml'
Checking '/etc/alertmanager/alertmanager.yml' SUCCESS
Found:
- global config
- route
- 1 inhibit rules
- 1 receivers
- 0 templates
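The mail settings can also be exercised without stopping any container, by posting a hand-made alert to Alertmanager's v2 API. This is a sketch under assumptions: the helper names, the alert name and the summary text are made up for illustration, and the default URL is the server address from the lab environment.

```shell
#!/bin/sh
# test_alert_payload NAME   (hypothetical helper)
# Prints a minimal JSON body for POST /api/v2/alerts: one alert with
# an alertname label and a summary annotation.
test_alert_payload() {
    printf '[{"labels":{"alertname":"%s","severity":"1"},"annotations":{"summary":"manual test alert"}}]\n' "$1"
}

# send_test_alert [BASE_URL] [NAME]   (hypothetical helper)
# Posts the payload to Alertmanager; the alert is then routed like any
# alert coming from Prometheus.
send_test_alert() {
    base_url=${1:-http://192.168.10.42:9093}
    test_alert_payload "${2:-manual-test}" |
        curl -fsS -X POST -H 'Content-Type: application/json' \
            --data-binary @- "${base_url}/api/v2/alerts"
}
```

Calling `send_test_alert` should make the alert visible in the Alertmanager web UI on port 9093, and a mail should follow if the SMTP settings are correct.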



Test the email alert
docker stop promnode
# stop the node-exporter container to trigger the alert


Log in to the 163 mailbox and check whether the alert email has arrived.

All of the alerting steps above are performed on prometheus_server.
Configure the visualization tool Grafana
docker run -itd -p 3000:3000 \
--restart=always -v /etc/localtime:/etc/localtime \
--name grafana --net=host grafana/grafana && docker logs -f grafana | grep 3000
logger=http.server address=[::]:3000 protocol=http subUrl= socket=
# view the grafana log; deploy on prometheus_server only
netstat -tuplna | grep LISTEN
tcp6 0 0 :::9090 :::* LISTEN 57258/prometheus
tcp6 0 0 :::9100 :::* LISTEN 102907/node_exporte
tcp6 0 0 :::8080 :::* LISTEN 57855/cadvisor
tcp6 0 0 :::9093 :::* LISTEN 60201/alertmanager
tcp6 0 0 :::9094 :::* LISTEN 128823/alertmanager
tcp6 0 0 :::3000 :::* LISTEN 52855/grafana-serve
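Reading the netstat listing by eye works for six ports, but the same check can be scripted. The sketch below reports which expected ports are missing; `missing_ports` is a hypothetical helper, and it assumes the `netstat -tln` column layout shown above.

```shell
#!/bin/sh
# missing_ports PORT...   (hypothetical helper)
# Reads netstat-style output on stdin and prints every expected port
# that has no LISTEN line.
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        # Look for ":PORT" followed by whitespace, then LISTEN later
        # on the same line.
        if ! printf '%s\n' "$listing" |
            grep -E ":${port}[[:space:]].*LISTEN" >/dev/null; then
            echo "$port"
        fi
    done
}
```

On prometheus_server, `netstat -tln | missing_ports 9090 9100 8080 9093 3000` should print nothing when every service is up.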

Grafana login: username admin, password admin






Other dashboard templates can be downloaded from the Grafana website and imported.