Prometheus之Alertmanager邮件报警配置

原创

小吉猫w 2022-10-24 11:18:06 博主文章分类：Prometheus ©著作权

©著作权归作者所有：来自51CTO博客作者小吉猫w的原创作品，请联系作者获取转载授权，否则将追究法律责任

一 Alertmanager配置

1.1 编辑Alertmanager配置文件

点击查看代码

root@node-02:~# cat /usr/local/alertmanager/alertmanager.yml 
global:
  smtp_from: '1304995320@qq.com' 
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '1304995320@qq.com'
  smtp_auth_password: 'xxxxxxxxx'
  smtp_hello: '@qq.com'
  smtp_require_tls: false 
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5s
  repeat_interval: 1m
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  email_configs:
    - to: '1304995320@qq.com'
inhibit_rules:
  - source_match:
      alertname: InstanceDown
      severity: 'critical'
    target_match:
      alertname: InstanceDown
      severity: 'critical'
    equal: ['alertname', 'dev', 'instance']

1.2 重启Alertmanager服务

root@node-02:~# systemctl restart alertmanager

二 Prometheus报警设置

2.1 修改Prometheus配置文件

root@prometheus-01:~# cat /usr/local/prometheus/prometheus.yml 
alerting:
  alertmanagers:
    - static_configs:
        - targets:
           - 192.168.174.104:9093
rule_files:

"rules/*.yaml"
"alert_rules/*.yaml"

2.2 创建告警规则文件

root@prometheus-01:~# cat /usr/local/prometheus/alert_rules/instance_down.yaml 
groups:
- name: ALLInstances
  rules:
    - alert: InstanceDown
      expr: up == 0
      for: 1m
      annotations:
        title: 'Instance down'
        description: 'Instance has been down for more than 1 munute.'
      labels:
        severity: 'critical'

2.3 验证规则

root@prometheus-01:~# /usr/local/prometheus/promtool check rules /usr/local/prometheus/alert_rules/instance_down.yaml 
Checking /usr/local/prometheus/alert_rules/instance_down.yaml
  SUCCESS: 1 rules found

2.4 重启Prometheus服务

root@prometheus-01:~# systemctl restart prometheus.service

2.5 停止node_exporter服务

root@k8s-master-01:~# systemctl stop node-exporter

2.6 Prometheus web界面

Prometheus之Alertmanager邮件报警配置_配置文件

2.7 验证Alertmanager界面

Prometheus之Alertmanager邮件报警配置_重启_02

2.8 验证邮件告警信息

Prometheus之Alertmanager邮件报警配置_重启_03

三告警模板

3.1 修改Prometheus规则文件

root@prometheus-01:~# cat /usr/local/prometheus/alert_rules/pod.yaml 
groups:
  - name: alert_pod.rules
    rules:
    - alert: pod_all_cpu_usage
      expr: (sum by(name)(rate(container_cpu_user_seconds_total{image!=""}[5m]))* 100) > 1
      for: 2m
      labels:
        serverity: critical
        service: pod
      annotations:
        description: 容器 {{$labels.name}} CPU 资源利用率大于10%, (current value is {{$value}})
        summary: dev CPU 负载告警
- alert: pod_all_memory_usage
  expr: sort_desc(avg by(name)(irate(node_memory_MemFree_bytes{name!=""}[5m]))) > 2 #内存大于2G 
  for: 2m
  labels:
    serverity: critical
    service: pod
  annotations:
    description: 容器 {{$labels.name}} 内存 资源利用率大于2G, (current value is {{$value}})
    summary: dev memory 负载告警

3.2 重启Prometheus服务

root@prometheus-01:~# systemctl restart prometheus.service

3.3 Prometheus web界面验证

Prometheus之Alertmanager邮件报警配置_配置文件_04

3.4 查看邮件告警信息

Prometheus之Alertmanager邮件报警配置_配置文件_05

Prometheus之Alertmanager邮件报警配置_配置文件_06

上一篇：kubernetes基于NFS实现nginx+tomcat动静分离

下一篇：Prometheus之Alertmanager部署

提问和评论都可以，用心的回复会被更多人看到评论

发布评论

相关文章

官方博客	全部文章	热门标签	班级博客
了解我们	网站地图	意见反馈

鸿蒙开发者社区	51CTO学堂
51CTO	软考资讯

Prometheus之Alertmanager邮件报警配置

Prometheus之Alertmanager邮件报警配置

一 Alertmanager配置

1.1 编辑Alertmanager配置文件

1.2 重启Alertmanager服务

二 Prometheus报警设置

2.1 修改Prometheus配置文件

2.2 创建告警规则文件

2.3 验证规则

2.4 重启Prometheus服务

2.5 停止node_exporter服务

2.6 Prometheus web界面

2.7 验证Alertmanager界面

2.8 验证邮件告警信息

三 告警模板

3.1 修改Prometheus规则文件

3.2 重启Prometheus服务

3.3 Prometheus web界面验证

3.4 查看邮件告警信息

51CTO博客

三告警模板