#(1) Install mysqld exporter
Purpose: mysqld exporter collects metrics from the MySQL server and exposes them over an HTTP endpoint that Prometheus scrapes.

1) On the monitored MySQL server, create an account for mysqld exporter to collect data with

GRANT REPLICATION CLIENT, PROCESS ON  *.*  to 'exporter'@'%' identified by '123456';
GRANT SELECT ON performance_schema.* TO 'exporter'@'%';
flush privileges;

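2) Install mysqld exporter on the monitored MySQL server; here I run it as a Docker container. The first command shows the general form, the second the values used in this setup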

docker run -d  --restart=always  --name mysqld-exporter -p 9104:9104   -e DATA_SOURCE_NAME="user:password@(hostname:port)/"   prom/mysqld-exporter

docker run -d  --restart=always  --name mysqld-exporter -p 9104:9104   -e DATA_SOURCE_NAME="exporter:123456@(192.168.1.82:3306)/"   prom/mysqld-exporter
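
The DATA_SOURCE_NAME value follows the user:password@(hostname:port)/ pattern shown above. As a minimal sketch, assuming you want to generate the command for several hosts, the string and the docker arguments could be assembled like this. The helper names `build_dsn` and `exporter_run_command` are hypothetical and only serve the illustration.

```python
def build_dsn(user, password, host, port=3306):
    """Return a DSN in the user:password@(host:port)/ form used above."""
    return f"{user}:{password}@({host}:{port})/"


def exporter_run_command(dsn, name="mysqld-exporter", port=9104):
    """Return the docker run command from this step as an argument list."""
    return [
        "docker", "run", "-d", "--restart=always",
        "--name", name,
        "-p", f"{port}:{port}",
        "-e", f"DATA_SOURCE_NAME={dsn}",
        "prom/mysqld-exporter",
    ]


if __name__ == "__main__":
    # Values taken from the example command above.
    dsn = build_dsn("exporter", "123456", "192.168.1.82")
    print(" ".join(exporter_run_command(dsn)))
```

The argument list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems when the password contains special characters.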

Check whether the container reports errors; the main point is to verify that the exporter can connect to the MySQL server and fetch data:
docker logs -f mysqld-exporter

3) Verify: the following request should return a large number of MySQL metrics

curl http://192.168.1.62:9104/metrics
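
Reading the /metrics output by eye gets tedious, so here is a rough sketch of checking it programmatically. It assumes the exporter reports a `mysql_up` metric (the same one the alerting rules later in this document rely on) and handles only simple sample lines of the text format. The function names are hypothetical.

```python
import urllib.request


def fetch_metrics(url, timeout=5):
    """Download the raw text from an exporter's /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_values(exposition_text, name):
    """Return every sample value reported for the metric called name."""
    values = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line and "}" in line:
            metric = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            metric, _, rest = line.partition(" ")
        if metric != name:
            continue
        fields = rest.split()
        if fields:
            values.append(float(fields[0]))
    return values


def mysql_is_up(exposition_text):
    """True when the exporter reports mysql_up 1."""
    return metric_values(exposition_text, "mysql_up") == [1.0]
```

For example, `mysql_is_up(fetch_metrics("http://192.168.1.62:9104/metrics"))` should return True when the exporter can reach MySQL.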

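#(2) Install consul
Purpose: consul is a service registry that exposes an API for adding and removing services; Prometheus can discover its targets from consul dynamically and picks up changes automatically.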

1) Install consul with docker

 docker run  --restart=always --name consul -d -p 8500:8500 consul

2) Register the services through the consul API

curl -X PUT -d '{"id": "mysql62","name": "mysql62","address": "192.168.1.62","port": 9104,"tags": ["test"],"checks": [{"http": "http://192.168.1.62:9104/","interval": "5s"}]}'     http://localhost:8500/v1/agent/service/register
curl -X PUT -d '{"id": "mysql82","name": "mysql82","address": "192.168.1.82","port": 9104,"tags": ["test"],"checks": [{"http": "http://192.168.1.82:9104/","interval": "5s"}]}'     http://localhost:8500/v1/agent/service/register

Both services should now be registered in consul.
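
The two curl calls differ only in name and address, so the JSON body can be generated. The following is a sketch under the assumption that every exporter listens on port 9104 and carries the "test" tag, as above; `registration_payload` and `register` are hypothetical helper names, and the URL is the one used in the curl commands.

```python
import json
import urllib.request

REGISTER_URL = "http://localhost:8500/v1/agent/service/register"


def registration_payload(name, address, port=9104, tags=("test",), interval="5s"):
    """Build the same JSON document that the curl commands above send."""
    return {
        "id": name,
        "name": name,
        "address": address,
        "port": port,
        "tags": list(tags),
        "checks": [{"http": f"http://{address}:{port}/", "interval": interval}],
    }


def register(payload, url=REGISTER_URL, timeout=5):
    """PUT the payload to the consul agent and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `register(registration_payload("mysql62", "192.168.1.62"))` would then correspond to the first curl command.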

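#(3) Install and configure Alertmanager

Purpose: Alertmanager receives the alerts sent by Prometheus and forwards them to the recipients through channels such as email or WeChat.

0) Prepare the directory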

test -d /etc/alertmanager || mkdir -pv /etc/alertmanager

1) Prepare the configuration file

# cat /etc/alertmanager/alertmanager.yml 
global:
  resolve_timeout: 5m

templates:
- '/etc/alertmanager/wechat.tmpl'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'wechat'
receivers:
- name: 'wechat'
  wechat_configs:
  - corp_id: 'wwc08fcb42fc6fe93c'
    to_party: '2'
    agent_id: '1000002'
    api_secret: 'cLG91Xgcd3o3zPJp6NbOJV9m7SBIlhtCScxov3Hp-XQ'
    send_resolved: true

2) Prepare the template file

# cat /etc/alertmanager/wechat.tmpl 
{{ define "wechat.default.message" }}
{{ if gt (len .Alerts.Firing) 0 -}}
Alerts Firing:
{{ range .Alerts.Firing }}
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Description: {{ .Annotations.description }}
Fired at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
{{- end }}
{{- end }}
{{ if gt (len .Alerts.Resolved) 0 -}}
Alerts Resolved:
{{ range .Alerts.Resolved }}
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Fired at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
Resolved at: {{ .EndsAt.Format "2006-01-02 15:04:05" }}
{{- end }}
{{- end }}
Alert link:
{{ template "__alertmanagerURL" . }}
{{- end }}

3) Start the container

docker run --restart=always   -d -p 9093:9093 -v /etc/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml -v /etc/alertmanager/wechat.tmpl:/etc/alertmanager/wechat.tmpl --name alertmanager prom/alertmanager

4) Check the container for errors

docker logs -f alertmanager

Then open the Alertmanager web page to verify.

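#(4) Install and configure Prometheus
Purpose: Prometheus scrapes metrics from the exporters and stores them; it also evaluates alerting rules and sends the resulting alerts to Alertmanager.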

1) Prepare the directory

test -d /etc/prometheus || mkdir /etc/prometheus -pv

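2) Prepare the Prometheus configuration file

rule_files: the alerting rule files.
alerting: where alerts are sent when a rule fires; Alertmanager receives them and then forwards them to the recipients.
job_name 'consul': Prometheus discovers its scrape targets from consul.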

#cat /etc/prometheus/prometheus.yml
global:
  scrape_interval:     15s
  evaluation_interval: 15s
rule_files:
  - "/etc/prometheus/*.rules"
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - "192.168.1.82:9093"
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
  - job_name: 'consul'
    consul_sd_configs:
      - server: '192.168.1.82:8500'
        services: []

    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*test.*
        action: keep
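
The keep rule in relabel_configs decides which consul services end up as scrape targets. As I understand the Prometheus documentation, `__meta_consul_tags` holds the service tags joined by a separator (a comma by default) with the separator also placed at both ends, and the relabel regex has to match the whole value. The sketch below imitates that behaviour in Python under those assumptions; it is not Prometheus code, and the function names are made up.

```python
import re


def consul_tags_label(tags, separator=","):
    """Imitate the __meta_consul_tags value, e.g. ['test'] -> ',test,'."""
    return separator + separator.join(tags) + separator


def keep_target(tags, regex=r".*test.*"):
    """Return True if a keep rule with this regex would retain the target."""
    return re.fullmatch(regex, consul_tags_label(tags)) is not None


if __name__ == "__main__":
    print(keep_target(["test"]))    # the services registered above
    print(keep_target(["prod"]))    # no matching tag, target is dropped
    print(keep_target(["latest"]))  # kept as well: 'latest' contains 'test'
```

Because `.*test.*` matches any tag that merely contains "test", a stricter pattern such as `.*,test,.*` may be preferable when other tags are in use.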

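3) Prepare the MySQL alerting rules file. Note that the file must not contain tab characters, and there must be a space between a key's colon and its value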

#cat /etc/prometheus/prometheus.rules
groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} MySQL is down"
      description: "MySQL database is down. This requires immediate action!"

  - alert: Mysql_High_QPS
    expr: rate(mysql_global_status_questions[5m]) > 500
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{$labels.instance}}: Mysql_High_QPS detected"
      description: "{{$labels.instance}}: MySQL is handling more than 500 operations per second (current value is: {{ $value }})"

  # threads_connected is a gauge (current connection count), so it is compared directly instead of through rate()
  - alert: Mysql_Too_Many_Connections
    expr: mysql_global_status_threads_connected > 200
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
      description: "{{$labels.instance}}: MySQL has more than 200 open connections (current value is: {{ $value }})"

  - alert: Mysql_Too_Many_slow_queries
    expr: rate(mysql_global_status_slow_queries[5m]) > 3
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{$labels.instance}}: Mysql_Too_Many_slow_queries detected"
      description: "{{$labels.instance}}: MySQL slow queries exceed 3 per second (current value is: {{ $value }})"

  - alert: SQL thread stopped
    expr: mysql_slave_status_slave_sql_running != 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} SQL thread stopped"
      description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."

  # seconds_behind_master is a gauge as well, so the lag in seconds is compared directly
  - alert: Slave lagging behind Master
    expr: mysql_slave_status_seconds_behind_master > 30
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
      description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
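
YAML does not allow tab characters for indentation, and a stray tab is an easy mistake to make when editing a rules file by hand. Here is a small sketch for spotting such lines before Prometheus rejects the file; `tab_indented_lines` is a hypothetical helper and covers only this one problem.

```python
def tab_indented_lines(text):
    """Return the 1-based numbers of non-blank lines indented with a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines cannot break the indentation
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = "groups:\n- name: MySQLStatsAlert\n\trules:\n  - alert: MySQL is down\n"
    print(tab_indented_lines(sample))
```

For a full syntax and expression check, `promtool check rules /etc/prometheus/prometheus.rules` validates the file the way Prometheus itself would.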

4) Start Prometheus with docker

docker run  --restart=always --name prometheus -d -p 9090:9090 -v /etc/prometheus:/etc/prometheus  prom/prometheus 

5) Log in to Prometheus to verify

In Prometheus, the mysqld exporter targets should show as connected.

The MySQL rules should be loaded and active.

#(5) Download, install and configure Grafana

1) Download and start Grafana

 wget https://dl.grafana.com/oss/release/grafana-6.0.2-1.x86_64.rpm
 yum  install grafana-6.0.2-1.x86_64.rpm -y 
 systemctl start grafana-server 
 systemctl enable grafana-server 
 ss -anltup |grep 3000 

2) Add dashboards

Search https://grafana.com/dashboards for MySQL dashboards;
on the import page, enter the dashboard IDs 7362 and 7371

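3) Verify the dashboards: Grafana should be able to fetch the data.

4) Verify alerting: stop the MySQL service on the replica

On the Prometheus alerts page the alert first appears in the pending state. Once the condition has held for the duration given by for, the alert moves to the firing state and is sent to Alertmanager.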
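
The pending and firing states can be pictured as a small state function. This is a simplified model, not Prometheus code: it ignores that rules are only evaluated once per evaluation_interval, and the name `alert_state` is made up for the illustration.

```python
def alert_state(active_seconds, for_seconds):
    """Classify an alert.

    active_seconds: how long the rule expression has been true without
                    interruption, or None if it is currently not true.
    for_seconds:    the rule's "for" duration in seconds.
    """
    if active_seconds is None:
        return "inactive"
    if active_seconds < for_seconds:
        return "pending"  # condition holds, but not yet long enough
    return "firing"       # held for the full duration, sent to Alertmanager


if __name__ == "__main__":
    # "MySQL is down" uses for: 1m, i.e. 60 seconds.
    for elapsed in (None, 0, 30, 60, 90):
        print(elapsed, alert_state(elapsed, 60))
```

With for: 1m, stopping MySQL therefore shows the alert as pending for about a minute before the WeChat message goes out.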

On the Alertmanager page, the alert sent by Prometheus should have arrived.

5) Check WeChat