A Slick Prometheus Monitoring Setup for MySQL

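Prometheus is an open-source system monitoring and alerting toolkit that tracks real-time data such as traffic, memory usage, and load. It collects metrics by scraping targets directly or through an intermediary for short-lived jobs, stores everything it collects locally, and evaluates predefined rules to derive new time series or send alerts. The collected data can be visualized by other tools through its API. This article uses Prometheus to monitor a server and a MySQL database. It covers basic monitoring only, and more will be added over time.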

1. Basic Monitoring with Prometheus
1.1 Install Prometheus
  • The installation layout mirrors the MySQL installation path convention.
  • This walkthrough uses version 2.19.2.
[root@centos ~]# mkdir -p /data/prometheus/{base,conf,data,software}
[root@centos ~]# cd /data/prometheus/software/
[root@centos software]# wget https://github.com/prometheus/prometheus/releases/download/v2.19.2/prometheus-2.19.2.linux-amd64.tar.gz
[root@centos software]# tar xf prometheus-2.19.2.linux-amd64.tar.gz -C /data/prometheus/base
[root@centos software]# mv /data/prometheus/base/prometheus-2.19.2.linux-amd64 /data/prometheus/base/2.19.2
  • Write a configuration file that monitors only Prometheus itself. It also serves as the test configuration for this article.
[root@centos ~]# vim /data/prometheus/conf/prometheus.yml
global:
  scrape_interval:     15s # How often Prometheus scrapes targets. This sets the global default to 15s (the built-in default is 1m); individual jobs can override it.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'    # Name of the scrape job

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090'] # metrics endpoint (the equivalent of an exporter's address and port)
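Before starting the service, it can help to validate the file. The tarball unpacked above also ships a promtool binary next to prometheus, and the sketch below wraps its `check config` subcommand. The function name check_prom_config is made up for illustration, and the default paths simply assume the directory layout used in this article.

```shell
# Validate a Prometheus configuration file with promtool.
# Both arguments are optional; the defaults assume this article's layout.
check_prom_config() {
    promtool_bin="${1:-/data/prometheus/base/2.19.2/promtool}"
    config_file="${2:-/data/prometheus/conf/prometheus.yml}"

    # Fail early with a distinct status if promtool is not where we expect it.
    if [ ! -x "$promtool_bin" ]; then
        echo "promtool not found at $promtool_bin" >&2
        return 2
    fi

    # promtool exits non-zero when the configuration contains errors.
    "$promtool_bin" check config "$config_file"
}
```

Running check_prom_config with no arguments after each edit to prometheus.yml should catch indentation and syntax mistakes before a restart does.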
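  • Create a prometheus.service unit file so that systemd manages the process, which saves a little operational effort.
  • Create an unprivileged prometheus user and give it ownership of the Prometheus directories.
[root@centos ~]# useradd -s /sbin/nologin prometheus -M
[root@centos ~]# chown -R prometheus:prometheus /data/prometheus/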
[root@centos ~]# vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/data/prometheus/base/2.19.2/prometheus --config.file=/data/prometheus/conf/prometheus.yml --storage.tsdb.path=/data/prometheus/data
Restart=on-failure
[Install]
WantedBy=multi-user.target
  • Start the Prometheus service and check its status.
[root@centos ~]# systemctl start prometheus.service
[root@centos ~]# systemctl enable prometheus.service
[root@centos ~]# systemctl status prometheus.service
● prometheus.service - Prometheus
   Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-07-08 08:12:16 UTC; 3s ago
     Docs: https://prometheus.io/
 Main PID: 26657 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─26657 /data/prometheus/base/2.19.2/prometheus --config.file=/data/prometheus/conf/prometheus.yml --storage.tsdb.path=/data/prometheus/data

Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.302Z caller=main.go:678 msg="Starting TSDB ..."
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.302Z caller=web.go:524 component=web msg="Start listening for connections" address=0.0.0.0:9090
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.315Z caller=head.go:645 component=tsdb msg="Replaying WAL and on-disk memory mappable chunks if any, this may take a while"
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.319Z caller=head.go:706 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.319Z caller=head.go:709 component=tsdb msg="WAL replay completed" duration=4.127423ms
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.323Z caller=main.go:694 fs_type=XFS_SUPER_MAGIC
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.324Z caller=main.go:695 msg="TSDB started"
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.324Z caller=main.go:799 msg="Loading configuration file" filename=/data/prometheus/conf/prometheus.yml
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.343Z caller=main.go:827 msg="Completed loading of configuration file" filename=/data/prometheus/conf/prometheus.yml
Jul 08 08:12:16 centos prometheus[26657]: level=info ts=2020-07-08T08:12:16.343Z caller=main.go:646 msg="Server is ready to receive web requests."
1.2 Install node_exporter
  • The installation layout mirrors the MySQL installation path convention.
  • This walkthrough uses version 1.0.1.
[root@centos ~]# mkdir -p /data/node_exporter/{base,software}
[root@centos ~]# cd /data/node_exporter/software
[root@centos software]# wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
[root@centos software]# tar xf node_exporter-1.0.1.linux-amd64.tar.gz -C /data/node_exporter/base
[root@centos software]# mv /data/node_exporter/base/node_exporter-1.0.1.linux-amd64 /data/node_exporter/base/1.0.1
  • Create the systemd unit file.
  • Note that the service runs as the prometheus user, so that user must already exist.
[root@centos ~]# vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/data/node_exporter/base/1.0.1/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
  • Start the node_exporter service and check its status.
[root@centos ~]# systemctl start node_exporter.service
[root@centos ~]# systemctl enable node_exporter.service
[root@centos ~]# systemctl status node_exporter.service
● node_exporter.service - node_exporter
   Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-07-09 02:10:46 UTC; 8s ago
 Main PID: 10241 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─10241 /data/node_exporter/base/1.0.1/node_exporter

Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.012Z caller=node_exporter.go:112 collector=thermal_zone
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.012Z caller=node_exporter.go:112 collector=time
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=timex
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=udp_queues
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=uname
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=vmstat
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=xfs
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:112 collector=zfs
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=node_exporter.go:191 msg="Listening on" address=:9100
Jul 09 02:10:47 centos node_exporter[10241]: level=info ts=2020-07-09T02:10:47.013Z caller=tls_config.go:170 msg="TLS is disabled and it cannot be enabled on the fly." http2=false
  • Verify that node_exporter can fetch system metrics.
[root@centos ~]# curl http://10.186.60.54:9100/metrics 2>/dev/null | tail
promhttp_metric_handler_errors_total{cause="encoding"} 94
promhttp_metric_handler_errors_total{cause="gathering"} 0
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 6
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
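Scrolling through the full /metrics output gets tedious, so a small filter can pull out a single series. The following is a minimal sketch, not part of node_exporter: metric_value is a hypothetical helper that reads exposition text on stdin and prints the value of the series whose name and labels match exactly. It assumes label values contain no spaces or backslashes.

```shell
# Print the value of one exact series (metric name plus labels) from
# Prometheus exposition text supplied on stdin.
metric_value() {
    # "# HELP" and "# TYPE" lines never match, because their first field is "#".
    awk -v series="$1" '$1 == series { print $2 }'
}
```

For example, `curl -s http://10.186.60.54:9100/metrics | metric_value 'promhttp_metric_handler_requests_total{code="200"}'` prints the number of successful scrapes so far.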
1.3 Update the Prometheus configuration
  • Add a job_name, override the scrape interval for it, set the node_exporter address and port, and define a label.
[root@centos ~]# vim /data/prometheus/conf/prometheus.yml 
global:
  scrape_interval:     15s # How often Prometheus scrapes targets. This sets the global default to 15s (the built-in default is 1m); individual jobs can override it.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'    # Name of the scrape job

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090'] # metrics endpoint (the equivalent of an exporter's address and port)

  - job_name: 'node'

    # Override the global default and scrape this job's targets every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['10.186.60.54:9100']
        labels:
          instance: system
  • Restart Prometheus, then open the targets page to confirm that the new job is being scraped.
[root@centos ~]# systemctl restart prometheus.service

http://10.186.60.54:9090/targets
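
The targets page at the URL above has a JSON counterpart, /api/v1/targets, which is handy for scripted checks. A rough sketch follows, assuming the compact JSON that Prometheus emits (no spaces around the colons). The name count_targets_by_health is illustrative, and a real JSON parser such as jq would be more robust where it is installed.

```shell
# Count targets with a given health state ("up", "down" or "unknown")
# in the JSON returned by /api/v1/targets, supplied on stdin.
count_targets_by_health() {
    # grep -o prints every match on its own line; wc -l counts them.
    # tr strips the padding some wc implementations add.
    grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}
```

With the setup in this article, `curl -s http://10.186.60.54:9090/api/v1/targets | count_targets_by_health down` should print 0 once every job is scraping successfully.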

1.4 Install Grafana
  • Official download page: https://grafana.com/grafana/download
  • Official documentation: https://grafana.com/docs/grafana/latest/getting-started/what-is-grafana/
[root@centos ~]# wget https://dl.grafana.com/oss/release/grafana-7.0.5-1.x86_64.rpm
[root@centos ~]# yum -y localinstall grafana-7.0.5-1.x86_64.rpm
  • Start the Grafana server.
[root@centos ~]# systemctl start grafana-server.service
[root@centos ~]# systemctl enable grafana-server.service
[root@centos ~]# systemctl status grafana-server.service
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-07-09 03:22:27 UTC; 6s ago
     Docs: http://docs.grafana.org
 Main PID: 20169 (grafana-server)
   CGroup: /system.slice/grafana-server.service
           └─20169 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var...

Jul 09 03:22:26 centos grafana-server[20169]: t=2020-07-09T03:22:26+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index user_auth_token.auth_token"
Jul 09 03:22:26 centos grafana-server[20169]: t=2020-07-09T03:22:26+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index user_auth_token.prev_auth_token"
Jul 09 03:22:26 centos grafana-server[20169]: t=2020-07-09T03:22:26+0000 lvl=info msg="Executing migration" logger=migrator id="create cache_data table"
Jul 09 03:22:26 centos grafana-server[20169]: t=2020-07-09T03:22:26+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index cache_data.cache_key"
Jul 09 03:22:27 centos grafana-server[20169]: t=2020-07-09T03:22:27+0000 lvl=info msg="Created default admin" logger=sqlstore user=admin
Jul 09 03:22:27 centos grafana-server[20169]: t=2020-07-09T03:22:27+0000 lvl=info msg="Starting plugin search" logger=plugins
Jul 09 03:22:27 centos grafana-server[20169]: t=2020-07-09T03:22:27+0000 lvl=info msg="Registering plugin" logger=plugins name="Direct Input"
Jul 09 03:22:27 centos grafana-server[20169]: t=2020-07-09T03:22:27+0000 lvl=info msg="External plugins directory created" logger=plugins directory=/var/lib/grafana/plugins
Jul 09 03:22:27 centos systemd[1]: Started Grafana instance.
Jul 09 03:22:27 centos grafana-server[20169]: t=2020-07-09T03:22:27+0000 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=http subUrl= socket=
  • Open Grafana in a browser and add a data source.

http://10.186.60.54:3000

  • Give the data source a name and enter the Prometheus address and port. These are the only two fields that need to change.

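  • Add a dashboard template.
  • Find the template you want on the official site: https://grafana.com/grafana/dashboards
  • The template URL ends in a numeric ID; note this number down.
  • Template used here: https://grafana.com/grafana/dashboards/8919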

  • When importing a template, select the data source. Note that some dashboard templates do not support a Prometheus data source.

  • View the dashboard page.

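1.5 Install mysql_exporter
  • Create a monitoring user in the database. The GRANT ... IDENTIFIED BY form below is MySQL 5.7 syntax; MySQL 8.0 no longer accepts it, so there the user has to be created with CREATE USER before the GRANT.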
mysql> grant process,select,replication client on *.* to prometheus@'%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
  • The installation layout mirrors the MySQL installation path convention.
  • This walkthrough uses version 0.12.1.
[root@centos ~]# mkdir -p /data/mysql_exporter/{base,conf,software}
[root@centos ~]# cd /data/mysql_exporter/software
[root@centos software]# wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
[root@centos software]# tar xf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /data/mysql_exporter/base
[root@centos software]# mv /data/mysql_exporter/base/mysqld_exporter-0.12.1.linux-amd64 /data/mysql_exporter/base/0.12.1
  • Edit the configuration file used to connect to MySQL.
  • user: the monitoring account's user name
  • password: that account's password
[root@centos ~]# vim /data/mysql_exporter/conf/mysql_exporter.cnf
[client]
user=prometheus
password=123456
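Because this file holds a plaintext password, it is reasonable to make it readable only by the account that runs the exporter. The helper below is a sketch under that assumption. The name secure_credentials_file is hypothetical, and the chown step needs root and an existing prometheus user.

```shell
# Restrict a credentials file to its owner, optionally changing the owner.
# $1: path to the file; $2 (optional): new owner in user:group form.
secure_credentials_file() {
    # Mode 600: owner may read and write, nobody else may access the file.
    chmod 600 "$1" || return 1
    if [ -n "${2:-}" ]; then
        chown "$2" "$1" || return 1
    fi
}
```

Here that would be `secure_credentials_file /data/mysql_exporter/conf/mysql_exporter.cnf prometheus:prometheus`. Without the ownership change, a root-owned file with mode 600 would be unreadable by the prometheus user that the service runs as.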
  • Write the systemd unit file.
[root@centos ~]# vim /etc/systemd/system/mysql_exporter.service
[Unit]
Description=mysql_exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/data/mysql_exporter/base/0.12.1/mysqld_exporter --config.my-cnf=/data/mysql_exporter/conf/mysql_exporter.cnf
Restart=on-failure

[Install]
WantedBy=multi-user.target
  • Start mysql_exporter.
[root@centos ~]# systemctl start mysql_exporter.service
[root@centos ~]# systemctl enable mysql_exporter.service
[root@centos ~]# systemctl status mysql_exporter.service
● mysql_exporter.service - mysql_exporter
   Loaded: loaded (/etc/systemd/system/mysql_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-07-09 06:18:56 UTC; 29s ago
 Main PID: 20383 (mysqld_exporter)
   CGroup: /system.slice/mysql_exporter.service
           └─20383 /data/mysql_exporter/base/0.12.1/mysqld_exporter --config.my-cnf=/data/mysql_exporter/conf/mysql_exporter.cnf

Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg="Starting mysqld_exporter (version=0.12.1, branch=HEAD, revision=48667bf7c3b438b5e93b259f3d17b70a7c9aff...porter.go:257"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg="Build context (go=go1.12.7, user=root@0b3e56a7bc0a, date=20190729-12:35:58)" source="mysqld_exporter.go:258"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg="Enabled scrapers:" source="mysqld_exporter.go:269"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.global_status" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.global_variables" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.slave_status" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.info_schema.innodb_cmp" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.info_schema.innodb_cmpmem" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg=" --collect.info_schema.query_response_time" source="mysqld_exporter.go:273"
Jul 09 06:18:56 centos mysqld_exporter[20383]: time="2020-07-09T06:18:56Z" level=info msg="Listening on :9104" source="mysqld_exporter.go:283"
Hint: Some lines were ellipsized, use -l to show in full.
  • Verify that mysql_exporter can fetch MySQL status values.
[root@centos ~]# curl http://10.186.60.54:9104/metrics 2>/dev/null| tail
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 3
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
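The tail of the output only shows the exporter's own HTTP handler counters, which would look the same even if the MySQL login failed. mysqld_exporter also exposes a mysql_up gauge that is 1 when it can reach the server, so a check along these lines may be more telling. The helper name mysql_is_up is illustrative.

```shell
# Succeed (exit 0) only if the exposition text on stdin contains
# a mysql_up sample whose value is 1.
mysql_is_up() {
    awk '$1 == "mysql_up" { seen = 1; up = ($2 == 1) }
         END { exit !(seen && up) }'
}
```

For example: `curl -s http://10.186.60.54:9104/metrics | mysql_is_up && echo "exporter can reach MySQL"`.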
1.6 Update the Prometheus configuration
  • Add a job_name, override the scrape interval for it, set the mysql_exporter address and port, and define a label.
[root@centos ~]# vim /data/prometheus/conf/prometheus.yml 
global:
  scrape_interval:     15s # How often Prometheus scrapes targets. This sets the global default to 15s (the built-in default is 1m); individual jobs can override it.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'    # Name of the scrape job

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090'] # metrics endpoint (the equivalent of an exporter's address and port)

  - job_name: 'node'

    # Override the global default and scrape this job's targets every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['10.186.60.54:9100']
        labels:
          instance: system

  - job_name: 'mysql'

    # Override the global default and scrape this job's targets every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['10.186.60.54:9104']
        labels:
          instance: localhost:3306
  • Restart Prometheus, then open the targets page to confirm that the mysql job is being scraped.
[root@centos ~]# systemctl restart prometheus.service

http://10.186.60.54:9090/targets

1.7 Import the Grafana dashboard template
  • Official download page: https://grafana.com/grafana/download
  • Official documentation: https://grafana.com/docs/grafana/latest/getting-started/what-is-grafana/
  • Template used here: https://grafana.com/grafana/dashboards/7362

  • View the dashboard page.
