1. Prometheus
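Prometheus was inspired by Borgmon, Google's internal monitoring system (much as Kubernetes grew out of Google's Borg). Former Google engineers started developing it as open source software at SoundCloud in 2012, and an early version was announced publicly in early 2015. In May 2016 it became the second project to join the CNCF, after Kubernetes, and version 1.0 followed in July of the same year. Version 2.0, released in late 2017, introduced a completely new storage layer and works better with container and cloud platforms.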
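1.1 Overview
- The Prometheus community is very active and ships roughly one release a month; I started using it at v1.0.1 in 2016, and at the time of writing the latest release is v2.13.x;
- Written in Go: good performance, simple installation and deployment, and it runs well on many platforms;
- Built around a time series database, with a rich set of client libraries and many official and third-party open source exporters for common software;
- A powerful query language, PromQL, that fills the role SQL plays for relational databases;
- Good support for cloud native environments such as Kubernetes;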
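1.2 Installation
# Search for images
[root@freedev prometheus]# docker search prometheus
# Pull the image; the one with the most stars is usually the right choice
[root@freedev prometheus]# docker pull prom/prometheus
# Run the container
[root@freedev prometheus]# docker run -d --name prometheus -p 9090:9090 prom/prometheus
# Check whether the host port you mapped is already in use
[root@freedev prometheus]# netstat -tunlp | grep 9090
# List the ports open in the firewall (on a cloud server, also open the port in the security group)
[root@freedev prometheus]# firewall-cmd --zone=public --list-ports
# If the port is missing, add it (the later steps publish Prometheus on host port 9191)
[root@freedev prometheus]# firewall-cmd --zone=public --add-port=9191/tcp --permanent
# Reload the firewall rules
[root@freedev prometheus]# firewall-cmd --reload
# Remove the old container (the name is already taken) and start it again on the new host port
[root@freedev prometheus]# docker rm -f prometheus
[root@freedev prometheus]# docker run -d --name prometheus -p 9191:9090 prom/prometheus
# Check that the container is running
[root@freedev prometheus]# docker ps
# Copy the Prometheus config file to the host (needed later)
[root@freedev prometheus]# mkdir -p /data/prometheus
[root@freedev prometheus]# docker cp prometheus:/etc/prometheus/prometheus.yml /data/prometheus
Open http://ip:9191 in a browser to confirm that the installation succeeded.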
2. Collecting host metrics with Node Exporter
2.1 Overview
Node Exporter is also written in Go and has no third-party dependencies; download it, unpack it, and it is ready to run.
2.2 Installation
# Download
[root@freedev data]# curl -OL https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-amd64.tar.gz
# Unpack
[root@freedev data]# tar -zxvf node_exporter-1.2.2.linux-amd64.tar.gz
# Wrap it in a systemd service
[root@freedev node_exporter-1.2.2.linux-amd64]# vi /etc/systemd/system/node-exporter.service
# File contents; change User and ExecStart to match your environment
[Unit]
Description=Prometheus Node Exporter
After=network.target
[Service]
ExecStart=/data/node_exporter-1.2.2.linux-amd64/node_exporter
User=morton
[Install]
WantedBy=multi-user.target
# Reload systemd, enable the service at boot, and start it
[root@freedev node_exporter-1.2.2.linux-amd64]# systemctl daemon-reload
[root@freedev node_exporter-1.2.2.linux-amd64]# systemctl enable node-exporter
[root@freedev node_exporter-1.2.2.linux-amd64]# systemctl start node-exporter
# Check that it started successfully
[root@freedev node_exporter-1.2.2.linux-amd64]# systemctl status node-exporter
Open http://ip:9100/ to see the Node Exporter landing page.
Open http://ip:9100/metrics to see every metric Node Exporter currently collects from this host.
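2.3 Scraping metrics from Node Exporter
2.3.1. Edit the prometheus.yml file copied out of the Prometheus container earlier and add a job under scrape_configs. Because Prometheus runs inside a container, use the host's IP address as the target rather than localhost;
  # Scrape Node Exporter metrics
  - job_name: 'node'
    static_configs:
      - targets: ['ip:9100']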
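Indentation is the most common way to break a scrape job, since each entry has to line up under scrape_configs. The sketch below renders an entry with consistent indentation; `render_scrape_job` is a hypothetical helper written for this article, not part of Prometheus, and it assumes the usual two-space YAML layout.

```python
def render_scrape_job(job_name, targets, metrics_path="/metrics", indent=2):
    """Return one scrape_configs list entry as YAML text.

    `indent` is the number of spaces in front of the leading "- ",
    i.e. how far entries sit under the top-level scrape_configs key.
    """
    pad = " " * indent
    quoted = ", ".join(f"'{target}'" for target in targets)
    lines = [
        f"{pad}- job_name: '{job_name}'",
        f"{pad}  metrics_path: '{metrics_path}'",
        f"{pad}  static_configs:",
        f"{pad}    - targets: [{quoted}]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 'ip:9100' is the same placeholder used in the text above.
    print(render_scrape_job("node", ["ip:9100"]), end="")
```

Paste the printed text under scrape_configs in prometheus.yml; `/metrics` is the default path, so that line can be dropped for Node Exporter.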
2.3.2. Restart Prometheus;
# List the running containers
[root@freedev prometheus]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b06836eef793 grafana/grafana "/run.sh" 18 hours ago Up 18 hours 0.0.0.0:3000->3000/tcp grafana
e4b2f65737bd 907035594c57 "/bin/prometheus --c…" 18 hours ago Up 18 hours 0.0.0.0:9191->9090/tcp prometheus
# Stop and remove the container
[root@freedev prometheus]# docker stop e4b2f65737bd
e4b2f65737bd
[root@freedev prometheus]# docker rm e4b2f65737bd
e4b2f65737bd
# Start it again with the edited config file mounted into the container
[root@freedev prometheus]# docker run -d --name prometheus -p 9191:9090 -v /data/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
8105a7bcbf60d68948f40d54e0cfef69160ebb57df79fcaf0d6c20d546e2341f
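2.3.3. Open the web UI
Open http://ip:9191 (host port 9191 is mapped to the container's port 9090) to reach the Prometheus server. Enter "up" in the expression box and click Execute; if the result lists one series per scrape target, including the node job, the setup works.
A value of 1 means the target is healthy; 0 means the last scrape failed.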
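The same check can be scripted against the Prometheus HTTP API, which answers instant queries at `/api/v1/query?query=up` with JSON. The sketch below only interprets such a response; the `SAMPLE` payload is a hand-written illustration of the response shape, and `down_targets` is a hypothetical helper name.

```python
import json


def down_targets(response_text):
    """Return (job, instance) pairs whose `up` sample is not 1."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    down = []
    for series in body["data"]["result"]:
        labels = series["metric"]
        _timestamp, value = series["value"]  # the value arrives as a string
        if float(value) != 1.0:
            down.append((labels.get("job"), labels.get("instance")))
    return down


# Illustrative response only; real timestamps and targets will differ.
SAMPLE = """
{"status": "success",
 "data": {"resultType": "vector", "result": [
   {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
    "value": [1630000000.0, "1"]},
   {"metric": {"__name__": "up", "job": "node", "instance": "ip:9100"},
    "value": [1630000000.0, "0"]}
 ]}}
"""

if __name__ == "__main__":
    print(down_targets(SAMPLE))
```

To run it against a live server, fetch `http://ip:9191/api/v1/query?query=up` (for example with `urllib.request.urlopen`) and pass the response body to `down_targets`.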
3. Grafana
3.1 Installation
3.1.1. Install it with Docker as well
# Search for images
[root@freedev grafana]# docker search grafana
# Pull the image, then run the container
[root@freedev grafana]# docker pull grafana/grafana
[root@freedev grafana]# docker run -d --name grafana -p 3000:3000 -v /data/grafana:/var/lib/grafana -v /data/grafana/log:/var/log grafana/grafana
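3.1.2. Open http://ip:3000 to reach the login page
3.1.3. Log in
The default credentials are admin:admin. Grafana asks you to change the password on first login and then shows the home page.
3.2 Configuring the data source
3.2.1. Open the settings menu and add Prometheus as a data source
3.2.2. Select Prometheus
3.2.3. Enter the Prometheus address and click Save. Grafana runs in its own container, so use the host's IP and published port (for example http://ip:9191) rather than localhost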
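The data source can also be registered without the UI, through Grafana's HTTP API (`POST /api/datasources`). The following is a minimal sketch that only builds the request; `build_datasource_request` is a hypothetical helper, the URLs are the placeholders used in this article, and the credentials are whatever you set after the first login.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user, password):
    """Build (but do not send) the request that registers Prometheus in Grafana."""
    payload = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend queries Prometheus, not the browser
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_datasource_request(
        "http://ip:3000", "http://ip:9191", "admin", "your-password"
    )
    print(request.get_method(), request.full_url)
    # To actually create the data source, replace the placeholders and call:
    # urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect before anything is changed in Grafana.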
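3.3 Visualizing the data with a Grafana dashboard
3.3.1. Add a dashboard
3.3.2. Choose a dashboard
(You can import an offline JSON file or enter a dashboard ID to fetch it directly.) Grafana's dashboard directory already offers many high-quality community dashboards.
Enter the ID of the dashboard you picked and click Load; two recommendations are 11074 and 9276.
That completes host monitoring. Editing charts and writing queries is worth exploring further if you are interested.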
4. Exposing metrics from a Spring Boot application
4.1 Adding dependencies
- pom.xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- Micrometer connects the application to Prometheus -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.7.0</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
    <version>1.7.0</version>
</dependency>
- Enable the Prometheus endpoint in application.yml
management:
  metrics:
    enable:
      jvm: true
      logback: true
      process.files: true
      process.uptime: true
      process.start.time: true
      system.cpu: true
      process.cpu: true
      tomcat: true
      http: true
      system: true
    tags:
      application: ${spring.application.name}
  endpoints:
    web:
      exposure:
        # '*' exposes every Actuator endpoint, including /actuator/prometheus;
        # in production, list only the ones you need, e.g. prometheus,health
        include: '*'
- Start the application; the console log lists the exposed Actuator endpoints
- Open http://localhost:8888/actuator/prometheus to see metrics like the following
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{application="monitoring-prometheus-grafana",id="direct",} 16384.0
jvm_buffer_total_capacity_bytes{application="monitoring-prometheus-grafana",id="mapped",} 0.0
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads{application="monitoring-prometheus-grafana",} 23.0
# HELP tomcat_global_received_bytes_total
# TYPE tomcat_global_received_bytes_total counter
tomcat_global_received_bytes_total{application="monitoring-prometheus-grafana",name="http-nio-8888",} 0.0
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{application="monitoring-prometheus-grafana",level="error",} 0.0
logback_events_total{application="monitoring-prometheus-grafana",level="warn",} 0.0
logback_events_total{application="monitoring-prometheus-grafana",level="trace",} 0.0
logback_events_total{application="monitoring-prometheus-grafana",level="info",} 39.0
logback_events_total{application="monitoring-prometheus-grafana",level="debug",} 0.0
# HELP tomcat_cache_access_total
# TYPE tomcat_cache_access_total counter
tomcat_cache_access_total{application="monitoring-prometheus-grafana",} 0.0
# HELP tomcat_sessions_alive_max_seconds
# TYPE tomcat_sessions_alive_max_seconds gauge
tomcat_sessions_alive_max_seconds{application="monitoring-prometheus-grafana",} 0.0
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total{application="monitoring-prometheus-grafana",} 6726088.0
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes{application="monitoring-prometheus-grafana",} 1.6494984E7
# HELP tomcat_sessions_expired_sessions_total
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total{application="monitoring-prometheus-grafana",} 0.0
# HELP tomcat_sessions_rejected_sessions_total
# TYPE tomcat_sessions_rejected_sessions_total counter
tomcat_sessions_rejected_sessions_total{application="monitoring-prometheus-grafana",} 0.0
# HELP tomcat_threads_current_threads
# TYPE tomcat_threads_current_threads gauge
tomcat_threads_current_threads{application="monitoring-prometheus-grafana",name="http-nio-8888",} 10.0
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# (many more lines omitted)
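Each non-comment line in this output follows the Prometheus text exposition format: a metric name, an optional set of labels in braces, and a numeric value. The sketch below parses that shape, using two lines taken from the output above. It is a simplified illustration rather than a full parser (it skips lines with a trailing timestamp and does not unescape label values), and `parse_metrics` is a hypothetical helper name.

```python
import re

# name, optional {labels}, value; lines carrying a timestamp are skipped here
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


SAMPLE = """
# TYPE logback_events_total counter
logback_events_total{application="monitoring-prometheus-grafana",level="info",} 39.0
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes{application="monitoring-prometheus-grafana",} 1.6494984E7
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

The `application` label on every sample comes from the `management.metrics.tags.application` setting, which is what lets a Grafana dashboard filter by application.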
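4.2 Configuring prometheus.yml
  - job_name: 'monitoring-prometheus-grafana'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    # basic_auth is only needed when the endpoint is protected by HTTP basic auth
    # (for example via Spring Security); otherwise remove this block
    basic_auth:
      username: 'actuator'
      password: 'actuator'
    static_configs:
      - targets: ['ip:port']  # IP and port of the machine running the application
        labels:
          instance: monitoring-prometheus-grafana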
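4.3 Configuring Grafana and choosing a dashboard
Dashboards 4701 and 12900 are recommended; import them the same way as in section 3.3.
4.4 Dashboard overview
Prometheus also supports custom metrics. That topic is outside the scope of this article, but it is worth exploring if you are interested.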