1 Tomcat monitoring approaches
Tomcat does not natively expose metrics in a Prometheus-compatible format, so a third-party exporter is needed: tomcat_exporter or jmx_exporter.
2 tomcat_exporter
2.1 Install Tomcat
2.1.1 Extract the archive
tar -xf apache-tomcat-9.0.73.tar.gz -C /app/module/
2.1.2 Download the required JARs
# Quote the URLs (they contain "?") and use -O so each file is saved under its real name
wget -O simpleclient-0.12.0.jar "https://search.maven.org/remotecontent?filepath=io/prometheus/simpleclient/0.12.0/simpleclient-0.12.0.jar"
wget -O simpleclient_common-0.12.0.jar "https://search.maven.org/remotecontent?filepath=io/prometheus/simpleclient_common/0.12.0/simpleclient_common-0.12.0.jar"
wget -O simpleclient_hotspot-0.12.0.jar "https://search.maven.org/remotecontent?filepath=io/prometheus/simpleclient_hotspot/0.12.0/simpleclient_hotspot-0.12.0.jar"
wget -O simpleclient_servlet-0.12.0.jar "https://search.maven.org/remotecontent?filepath=io/prometheus/simpleclient_servlet/0.12.0/simpleclient_servlet-0.12.0.jar"
wget -O simpleclient_servlet_common-0.12.0.jar "https://search.maven.org/remotecontent?filepath=io/prometheus/simpleclient_servlet_common/0.12.0/simpleclient_servlet_common-0.12.0.jar"
wget -O tomcat_exporter_client-0.0.15.jar "https://search.maven.org/remotecontent?filepath=nl/nlighten/tomcat_exporter_client/0.0.15/tomcat_exporter_client-0.0.15.jar"
wget -O tomcat_exporter_servlet-0.0.15.war "https://search.maven.org/remotecontent?filepath=nl/nlighten/tomcat_exporter_servlet/0.0.15/tomcat_exporter_servlet-0.0.15.war"
# Alternatively, download everything as a single bundle
wget https://github.com/Im-oldxu/tomcat-exporter/releases/download/tomcat_exporter-0.0.17/tomcat_exporter.tar.gz
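2.1.3 Copy the JARs and the WAR
Copy the JAR files and the WAR file into their respective Tomcat directories. Tomcat derives a web application's context path from the WAR file name, so if the WAR is not already named metrics.war, rename it so that it is served under /metrics/.
tar -xf tomcat_exporter.tar.gz
mv tomcat_exporter/*.war /app/module/apache-tomcat-9.0.73/webapps/
mv tomcat_exporter/*.jar /app/module/apache-tomcat-9.0.73/lib/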
2.1.4 Start Tomcat
/app/module/apache-tomcat-9.0.73/bin/startup.sh
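2.1.5 Access the Tomcat metrics
curl http://localhost:8080/metrics/  # the trailing slash is required; without it Tomcat typically answers with a redirect, which curl does not follow unless -L is given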
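To check the endpoint from a script rather than by eye, the text returned by /metrics/ can be parsed line by line. The following is a minimal sketch in Python, assuming the standard Prometheus text exposition format. The parse_metrics helper and the sample payload (including its HELP text and values) are illustrative assumptions, not actual tomcat_exporter output.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified sketch: comment lines are skipped, optional timestamps are
    ignored, and label values are assumed to contain no spaces.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        series, value = parts[0], parts[1]
        samples[series] = float(value)
    return samples


# Hypothetical sample, shaped like the output of the /metrics/ endpoint.
SAMPLE = """\
# HELP tomcat_connections_active_total Number of active connections
# TYPE tomcat_connections_active_total gauge
tomcat_connections_active_total{name="http-nio-8080"} 3.0
tomcat_connections_active_max{name="http-nio-8080"} 8192.0
"""

metrics = parse_metrics(SAMPLE)
print(metrics['tomcat_connections_active_total{name="http-nio-8080"}'])
```

In practice the text would come from an HTTP GET against the URL above; the parsing step stays the same.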
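2.2 Configure Prometheus
1. Edit the Prometheus configuration file and add Tomcat as a scrape target (under scrape_configs)
  - job_name: "tomcat_exporter"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["192.168.137.131:8080"]
2. Reload the Prometheus configuration (the /-/reload endpoint is only available when Prometheus was started with --web.enable-lifecycle)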
curl -X POST http://192.168.137.131:9090/-/reload
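2.3 Common Tomcat metrics and examples
For Tomcat, service quality is usually assessed with three signals: request rate (Rate), failed requests (Errors), and request latency (Duration).
2.3.1 Tomcat connection metrics
Metric name | Type | Meaning |
tomcat_connections_active_total | gauge | Current number of active Tomcat connections |
tomcat_connections_active_max | gauge | Maximum number of connections Tomcat accepts (set on the <Connector> element, e.g. maxConnections="200") |
Example: compute the saturation of Tomcat's connection limit.
Formula: current active connections / maximum connections * 100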
tomcat_connections_active_total / tomcat_connections_active_max * 100
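The same arithmetic can be sketched outside PromQL, which makes the edge case explicit. The saturation_percent helper below is a hypothetical name used only for illustration. It guards against a non-positive maximum, which is worth handling because, to my understanding, Tomcat allows maxConnections="-1" to disable the limit, and the ratio is then meaningless.

```python
def saturation_percent(active, maximum):
    """Return active / maximum as a percentage, or None if maximum <= 0."""
    if maximum <= 0:
        # No usable limit (for example a disabled connection limit).
        return None
    return 100.0 * active / maximum


print(saturation_percent(160, 200))
print(saturation_percent(5, -1))
```

With 160 active connections out of 200, the result crosses an 80% alerting threshold exactly at the boundary.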
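2.3.2 Tomcat request metrics
Metric name | Type | Meaning |
tomcat_requestprocessor_error_count_total | counter | Total number of requests that ended in an error since Tomcat started (for example client disconnects or internal server errors) |
tomcat_requestprocessor_request_count_total | counter | Total number of HTTP requests processed since Tomcat started |
tomcat_requestprocessor_time_seconds | gauge | Total time Tomcat has spent processing requests, in seconds. It is exposed as a gauge, but its value keeps accumulating, so it behaves like a counter |
tomcat_requestprocessor_sent_bytes | gauge | Total number of bytes sent by Tomcat |
tomcat_requestprocessor_received_bytes | gauge | Total number of bytes received by Tomcat |
Example 1: compute the percentage of HTTP requests that ended in an error over the last 5 minutes.
Formula: errored requests in the last 5 minutes / total requests in the last 5 minutes * 100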
rate(tomcat_requestprocessor_error_count_total[5m]) / rate(tomcat_requestprocessor_request_count_total[5m]) * 100
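To see why dividing two rates yields a percentage, note that both counters are measured over the same window, so the window length cancels out and only the two increases matter. The sketch below reproduces that with two snapshots of each counter. The function names and sample numbers are made up for illustration, and the reset handling is a simplification of what Prometheus does.

```python
def counter_increase(first, last):
    """Increase between two counter samples.

    If the counter went down, assume it was reset to zero (for example by a
    Tomcat restart) and count only what accumulated after the reset.
    """
    return last - first if last >= first else last


def error_percent(err_first, err_last, req_first, req_last):
    """Percentage of requests that errored between two snapshots."""
    errors = counter_increase(err_first, err_last)
    requests = counter_increase(req_first, req_last)
    if requests == 0:
        return None  # no traffic in the window; PromQL would yield NaN
    return 100.0 * errors / requests


# Hypothetical snapshots taken 5 minutes apart.
print(error_percent(10, 16, 1000, 1300))
```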
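Example 2: compute the average time Tomcat spent on each request over the last 5 minutes.
Formula: increase in total processing time over 5 minutes / increase in request count over 5 minutes (the rate of the time metric alone gives seconds of processing per second, not per request)
rate(tomcat_requestprocessor_time_seconds[5m]) / rate(tomcat_requestprocessor_request_count_total[5m])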
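Average time per request over a window is the growth in total processing time divided by the growth in request count over the same window. A minimal sketch with hypothetical snapshot values follows; avg_seconds_per_request is an illustrative name, not an exporter API, and counter resets are ignored for brevity.

```python
def avg_seconds_per_request(time_first, time_last, count_first, count_last):
    """Average processing time per request between two snapshots."""
    requests = count_last - count_first
    if requests <= 0:
        return None  # no requests completed in the window
    return (time_last - time_first) / requests


# Hypothetical: 30 s of processing time spread over 300 requests.
print(avg_seconds_per_request(120.0, 150.0, 1000, 1300))
```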
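2.3.3 Tomcat session metrics
Metric name | Type | Meaning |
tomcat_session_created_total | gauge | Total number of sessions created since Tomcat started |
tomcat_session_active_total | gauge | Current number of active Tomcat sessions |
tomcat_session_rejected_total | gauge | Number of sessions Tomcat refused to create (typically because the configured maximum number of sessions was reached) |
Example 1: compute the rate at which Tomcat creates sessions.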
sum(rate(tomcat_session_created_total[5m])) by(instance,job,host)
Example 2: compute rejected sessions as a percentage of all session creation attempts.
Formula: rejected sessions / (created sessions + rejected sessions) * 100
tomcat_session_rejected_total / (tomcat_session_created_total + tomcat_session_rejected_total) * 100
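2.3.4 Tomcat thread metrics
Metric name | Type | Meaning |
tomcat_threads_max | gauge | Maximum number of threads allowed in the Tomcat thread pool (set on the <Connector> element, e.g. maxThreads="200") |
tomcat_threads_active_total | gauge | Number of threads currently busy processing requests |
Example 1: compute active request threads as a percentage of the maximum thread count.
Formula: current active threads / maximum threads * 100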
tomcat_threads_active_total / tomcat_threads_max * 100
2.4 Tomcat alerting rules
2.4.1 Alerting rule file
vim /app/module/prometheus/rules/tomcat_rules.yml
groups:
- name: Tomcat alert rules
  rules:
  - alert: TomcatActiveConnectionsHigh
    expr: tomcat_connections_active_total / tomcat_connections_active_max * 100 >= 80
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Tomcat active connection count is too high, instance: {{ $labels.instance }}"
      description: |
        Maximum Tomcat connections: {{ printf `tomcat_connections_active_max{instance="%s",job="%s",name="%s"}` $labels.instance $labels.job $labels.name | query | first | value }}
        Current Tomcat connections: {{ printf `tomcat_connections_active_total{instance="%s",job="%s",name="%s"}` $labels.instance $labels.job $labels.name | query | first | value }}
        Active connections have reached 80% of the maximum; current value: {{ $value }}%
  - alert: TomcatRequestProcessingOver5s
    # Average seconds per request: increase in processing time / increase in request count
    expr: rate(tomcat_requestprocessor_time_seconds[5m]) / rate(tomcat_requestprocessor_request_count_total[5m]) > 5
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Tomcat request processing time is too long, instance: {{ $labels.instance }}"
      description: "Average request processing time over the last 5 minutes exceeded 5 seconds; current value: {{ $value }}."
  - alert: TomcatSessionRejectionRateOver20Percent
    expr: (tomcat_session_rejected_total / (tomcat_session_created_total + tomcat_session_rejected_total)) * 100 > 20
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Tomcat session rejection rate is too high, instance: {{ $labels.instance }}"
      description: "Session rejection rate for context {{ $labels.context }} on host {{ $labels.host }} exceeded 20%; current value: {{ $value }}."
  - alert: TomcatThreadUsageHigh
    expr: (tomcat_threads_active_total / tomcat_threads_max) * 100 > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Tomcat thread usage is too high, instance: {{ $labels.instance }}"
      description: |
        Maximum Tomcat threads: {{ printf `tomcat_threads_max{instance="%s",job="%s",name="%s"}` $labels.instance $labels.job $labels.name | query | first | value }}
        Current Tomcat threads: {{ printf `tomcat_threads_active_total{instance="%s",job="%s",name="%s"}` $labels.instance $labels.job $labels.name | query | first | value }}
        Active threads have exceeded 80% of the maximum thread count; current value: {{ $value }}%
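The for: clause in each rule means the expression must stay true across consecutive evaluations for that long before the alert fires; until then it is only pending. The sketch below is a simplified model of that behaviour, not Prometheus's actual implementation. The alert_state helper, the sample timestamps, and the >= comparison (mirroring the first rule) are assumptions for illustration.

```python
def alert_state(samples, threshold, for_seconds):
    """Return 'inactive', 'pending' or 'firing' after the last evaluation.

    samples: list of (timestamp_seconds, value), ordered by time, one entry
    per rule evaluation.
    """
    active_since = None
    for ts, value in samples:
        if value >= threshold:
            if active_since is None:
                active_since = ts  # condition just became true
        else:
            active_since = None  # any false evaluation resets the timer
    if active_since is None:
        return "inactive"
    last_ts = samples[-1][0]
    return "firing" if last_ts - active_since >= for_seconds else "pending"


# Saturation above 80% for a full minute: the alert fires.
print(alert_state([(0, 85), (15, 90), (30, 88), (45, 91), (60, 83)], 80, 60))
# A dip below the threshold at t=30 restarts the wait.
print(alert_state([(0, 85), (30, 70), (60, 88)], 80, 60))
```

This is why a short spike does not page anyone: the connection rule needs a full minute above the threshold, the others five.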
2.4.2 Check the rule syntax
/app/module/prometheus/promtool check rules /app/module/prometheus/rules/tomcat_rules.yml
2.4.3 Reload Prometheus
curl -X POST http://192.168.137.131:9090/-/reload
2.4.4 Verify the alerting rules
2.5 Import the Tomcat dashboard
Download the corresponding dashboard:
wget https://mirror.ghproxy.com/https://github.com/nlighten/tomcat_exporter/blob/master/dashboard/example.json
3 jmx_exporter
3.1 Add the agent configuration to Tomcat
vim /app/module/apache-tomcat-9.0.73/bin/catalina.sh
# Add the following line near the top of the file
JAVA_OPTS="-Xms512m -Xmx1024m -javaagent:/app/module/jmx_exporter/jmx_prometheus_javaagent-0.20.0.jar=12346:/app/module/jmx_exporter/tomcat.yml"
3.2 Add tomcat.yml
# Template location: https://github.com/prometheus/jmx_exporter/blob/main/example_configs/
vim /app/module/jmx_exporter/tomcat.yml
---
lowercaseOutputLabelNames: true
lowercaseOutputName: true
whitelistObjectNames: ["java.lang:type=OperatingSystem", "Catalina:*"]
blacklistObjectNames: []
rules:
  - pattern: 'Catalina<type=Server><>serverInfo: (.+)'
    name: tomcat_serverinfo
    value: 1
    labels:
      serverInfo: "$1"
    type: COUNTER
  - pattern: 'Catalina<type=GlobalRequestProcessor, name=\"(\w+-\w+)-(\d+)\"><>(\w+):'
    name: tomcat_$3_total
    labels:
      port: "$2"
      protocol: "$1"
    help: Tomcat global $3
    type: COUNTER
  - pattern: 'Catalina<j2eeType=Servlet, WebModule=//([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), name=([-a-zA-Z0-9+/$%~_-|!.]*), J2EEApplication=none, J2EEServer=none><>(requestCount|processingTime|errorCount):'
    name: tomcat_servlet_$3_total
    labels:
      module: "$1"
      servlet: "$2"
    help: Tomcat servlet $3 total
    type: COUNTER
  - pattern: 'Catalina<type=ThreadPool, name="(\w+-\w+)-(\d+)"><>(currentThreadCount|currentThreadsBusy|keepAliveCount|connectionCount|acceptCount|acceptorThreadCount|pollerThreadCount|maxThreads|minSpareThreads):'
    name: tomcat_threadpool_$3
    labels:
      port: "$2"
      protocol: "$1"
    help: Tomcat threadpool $3
    type: GAUGE
  - pattern: 'Catalina<type=Manager, host=([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), context=([-a-zA-Z0-9+/$%~_-|!.]*)><>(processingTime|sessionCounter|rejectedSessions|expiredSessions):'
    name: tomcat_session_$3_total
    labels:
      context: "$2"
      host: "$1"
    help: Tomcat session $3 total
    type: COUNTER
  - pattern: 'java.lang<type=OperatingSystem><>(committed_virtual_memory|free_physical_memory|free_swap_space|total_physical_memory|total_swap_space)_size:'
    name: os_$1_bytes
    type: GAUGE
    attrNameSnakeCase: true
  - pattern: 'java.lang<type=OperatingSystem><>((?!process_cpu_time)\w+):'
    name: os_$1
    type: GAUGE
    attrNameSnakeCase: true
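Each rule matches a string of the form domain<bean properties><>attribute: value and uses the capture groups to build the metric name and labels. The sketch below walks the ThreadPool rule through that mapping using Python's re module as a stand-in for the Java regex engine that jmx_exporter actually uses. The attribute list is shortened, and the input string and the to_metric helper are hypothetical.

```python
import re

# ThreadPool rule pattern from tomcat.yml, with the attribute list shortened.
PATTERN = re.compile(
    r'Catalina<type=ThreadPool, name="(\w+-\w+)-(\d+)"><>'
    r'(currentThreadCount|currentThreadsBusy|maxThreads):'
)


def to_metric(bean_attribute):
    """Map one 'domain<props><>attr: value' string to (name, labels), or None."""
    m = PATTERN.search(bean_attribute)
    if m is None:
        return None
    protocol, port, attr = m.groups()
    # lowercaseOutputName: true lowercases the generated metric name.
    name = f"tomcat_threadpool_{attr}".lower()
    return name, {"port": port, "protocol": protocol}


# Hypothetical input for an HTTP connector listening on port 8080.
print(to_metric('Catalina<type=ThreadPool, name="http-nio-8080"><>currentThreadsBusy: 3'))
```

Under these assumptions the attribute surfaces as tomcat_threadpool_currentthreadsbusy with port and protocol labels, which is the shape to expect when querying the jmx_exporter endpoint.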
3.3 Restart Tomcat
/app/module/apache-tomcat-9.0.73/bin/shutdown.sh
/app/module/apache-tomcat-9.0.73/bin/startup.sh
3.4 Configure Prometheus
  - job_name: "java"  # job_name is fixed: it must be "java"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["192.168.137.131:12346"]
3.5 Reload Prometheus
curl -X POST http://192.168.137.131:9090/-/reload
3.6 Import the JVM dashboard
Import a JVM dashboard template into Grafana. The dashboard ID is 8563.