上一篇文章介绍了JDK自带的一些可以查看JVM各种参数的工具,但在linux服务器上没有可视化的工具使用,如果通过远程连接到linux服务器还需要启动jstatdJMX等,是存在一些安全隐患的。

但JDK自带的一些基本命令行工具虽然可以查看内容,但也不是特别方便,不能够直观的看到JVM内部的情况,于是阿里巴巴在2018年9月推出了一款Java诊断工具Arthas,它支持JDK6+,采用命令行的交互方式,可以很方便的定位和诊断线上程序运行的问题。

Arthas可以看作是对JDK工具包的一种再次封装,同时提供了更多强大且丰富的功能。

一、安装Arthas

Arthas是一个类似于SpringBoot的应用包,直接下载arthas-boot.jar然后通过java -jar命令执行就可以了

通过下面的命令下载应用包:

curl -O https://arthas.aliyun.com/arthas-boot.jar

也可以通过wget命令来下载:

# github下载arthas
wget https://alibaba.github.io/arthas/arthas-boot.jar
# 或者 Gitee 下载
wget https://arthas.gitee.io/arthas-boot.jar

现在完成后,就可以看到当前目录下有一个arthas-boot.jar的JAR包,直接运行Jar包就可以了

java -jar arthas-boot.jar

也可以通过参数-h来查看有哪些选项

[root@lizhi arthas]# java -jar arthas-boot.jar -h
[INFO] arthas-boot version: 3.6.1
Usage: arthas-boot [-h] [--target-ip <value>] [--telnet-port <value>]
       [--http-port <value>] [--session-timeout <value>] [--arthas-home <value>]
       [--use-version <value>] [--repo-mirror <value>] [--versions] [--use-http]
       [--attach-only] [-c <value>] [-f <value>] [--height <value>] [--width
       <value>] [-v] [--tunnel-server <value>] [--agent-id <value>] [--app-name
       <value>] [--username <value>] [--password <value>] [--stat-url <value>]
       [--select <value>] [--disabled-commands <value>] [pid]

Bootstrap Arthas

EXAMPLES:
  java -jar arthas-boot.jar <pid>
  java -jar arthas-boot.jar --target-ip 0.0.0.0
  java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
  java -jar arthas-boot.jar --username admin --password <password>
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
--app-name demoapp
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
--agent-id bvDOe8XbTM2pQWjF4cfw
  java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
  java -jar arthas-boot.jar -c 'sysprop; thread' <pid>
  java -jar arthas-boot.jar -f batch.as <pid>
  java -jar arthas-boot.jar --use-version 3.6.1
  java -jar arthas-boot.jar --versions
  java -jar arthas-boot.jar --select math-game
  java -jar arthas-boot.jar --session-timeout 3600
  java -jar arthas-boot.jar --attach-only
  java -jar arthas-boot.jar --disabled-commands stop,dump

启动完成后,就会把指定服务器上所有的JVM进程全部展示出来,类似于jps命令

[root@lizhi arthas]# java -jar arthas-boot.jar 
[INFO] arthas-boot version: 3.6.1
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 17540 org.apache.rocketmq.namesrv.NamesrvStartup
  [2]: 1477 /usr/local/nacos/target/nacos-server.jar
  [3]: 2559202 /usr/local/sentinel/sentinel-dashboard-1.8.3.jar
  [4]: 55755 kafka.Kafka
  [5]: 1892366 org.apache.rocketmq.broker.BrokerStartup
  [6]: 56755 kafka.Kafka
  [7]: 57652 play.core.server.ProdServerStart
  [8]: 1890795 io.seata.server.Server
  [9]: 2559036 /usr/local/shop-public/shop-gateway/shop-gateway-1.0.jar
  [10]: 1890722 io.seata.server.Server
  [11]: 50591 org.apache.zookeeper.server.quorum.QuorumPeerMain

每个JVM进程前面都有一个序号,想要查看哪个JVM进程的信息,在命令行输出序号,然后回车即可,相当于把Arthas绑定到哪个JVM进程上,在这个过程中,Arthas会把所有相关的程序包全部下载下来(选择查看9号JVM进程)

9
[INFO] Start download arthas from remote server: https://arthas.aliyun.com/download/3.6.1?mirror=aliyun
[INFO] Download arthas success.
[INFO] arthas home: /root/.arthas/lib/3.6.1/arthas
[INFO] Try to attach process 2559036
[INFO] Attach process 2559036 success.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.                           
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'                          
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.                          
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |                         
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'                          

wiki       https://arthas.aliyun.com/doc                                        
tutorials  https://arthas.aliyun.com/doc/arthas-tutorials.html                  
version    3.6.1                                                                
main_class                                                                      
pid        2559036                                                              
time       2022-05-23 13:56:56

然后在**/root/.arthas/lib/3.6.1/arthas**目录下可以看到所有的应用包:

[root@lizhi arthas]# cd /root/.arthas/lib/3.6.1/arthas
[root@lizhi arthas]# ll
total 13608
-rw-r--r-- 1 root root     8459 May 23 13:56 arthas-agent.jar
-rw-r--r-- 1 root root   141890 May 23 13:56 arthas-boot.jar
-rw-r--r-- 1 root root   430942 May 23 13:56 arthas-client.jar
-rw-r--r-- 1 root root 13266497 May 23 13:56 arthas-core.jar
-rw-r--r-- 1 root root      531 May 23 13:56 arthas.properties
-rw-r--r-- 1 root root     5445 May 23 13:56 arthas-spy.jar
-rw-r--r-- 1 root root     3113 May 23 13:56 as.bat
-rw-r--r-- 1 root root     7744 May 23 13:56 as-service.bat
-rw-r--r-- 1 root root    33257 May 23 13:56 as.sh
drwxr-xr-x 2 root root      113 May 23 13:56 async-profiler
-rw-r--r-- 1 root root      635 May 23 13:56 install-local.sh
drwxr-xr-x 2 root root      108 May 23 13:56 lib
-rw-r--r-- 1 root root     2020 May 23 13:56 logback.xml
-rw-r--r-- 1 root root     4493 May 23 13:56 math-game.jar

下面就可以通过Arthas提供的命令来查看JVM内部的各种数据。

二、Arthas命令示例

2.1 dashboard

命令行输入dashboard可以查看整个JVM进程的运行情况,包括线程、内存、GC、运行环境等信息,会动态刷新,默认每5s刷新一次,可以按ctrl+c退出该命令。

基本信息如下:

[arthas@2559036]$ dashboard
ID      NAME                                            GROUP                   PRIORITY        STATE          %CPU            DELTA_TIME      TIME            INTERRUPTED     DAEMON          
53      sentinel-time-tick-thread                       main                    5               TIMED_WAITING  0.0             0.000           708:14.338      false           true            
20      com.alibaba.nacos.client.Worker.fixed-120.79.22 main                    5               TIMED_WAITING  0.0             0.000           115:8.497       false           true            
35      com.alibaba.nacos.client.Worker.fixed-120.79.22 main                    5               TIMED_WAITING  0.0             0.000           113:38.550      false           true            
-1      G1 Young RemSet Sampling                        -                       -1              -              0.0             0.000           45:56.088       false           true            
54      sentinel-system-status-record-task-thread-1     main                    5               TIMED_WAITING  0.0             0.000           40:10.114       false           true            
-1      VM Periodic Task Thread                         -                       -1              -              0.0             0.000           33:33.421       false           true            
-1      VM Thread                                       -                       -1              -              0.0             0.000           11:32.233       false           true            
-1      GC Thread#3                                     -                       -1              -              0.0             0.000           7:46.382        false           true            
-1      GC Thread#1                                     -                       -1              -              0.0             0.000           7:46.226        false           true            
-1      GC Thread#0                                     -                       -1              -              0.0             0.000           7:45.582        false           true            
-1      GC Thread#2                                     -                       -1              -              0.0             0.000           7:45.561        false           true            
49      sentinel-command-center-service-executor-thread main                    5               WAITING        0.0             0.000           6:54.034        false           false           
66      sentinel-command-center-service-executor-thread main                    5               WAITING        0.0             0.000           6:52.469        false           false           
65      sentinel-command-center-service-executor-thread main                    5               WAITING        0.0             0.000           6:52.454        false           false           
67      sentinel-command-center-service-executor-thread main                    5               WAITING        0.0             0.000           6:52.070        false           false           
45      com.alibaba.nacos.client.naming.updater         main                    5               WAITING        0.0             0.000           6:45.740        false           true            
55      com.alibaba.nacos.client.naming.updater         main                    5               TIMED_WAITING  0.0             0.000           6:45.310        false           true            
64      com.alibaba.nacos.naming.beat.sender            main                    5               WAITING        0.0             0.000           5:24.288        false           true            
58      com.alibaba.nacos.naming.beat.sender            main                    5               TIMED_WAITING  0.0             0.000           5:21.624        false           true            
29      sentinel-heartbeat-send-task-thread-1           main                    5               TIMED_WAITING  0.0             0.000           5:1.846         false           true            
63      sentinel-heartbeat-send-task-thread-2           main                    5               WAITING        0.0             0.000           4:4.563         false           true            
48      boundedElastic-1                                main                    5               WAITING        0.0             0.000           2:55.856        false           true            
52      boundedElastic-4                                main                    5               WAITING        0.0             0.000           2:55.706        false           true            
31      mysql-cj-abandoned-connection-cleanup           main                    5               TIMED_WAITING  0.0             0.000           2:28.868        false           true            
18      com.alibaba.nacos.client.identify.watcher.0     main                    5               TIMED_WAITING  0.0             0.000           1:54.626        false           true            
70      com.alibaba.nacos.client.Worker.longPolling.fix main                    5               WAITING        0.0             0.000           1:33.477        false           true            
28      sentinel-command-center-executor-thread-1       main                    5               RUNNABLE       0.0             0.000           1:18.159        false           false           
39      com.alibaba.nacos.naming.failover               main                    5               TIMED_WAITING  0.0             0.000           1:9.303         false           true            
46      Nacos-Watch-Task-Scheduler-1                    main                    5               TIMED_WAITING  0.0             0.000           1:7.794         false           false           
Memory                                   used          total         max          usage         GC                                                                                             
heap                                     201M          512M          512M         39.33%        gc.g1_young_generation.count                    3722                                           
g1_eden_space                            170M          322M          -1           52.80%        gc.g1_young_generation.time(ms)                 7174                                           
g1_old_gen                               30M           189M          512M         5.93%         gc.g1_old_generation.count                      1                                              
g1_survivor_space                        1M            1M            -1           100.00%       gc.g1_old_generation.time(ms)                   87                                             
nonheap                                  120M          126M          -1           95.43%                                                                                                       
codeheap_'non-nmethods'                  1M            2M            5M           23.89%                                                                                                       
metaspace                                74M           75M           -1           97.54%                                                                                                       
codeheap_'profiled_nmethods'             22M           24M           117M         19.38%                                                                                                       
compressed_class_space                   8M            9M            1024M        0.86%                                                                                                        
codeheap_'non-profiled_nmethods'         13M           13M           117M         11.50%                                                                                                       
mapped                                   0K            0K            -            0.00%                                                                                                        
direct                                   80M           80M           -            100.00%                                                                                                      
Runtime                                                                                                                                                                                        
os.name                                                                                         Linux                                                                                          
os.version                                                                                      4.18.0-240.22.1.el8_3.x86_64                                                                   
java.version                                                                                    11.0.13                                                                                        
java.home                                                                                       /usr/lib/jvm/java-11-openjdk-11.0.13.0.8-4.el8_5.x86_64                                        
systemload.average                                                                              0.02                                                                                           
processors                                                                                      4                                                                                              
timestamp/uptime                                                                                Mon May 23 14:01:51 CST 2022/7850706s

2.2 thread

输入thread可以查看所有线程的情况

[arthas@2559036]$ thread
Threads Total: 85, NEW: 0, RUNNABLE: 17, BLOCKED: 0, WAITING: 33, TIMED_WAITING: 21, TERMINATED: 0, Internal threads: 14                                                                       
ID      NAME                                            GROUP                   PRIORITY        STATE          %CPU            DELTA_TIME      TIME            INTERRUPTED     DAEMON          
53      sentinel-time-tick-thread                       main                    5               TIMED_WAITING  0.56            0.001           708:16.553      false           true            
58      com.alibaba.nacos.naming.beat.sender            main                    5               TIMED_WAITING  0.19            0.000           5:21.661        false           true            
61869   arthas-command-execute                          system                  5               RUNNABLE       0.18            0.000           0:0.004         false           true            
35      com.alibaba.nacos.client.Worker.fixed-120.79.22 main                    5               TIMED_WAITING  0.1             0.000           113:38.912      false           t

后面加上线程对应id可以查看线程目前的状态

[arthas@2559036]$ thread 29
"sentinel-heartbeat-send-task-thread-1" Id=29 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3f5c9b9f
    at java.base@11.0.13/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3f5c9b9f
    at java.base@11.0.13/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
    at java.base@11.0.13/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
    at java.base@11.0.13/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
    at java.base@11.0.13/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
    at java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
    at java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
    at java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base@11.0.13/java.lang.Thread.run(Thread.java:829)

通过thread -h可以查看更过的选项

除了上面这些命令外,Arthas还提供了反编译、线上数据监测(Debug)等强大的功能,更多参数可以参考Arthas的官方文档

三、GC日志解析

对于Java应用,可以通过一些配置把程序运行过程中的GC日志全部打印出来,然后分析GC日志得到关键性指标,分析GC原因,调优JVM参数。

打印GC日志方法,在JVM参数里增加参数,%t 代表时间

-Xloggc:./gc-%t.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps  -XX:+PrintGCTimeStamps -XX:+PrintGCCause  
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M

Xloggc:日志文件位置

PrintGCDetails:打印GC的详细信息

PrintGCDateStamps:打印GC发生的时间

PrintGCTimeStamps:打印GC的耗时

PrintGCCause:打印导致GC的原因

UseGCLogFileRotation:GC日志文件轮流写,比如配置了10个GC文件,当这些GC文件都写满后,覆盖第一个GC文件继续写

NumberOfGCLogFiles:GC日志文件的个数

GCLogFileSize:GC日志文件大小。

注:上面这些参数都是在JDK1.8中,在JDK11中这些参数大部分都被移除了

下面以一条具体的GC日志来分析:

2022-05-23T16:23:08.413+0800: 40.485: [GC (Allocation Failure) 2022-05-23T16:23:08.413+0800: 40.485: [DefNew: 8640K->960K(8640K), 0.0127017 secs]2022-05-23T16:23:08.425+0800: 40.498: [Tenured
: 21595K->19990K(21632K), 0.0550609 secs] 23962K->19990K(30272K), [Metaspace: 30628K->30628K(1077248K)], 0.0679058 secs] [Times: user=0.06 sys=0.01, real=0.07 secs]

该条日志记录了发生Full GC的相关信息

  • 40.485
    这是JVM进程从启动开始到这次GC经过的时间,前面是具体的发生GC的日期
  • GC (Allocation Failure)
    表示发生了一次GC,括号里面时发生GC的原因,DefNew表示新生代的GC,Tenured表示老年代的GC,Metaspace表示元空间的GC
  • 8640K->960K(8640K), 0.0127017 secs
    8640K表示GC前的内存占用,960K表示GC后的内存占用,8640K表示总共的内存占用,0.0127017表示GC占用的时间
  • user=0.06 sys=0.01, real=0.07 secs
    user表示GC线程的运行时间,sys表示STW的时间,real表示总的GC的总耗时

如果使用CMSG1,日志信息也差不太多,下面演示一下CMS的日志,启动参数需要加上-XX:+UseParNewGC -XX:+UseConcMarkSweepGC

阿里java性能优化小册 阿里jvm调优工具_java

虽然从GC的日志文件也可以比较清晰的看到每次GC发生的具体信息,但对于线上环境的GC日志,内容是非常多的,分析起来就变得非常麻烦了,可以借助一些GC日志的分析工具,比如gceasy,导入GC日志文件后,可以直观的看到GC的原因,经过分析更有利于调整JVM的参数。

下面两个参数可以打印JVM的所有参数信息:

  • -XX:+PrintFlagsInitial 表示打印出所有参数选项的默认值
  • -XX:+PrintFlagsFinal 表示打印出所有参数选项在运行程序时生效的值