出现javacore怎么分析 javacore怎么分析DeadlockDetector_堆栈

现在 IT 界普遍高并发, 分布式环境, 难免遇到死锁, 死循环等问题, 平时开发我们一般都可以停掉服务, 然后打 trace ---> 编译 ---> 修改源代码 ---> 重新编译 ---> ..... ---> 解决问题, 或者通过集成开发环境(如: IDEA, Eclipse)提供的 Debug 功能打断点 ---> watch variable ---> 进入/跳出方法 ---> .....---> 解决问题 , 但是, 当遇到生产环境却往往不知所措, 因为生产环境是不能随便让你停止服务的, 并且生产环境是不会开放 debug 端口的(debug 需要开放调试<>), 那么, 线上死锁如何定位? 怎么---> 解决问题 ?

实际上, Java 已经为我们提供了一些好用的定位分析工具, 当然, 这是一句废话, 你肯定知道, 你还知道哪些命令在 JDK 的安装目录下的 bin 下和 java 以及 javac 在一块躺着. 是的, 的确如此, 你可真是一个小机灵鬼哦(奈何从来没用过是不是? 今天帅帅给你一个应用场景, 哈哈....)

出现javacore怎么分析 javacore怎么分析DeadlockDetector_出现javacore怎么分析_02

01

jps

1.1 jps 虚拟机进程状况工具

要定位分析 java 死锁, 死循环你得先知道发生问题的 LVMID(Logical Volume Manager ID), 就好比你要看 Windows 上的一个应用的详细信息, 你得先找到这个应用运行在哪个进程上一样. 或者你可以将 LVMID 理解为 PID, 通常 LVMID 和 进程 PID 是一致的, 但是并不非得一样.

因此, 要想知道 LVMID, 可以通过查询 PID, 但是其一这是不保险的, 其二, 当同时启动了多个虚拟机进程时, 无法通过名称区分, 就必须得使用 jps 了(任务管理器里面有 n 个 java.exe).

JackLi:~ dreamli$ jps
1461 
1847 GradleDaemon
1210 
1851 RunnerApplication
1852 Jps
JackLi:~ dreamli$

上面是一个简单的使用, 第一列的数字就是 LVMID, 后面跟的是启动的主类. 帅帅使用的 gradle 构建工具, 所以有一个 GradleDaemon, 然后jfoa 项目的启动入口在 RunnerApplication

1.2 jps 参数

jps 有以下几个参数

  • -q 只输出 LVMID
JackLi:~ dreamli$ jps -q
1875
1461
1847
1210
1851
  • -m 输出传给 main 函数的参数
JackLi:~ dreamli$ jps -m
1876 Jps -m
1461 
1847 GradleDaemon 6.2.2
1210 
1851 RunnerApplication
  • -l 输出主类的全名, 如果进程执行的是 jar, 则会输出 jar 路径
JackLi:~ dreamli$ jps -l
1461 
1877 sun.tools.jps.Jps
1847 org.gradle.launcher.daemon.bootstrap.GradleDaemon
1210 
1851 club.javafamily.runner.RunnerApplication
  • -v 输出虚拟机进程启动时 JVM 参数
JackLi:~ dreamli$ jps -v
1461  -Dosgi.requiredJavaVersion=1.8 -Dosgi.instance.area.default=@user.home/eclipse-workspace -XX:+UseG1GC -XX:+UseStringDeduplication -XstartOnFirstThread -Dorg.eclipse.swt.internal.carbon.smallFonts -Dosgi.requiredJavaVersion=1.8 -Xms256m -Xmx1024m -Xdock:icon=../Resources/Eclipse.icns -XstartOnFirstThread -Dorg.eclipse.swt.internal.carbon.smallFonts -javaagent:/Applications/Eclipse.app/Contents/Eclipse/lombok.jar
1878 Jps -Dapplication.home=/Users/dreamli/Software/java/zulu-jfx-252/zulu-8.jdk/Contents/Home -Xms8m
1847 GradleDaemon -Xmx2g -Dfile.encoding=UTF-8 -Duser.country=CN -Duser.language=zh -Duser.variant
1210  -Xms128m -Xmx2048m -XX:ReservedCodeCacheSize=240m -XX:+UseCompressedOops -Dfile.encoding=UTF-8 -XX:+UseConcMarkSweepGC -XX:SoftRefLRUPolicyMSPerMB=50 -ea -Dsun.io.useCanonCaches=false -Djava.net.preferIPv4Stack=true -Djdk.http.auth.tunneling.disabledSchemes="" -XX:+HeapDumpOnOutOfMemoryError -XX:-OmitStackTraceInFastThrow -Xverify:none -XX:ErrorFile=/Users/dreamli/java_error_in_idea_%p.log -XX:HeapDumpPath=/Users/dreamli/java_error_in_idea.hprof -javaagent:/Applications/IntelliJ IDEA.app/Contents/bin/jetbrains-agent.jar -Djb.vmOptionsFile=/Users/dreamli/Library/Preferences/IntelliJIdea2019.3/idea.vmoptions -Didea.home.path=/Applications/IntelliJ IDEA.app/Contents -Didea.executable=idea -Didea.paths.selector=IntelliJIdea2019.3
1851 RunnerApplication -Djava.rmi.server.hostname=127.0.0.1 -Dspring.profiles.active=dev -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/Users/dreamli/Workspace/MyRepository/javafamily/jfoa/runner/build/server-temp -Duser.country=CN -Duser.language=zh -Duser.variant

02

jstack

2.1 jstack: java 堆栈跟踪工具

jstack 命令用于生成当前时刻线程快照(一般称为 threaddump 文件或者 javacore 文件), 线程快照就是当前虚拟机内每一条线程正在执行的方法堆栈集合, 生成线程快照的目的是定位线程出现长时间停顿的原因, 如: 线程死锁, 死循环, 请求外部资源的长时间等待等都是导致线程长时间停顿的常见原因. 基本命令格式如下: jstack [options] lvmid

JackLi:~ dreamli$ jps -m
1461 
1847 GradleDaemon 6.2.2
1210 
1851 RunnerApplication
1980 Jps -m
JackLi:~ dreamli$ jstack 1851
2020-11-26 23:59:27
Full thread dump OpenJDK 64-Bit Server VM (25.252-b14 mixed mode):

"Attach Listener" #62 daemon prio=9 os_prio=31 tid=0x00007faece823800 nid=0xac03 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"MessageBroker-2" #61 prio=5 os_prio=31 tid=0x00007faed3018800 nid=0x15403 waiting on condition [0x000070000ab6c000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000773124fa8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

2.2 jstack 参数

jstack 有以下几个参数:

  • -F: 当正常输出的请求不被响应时, 强制输出线程堆栈
  • -l: 除堆栈外显示关于锁的信息
  • -m: 调用本地方法的话, 可以显示 C/C++ 的堆栈

03

死锁定位分析

昨晚代码撸到半夜, 提交了很多代码, 没想到......死锁------留下了心疼自己的泪水...

3.1 jps 定位 lvmid

第一步不用说了, 先找到 lvmid

3.2 jstack 获取线程堆栈

3.1 提点 Linux 小知识

  • 将线程堆栈写入本地文件
jstack -m 1851 >> ....您的路径../jfoa/jstack-2020-11-26.log
  • 获取远程 Linux 上的堆栈文件
scp root@javafamily.club:/root/jfoa-logs/jstack-2020-11-26.log ./jstack.txt

3.2 线程堆栈文件分析

这里贴出来一些核心的地方

"http-nio-80-exec-6" #58 daemon prio=5 os_prio=0 tid=0x00007efc15ccb800 nid=0x40 waiting on condition [0x00007efbdd490000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000e051e308> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:107)
    at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:33)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None



"jfoa-1" #78 prio=5 os_prio=0 tid=0x00000000009ef000 nid=0x52 waiting on condition [0x00007efbda87f000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000e0363ae0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at club.javafamily.runner.web.portal.service.SubjectVoteService.getVoteCurrentCount(SubjectVoteService.java:159)
    at club.javafamily.runner.web.portal.service.SubjectVoteService.changeVote(SubjectVoteService.java:93)
    at club.javafamily.runner.web.portal.service.SubjectVoteService$$FastClassBySpringCGLIB$$c43301b2.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:749)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)
    at org.springframework.aop.interceptor.AsyncExecutionInterceptor$$Lambda$848/1976926828.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
   Locked ownable synchronizers:
    - <0x00000000dd01f1a0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"https-jsse-nio-443-exec-5" #45 daemon prio=5 os_prio=0 tid=0x00007efc15a17800 nid=0x33 waiting on condition [0x00007efbde19a000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000e0363ae0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at club.javafamily.runner.web.portal.service.SubjectVoteService.getVoteCurrentCount(SubjectVoteService.java:159)
    at club.javafamily.runner.web.portal.service.SubjectVoteService$$FastClassBySpringCGLIB$$c43301b2.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    ...
       Locked ownable synchronizers:
    - <0x00000000e0520d38> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"https-jsse-nio-443-exec-10" #50 daemon prio=5 os_prio=0 tid=0x00007efc15c2d800 nid=0x38 waiting on condition [0x00007efbddc94000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000e0363ae0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
    at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
    at club.javafamily.runner.web.portal.service.SubjectVoteService.getSubjectVoteDto(SubjectVoteService.java:61)
    at club.javafamily.runner.web.portal.service.SubjectVoteService$$FastClassBySpringCGLIB$$c43301b2.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
   Locked ownable synchronizers:
    - <0x00000000e051fc38> (a java.util.concurrent.ThreadPoolExecutor$Worker)

这里主要分析 Locked ownable synchronizers 以及 "jfoa-1" #78 prio=5 os_prio=0 tid=0x00000000009ef000 nid=0x52 waiting on condition [0x00007efbda87f000]

  • Locked ownable synchronizers 显示了锁信息
  • "jfoa-1" #78 prio=5 os_prio=0 tid=0x00000000009ef000 nid=0x52 waiting on condition [0x00007efbda87f000] 如下
"线程名" #78 prio=5 os_prio=0 tid=0x00000000009ef000 nid=0x52 waiting on condition [等待锁状态, 0(0x0000000000000000) 代表没锁]
   java.lang.Thread.State: 线程状态

然后结合方法调用堆栈分析死锁, 死循环的原因.

04

问题重现

帅帅的问题最终定位到原因是 ReentrantReadWriteLock 在读锁中获取写锁.(在写锁中获取读锁不会死锁)

  • Service
package org.jack.thread01;

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
* @Description: 在读锁中获取写锁
* @Warning: 死锁
* @Author: Jack Li
* @Package: thread-01 - ThreadService
* @Date: Nov 26, 2020 10:57:49 PM
* @Version: 1.0.0
* @TimeComplexity: Required[O(n)] ---- Current[O(n^2)]
* @ExecuteResult: Success!
* @Status: Accepted
 */
public class ThreadService {

    public void method1() {

        System.out.println("Method 1 准备获取读锁");
        lock.readLock().lock();

        System.out.println("Method 1 得到读锁");

        try {
            Thread.sleep(2000);

            System.out.println("Method 1 准备获取写锁");

            method3();

            System.out.println("Method 1 获得写锁并释放");


        } 
        catch(Exception e) {
            e.printStackTrace();
        }
        finally {
            lock.readLock().unlock();

            System.out.println("Method 1 释放读锁");

        }
    }

    public void method2() {
        System.out.println("Method 2 准备获取写锁");
        lock.writeLock().lock();

        System.out.println("Method 2 获得写锁");

        try {
            Thread.sleep(1000);
        } 
        catch(Exception e) {
            e.printStackTrace();
        }
        finally {
            lock.writeLock().unlock();

            System.out.println("Method 2 释放写锁");
        }
    }

    public void method3() {
        lock.writeLock().lock();

        try {
            Thread.sleep(500);
        } 
        catch(Exception e) {
            e.printStackTrace();
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
}
  • 程序入口
public class DealLockTest {
    public static void main(String[] args) {

        new Thread(() -> {
            threadService.method1();
        }) .start();

        new Thread(() -> {
            threadService.method2();
        }) .start();

    }

    private static ThreadService threadService = new ThreadService();
//    private static ThreadService2 threadService = new ThreadService2();
}
  • 线程堆栈
"Thread-1" #11 prio=5 os_prio=31 tid=0x00007fbda01c2000 nid=0xa903 waiting on condition [0x0000700003e8c000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076abbc1b8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at org.jack.thread01.ThreadService.method2(ThreadService.java:50)
    at org.jack.thread01.DealLockTest.lambda$1(DealLockTest.java:11)
    at org.jack.thread01.DealLockTest$$Lambda$2/295530567.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-0" #10 prio=5 os_prio=31 tid=0x00007fbda01be800 nid=0x5503 waiting on condition [0x0000700003d89000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076abbc1b8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at org.jack.thread01.ThreadService.method3(ThreadService.java:68)
    at org.jack.thread01.ThreadService.method1(ThreadService.java:31)
    at org.jack.thread01.DealLockTest.lambda$0(DealLockTest.java:7)
    at org.jack.thread01.DealLockTest$$Lambda$1/250421012.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

Thread-0 先获取了读锁, 然后在读锁还没释放时又去获取写锁, 获取写锁时又要等待所有的读锁释放.从而导致了死锁.