这是一个开发中遇到的一个问题,场景是单次http请求需要调用很多次上游接口,所以采用线程池并发调用。开发中没有注意,外层任务和内层任务都是用了线程池,并且使用的是同一个线程池,造成了资源竞争,外层需要内层执行结束的结果,内层需要外层释放的线程,两者相持不下,造成程序卡死。

用代码模拟个场景,同一个线程池,submit一个任务之后,又在任务中submit一个新的线程任务。这里定一个只有一个核心线程数的线程池,用于展示现象,并不特指只有单线程才会出现bug现象。

package com.example.springdemo;

import org.springframework.scheduling.concurrent.CustomizableThreadFactory;

import java.util.concurrent.*;

public class TestA {
    private static ThreadPoolExecutor es = new ThreadPoolExecutor(1, 1,
            0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<Runnable>(), new CustomizableThreadFactory("test-thread-"), new ThreadPoolExecutor.CallerRunsPolicy());

    /*private static ThreadPoolExecutor esIn = new ThreadPoolExecutor(2, 2,
            0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<Runnable>(), new CustomizableThreadFactory("test-thread-in-"), new ThreadPoolExecutor.CallerRunsPolicy());*/

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        //商品信息
        System.out.println("商品信息查询开始");
        Future<?> fProduct = es.submit(() -> {
            //这里需要组装两部分信息 1是基本信息 2是促销信息
            //为了加快速度,促销信息用一个新线程查,基本信息用当前线程查
            Future<?> fPromo = es.submit(() -> {
                System.out.println("这句代表查询了商品促销信息接口");
            });
            System.out.println("这句代表查询了商品基本信息接口");
            try {
                fPromo.get();
                System.out.println("基本信息和促销信息都拿到了");
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        });
        fProduct.get();
        System.out.println("商品信息查询结束");
    }

}

调试运行,结果是只打印了两句。基本可以证明查询促销任务没有执行(因为没有打印执行的文案)。

java 线程池死亡 java线程池卡死_java

 更加严谨,我们看下jstack里到底发生了什么

"test-thread-1" #14 prio=5 os_prio=0 tid=0x0000000020e28800 nid=0x3ea8 waiting on condition [0x000000002186e000]
    java.lang.Thread.State: WAITING (parking)
         at sun.misc.Unsafe.park(Native Method)
         - parking to wait for  <0x000000076c2e9430> (a java.util.concurrent.FutureTask)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
         at java.util.concurrent.FutureTask.get(FutureTask.java:191)
         at com.example.springdemo.TestA.lambda$main$1(TestA.java:27)
         at com.example.springdemo.TestA$$Lambda$1/407858146.run(Unknown Source)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
         at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
         at java.util.concurrent.FutureTask.run(FutureTask.java)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:748)"Service Thread" #13 daemon prio=9 os_prio=0 tid=0x000000001ffc6000 nid=0x2cd4 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"C1 CompilerThread3" #12 daemon prio=9 os_prio=2 tid=0x000000001ff38000 nid=0x1518 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"C2 CompilerThread2" #11 daemon prio=9 os_prio=2 tid=0x000000001ff35800 nid=0x4614 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"C2 CompilerThread1" #10 daemon prio=9 os_prio=2 tid=0x000000001ff1c000 nid=0x2374 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"C2 CompilerThread0" #9 daemon prio=9 os_prio=2 tid=0x000000001ff19800 nid=0x33e8 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"JDWP Command Reader" #8 daemon prio=10 os_prio=0 tid=0x000000001eff1000 nid=0x2338 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"JDWP Event Helper Thread" #7 daemon prio=10 os_prio=0 tid=0x000000001eff0000 nid=0x4738 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"JDWP Transport Listener: dt_socket" #6 daemon prio=10 os_prio=0 tid=0x000000001efe4800 nid=0x1848 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000000001ef80800 nid=0x254 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000000001efdb000 nid=0x301c runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000000001d077000 nid=0x13b0 in Object.wait() [0x000000001f43e000]
    java.lang.Thread.State: WAITING (on object monitor)
         at java.lang.Object.wait(Native Method)
         - waiting on <0x000000076b388ec8> (a java.lang.ref.ReferenceQueue$Lock)
         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
         - locked <0x000000076b388ec8> (a java.lang.ref.ReferenceQueue$Lock)
         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
         at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)"Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x000000001d070000 nid=0x2fc4 in Object.wait() [0x000000001ef3f000]
    java.lang.Thread.State: WAITING (on object monitor)
         at java.lang.Object.wait(Native Method)
         - waiting on <0x000000076b386b68> (a java.lang.ref.Reference$Lock)
         at java.lang.Object.wait(Object.java:502)
         at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
         - locked <0x000000076b386b68> (a java.lang.ref.Reference$Lock)
         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)"main" #1 prio=5 os_prio=0 tid=0x000000000344b000 nid=0xb84 waiting on condition [0x00000000033df000]
    java.lang.Thread.State: WAITING (parking)
         at sun.misc.Unsafe.park(Native Method)
         - parking to wait for  <0x000000076c2bdd50> (a java.util.concurrent.FutureTask)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
         at java.util.concurrent.FutureTask.get(FutureTask.java:191)
         at com.example.springdemo.TestA.main(TestA.java:33)"VM Thread" os_prio=2 tid=0x000000001d067000 nid=0x3fac runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x0000000003460800 nid=0x2528 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x0000000003462000 nid=0x3f30 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x0000000003464000 nid=0x2a0 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x0000000003465800 nid=0x4ae8 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x0000000003468800 nid=0x49ac runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x0000000003469800 nid=0x6a0 runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x000000000346d000 nid=0x4098 runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x000000000346e000 nid=0x2944 runnable
"VM Periodic Task Thread" os_prio=2 tid=0x00000000200e2000 nid=0x4304 waiting on condition
JNI global references: 3061

java 线程池死亡 java线程池卡死_开发语言_02

 

java 线程池死亡 java线程池卡死_java 线程池死亡_03

java 线程池死亡 java线程池卡死_开发语言_04

 可以看到,jstack发现两处java.lang.Thread.State: WAITING (parking),分别是线程main和test-thread-1 ,行数是33和27行。

分析一下,33行park是因为27行park,这里没法拿到返回结果,所以阻塞住了,那么27行为什么停止呢?

27行应该是22行这个任务提交了,但是线程没有执行(没有执行的println打出),所以get肯定拿不到结果,造成阻塞。

为什么提交却不执行呢?这里需要知道线程池的执行机制,提交任务时先增加worker,当到worker到核心线程数量时,就会加入队列了。。。。。(剩下的过程就不说了,解答这个问题已经够了)。

因为我们线程池es核心线程数是1,所以19行就占用了一个线程,那么22行提交的时候,因为已经到了核心线程数,那么这个任务就会加入队列,等待线程池中的线程空闲了消费队列中的任务。27行去尝试拿结果,那么这个线程就会陷入等待任务完成的状态。因为这个线程陷入等待的状态,所以外层33就拿不到执行结果,也是一直wait,一直占用着这一个线程,不会释放,27行等待的任务也不会拿到线程去执行然后返回结果,这个就造成了jstack两个等待的情况。

总体而言,造成阻塞的原因就是内层需要外层释放线程去执行,外层需要拿到内层执行的结果。外层不拿到结果就不释放线程,内层不拿到线程就得不到结果。造成了另一种形式的资源竞争死锁。但是通过jconsole等工具却检查不到,排查还是有一定难度的。

java 线程池死亡 java线程池卡死_java_05