最近生产环境出现了一个问题,就是Job服务日志好端端的不打印日志了,服务也没有挂, 现在将此次问题解决过程记录下来~
问题描述
生产环境有一台Job服务器,是专门用来跑所有定时任务的,然后有一天发现定时任务好像没有执行,所以上Job服务器查看日志,结果发现的情况是:
最后打印的是昨天晚上九点半的,到我看的时候就一直没有日志,没有日志就没有执行Job;当时为了快速解决问题就重启了服务器,Job就正常执行了;后来第二天上去看的时候居然又停止了, 但是Job的JVM是还在的,并没有挂掉,就是没有日志,看起来就是被阻塞了一样;
这个时候又把生成环境重启让他先正常执行,然后立马去测试环境看看能不能重现, 结果测试环境的日志也是一样
问题代码
//允许异步执行 Schedule
public class TestSchedule {
private static final Logger LOGGER = LoggerFactory.getLogger(LotterySchedule.class);
// 使用线程池myAsync来执行这个Job
("myAsync")
(cron = "0/1 * * * * ?")
public void testDoGet(){
LOGGER.info("\ntestDoGet:"+Thread.currentThread());
//业务代码:里面调用了 String json = HttpUtil.doGet(url);来调用第三方接口
HttpUtil.doGet("www.baidu.com")
}
//这里没有用异步执行,单线程执行
(cron = "0/1 * * * * ?")
public void testPrint(){
LOGGER.info("\ntestPrint:"+Thread.currentThread());
}
}
然后看看线程池的代码
public class ExecutorConfig {
/** Set the ThreadPoolExecutor's core pool size. */
private int corePoolSize = 10;
/** Set the ThreadPoolExecutor's maximum pool size. */
private int maxPoolSize = 25;
/** Set the capacity for the ThreadPoolExecutor's BlockingQueue. */
private int queueCapacity = 10;
public Executor myAsync() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(corePoolSize);
executor.setMaxPoolSize(maxPoolSize);
executor.setQueueCapacity(queueCapacity);
executor.setThreadNamePrefix("MyJobExecutor-");
// rejection-policy:当pool已经达到max size的时候,如何处理新任务
// CALLER_RUNS:不在新线程中执行任务,而是有调用者所在的线程来执行
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
executor.initialize();
return executor;
}
}
1.查看JVM情况
Linux使用jstat命令查看jvm的GC情况
jps 查询Jvm进程号
查询Jvm jstat -gc 21738 5000
发现Jvm好像没有出现频繁GC,GC处理异常的情况,而且Jvm启动也配置了:+HeapDumpOnOutOfMemoryError;但是没有看到内存溢出的Dump文件;排除 Jvm异常的情况
2.查看线程栈分析
jps 查询Jvm进程号
jstack -l 22741 查询线程栈信息
"MyJobExecutor-2" #25 prio=5 os_prio=31 tid=0x00007fc7f7374800 nid=0xa203 waiting on condition [0x0000700004def000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000742196c58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:380)
at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:69)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:246)
- locked <0x00000007969b6910> (a org.apache.http.pool.AbstractConnPool$2)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:193)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:303)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:97)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:70)
省略
at
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.transaction.interceptor.TransactionInterceptor$$Lambda$392/226041624.proceedWithInvocation(Unknown Source)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
at com.bfc.service.impl.LotteryHbkuai3MainServiceImpl$$EnhancerBySpringCGLIB$$4d3de61.publishPreByJob(<generated>)
at com.bfc.job.LotterySchedule.publishByJob(LotterySchedule.java:45)
at com.bfc.job.LotterySchedule$$FastClassBySpringCGLIB$$75ac0373.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor$$Lambda$391/524684837.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x0000000741db91a0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"MyJobExecutor-1" #24 prio=5 os_prio=31 tid=0x00007fc7f5ff7800 nid=0xa403 waiting on condition [0x0000700004cec000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000742196c58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:380)
at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:69)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:246)
- locked <0x00000007968318d8> (a org.apache.http.pool.AbstractConnPool$2)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:193)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:303)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:97)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:70)
at
省略
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.transaction.interceptor.TransactionInterceptor$$Lambda$392/226041624.proceedWithInvocation(Unknown Source)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
at com.bfc.service.impl.LotteryHbkuai3MainServiceImpl$$EnhancerBySpringCGLIB$$4d3de61.publishPreByJob(<generated>)
at com.bfc.job.LotterySchedule.publishByJob(LotterySchedule.java:45)
at com.bfc.job.LotterySchedule$$FastClassBySpringCGLIB$$75ac0373.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor$$Lambda$391/524684837.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x0000000741da4a28> (a java.util.concurrent.ThreadPoolExecutor$Worker)
分析问题
通过栈信息可以发现,基本上所有的线程都被阻塞了,都在wait;
重点看 MyJobExecutor-"
开头的线程都在wait同一个lock,并且代码发生的地方是 HttpUtil.doGet
方法;
这么看来好像是HttpUtil.doGet 发生了阻塞;然后分析问题代码,结果发现这个代码是这样的
private static PoolingHttpClientConnectionManager connMgr;
private static RequestConfig requestConfig;
private static final int MAX_TIMEOUT = 7000;
private static CloseableHttpClient httpClient ;
static {
// 设置连接池
connMgr = new PoolingHttpClientConnectionManager();
// 设置连接池大小
connMgr.setMaxTotal(8);
connMgr.setDefaultMaxPerRoute(connMgr.getMaxTotal());
RequestConfig.Builder configBuilder = RequestConfig.custom();
// 设置连接超时
configBuilder.setConnectTimeout(MAX_TIMEOUT);
// 设置读取超时
configBuilder.setSocketTimeout(MAX_TIMEOUT);
// 设置从连接池获取连接实例的超时
configBuilder.setConnectionRequestTimeout(MAX_TIMEOUT);
// 在提交请求之前 测试连接是否可用
configBuilder.setStaleConnectionCheckEnabled(true);
requestConfig = configBuilder.build();
httpClient = HttpClients.custom().setConnectionManager(connMgr).build();
}
/**
* 发送 GET 请求(HTTP),不带输入数据
* @param url
* @return
*/
public static String doGet(String url) {
return doGet(url, new HashMap<String, Object>());
}
/**
* 发送 GET 请求(HTTP),K-V形式
* @param url
* @param params
* @return
*/
public static String doGet(String url, Map<String, Object> params) {
String apiUrl = url;
StringBuffer param = new StringBuffer();
int i = 0;
for (String key : params.keySet()) {
if (i == 0) {
param.append("?");
}else {
param.append("&");
}
param.append(key).append("=").append(params.get(key));
i++;
}
apiUrl += param;
String result = null;
// HttpClient httpclient = new DefaultHttpClient();
try {
HttpGet httpPost = new HttpGet(apiUrl);
HttpResponse response = httpClient.execute(httpPost);
HttpEntity entity = response.getEntity();
if (entity != null) {
InputStream instream = entity.getContent();
result = IOUtils.toString(instream, "UTF-8");
}
} catch (IOException e) {
e.printStackTrace();
}
return result;
}
}
注意看这个doGet(); 流没有关闭…
因为流没有关闭,这个HttpClient连接池的连接一直没有回收回去,后面的线程又一直在调用这个doGet方法;
但是又获取不到连接,所以就一直阻塞在哪里,直到连接超时HttpClient内部三个超时时间的区别
然后myAsync 这个线程池的线程也是有限的, Schedule每秒都在执行,很快线程不够用了,然后就阻塞了testDoGet这个定时任务了;
为了确认是 流未关闭的问题 我们可以看看服务器的TCP连接
netstat -anp | grep 进程号
可以看到有很多的80连接端口处于CLOSE_WAIT状态的; CLOSE_WAIT状态的原因与解决方法
问题的原因找到了,那么解决的方法就很简单了,把HttpClient的连接的流关闭掉就行了
HttpEntity entity = response.getEntity();
httpStr = EntityUtils.toString(entity, "UTF-8");
EntityUtils.toString方法里面有关闭流的;这样改了就没有问题了;
好像问题是解决了 但是怎么觉得哪里不对呢??
定时任务 public void testDoGet()
是用的线程池异步执行
定时任务public void testPrint()
是单线程执行的;
讲道理就算testDoGet
阻塞了,也不应该把testPrint
阻塞了吧?
再看jstack -l 信息
"pool-3-thread-1" #23 prio=5 os_prio=31 tid=0x00007ff371c86800 nid=0x5803 waiting on condition [0x0000700004776000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007421cb160> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:380)
at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:69)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:246)
- locked <0x0000000797dbc900> (a org.apache.http.pool.AbstractConnPool$2)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:193)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:303)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:97)
at com.xxx.util.HttpUtil.doGet(HttpUtil.java:70)
at com.bfc.job.LotterySchedule$$FastClassBySpringCGLIB$$75ac0373.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor$$Lambda$401/476029857.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2022)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
at org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.submit(ThreadPoolTaskExecutor.java:341)
at org.springframework.aop.interceptor.AsyncExecutionAspectSupport.doSubmit(AsyncExecutionAspectSupport.java:284)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor.invoke(AsyncExecutionInterceptor.java:129)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.CglibAopProxysun.reflect.GeneratedMethodAccessor96.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:93)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
初步分析这个信息,说明 pool-3-thread-1(就是Schedule的主线程) 也去执行了我们的testDoGet
;
myAsync线程池不够用了,用了主线程来执行这个定时任务,然后又阻塞了,所以就阻塞了所有的其他定时任务
什么鬼?为什么会这样,然后翻过去看 myAsync的线程池代码
public Executor myAsync() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(corePoolSize);
executor.setMaxPoolSize(maxPoolSize);
executor.setQueueCapacity(queueCapacity);
executor.setThreadNamePrefix("MyJobExecutor-");
// rejection-policy:当pool已经达到max size的时候,如何处理新任务
// CALLER_RUNS:不在新线程中执行任务,而是有调用者所在的线程来执行
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
executor.initialize();
return executor;
}
用的丢弃策略是 CallerRunsPolicy
:它直接在 execute 方法的调用线程中运行被拒绝的任务;如果执行程序已关闭,则会丢弃该任务
就是说线程池的队列也满了,就会触发丢弃策略,CallerRunsPolicy 是用调用线程池的那个线程来执行;
ThreadPoolExecutor里面4种拒绝策略(
为了不让他们互相影响,改下线程池的丢弃策略就行了
Spring Boot 定时任务单线程和多线程