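Symptoms:
1. Around 11:09 on 2016-4-14, a java process in the production environment terminated abnormally. At the time it terminated, the process's CPU and memory usage showed nothing unusual.
Diagnosis:
1. After the abnormal stop, the messages log showed that the process had core dumped and that a core dump file had been written. Because the operating system was configured not to generate core files, the dump file was then deleted.
Messages log: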
Apr 14 11:09:19 abrt[41030]: Saved core dump of pid 29781 (/usr/java/jdk1.6.0_38/bin/java) to /var/spool/abrt/ccpp-2016-04-14-11:09:07-29781 (3443929088 bytes)
Apr 14 11:09:19 abrtd: Directory 'ccpp-2016-04-14-11:09:07-29781' creation detected
Apr 14 11:09:20 abrtd: Deleting problem directory '/var/spool/abrt/ccpp-2016-04-14-11:09:07-29781'
System core dump settings:
ulimit -a
core file size          (blocks, -c) 0
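The same limit can also be read programmatically. The following is a minimal sketch in Python using the standard `resource` module; the helper names `core_dump_limit` and `describe` are made up for illustration. Note that it reports the limit of the Python process itself, in bytes rather than blocks, so it only reflects the java process if both were started from the same environment. For an already running process, /proc/<pid>/limits is the more direct source.

```python
import resource


def core_dump_limit():
    """Return the (soft, hard) core file size limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(limit):
    """Render a single limit value in human-readable form."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    if limit == 0:
        return "0 (core files disabled)"
    return f"{limit} bytes"


if __name__ == "__main__":
    soft, hard = core_dump_limit()
    print("soft core limit:", describe(soft))
    print("hard core limit:", describe(hard))
```

A soft limit of 0 corresponds to the `ulimit -a` output shown above.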
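2. In the web container's log directory we found hs_err_pid29781.log, the file the JVM writes when it crashes. The crash log shows that while creating a heap dump, the JVM was executing HeapObjectDumper::do_object in libjvm.so (the JVM's own native library) when it accessed an invalid memory address. The JVM crashed, which caused the java process to core dump. At that moment the operations team was on the zabbix monitoring server, clicking the action that generates a dump file, and ran straight into this pitfall.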
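Conclusion:
1. The problem is a JDK bug.
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7110428 (official bug report)
https://access.redhat.com/solutions/754783 (Red Hat's description of the problem)
2. The production environment runs JDK 1.6 (6.0_38-b05 according to the crash log).
Resolution:
1. Going forward, move to the latest JDK 1.6 release or upgrade to JDK 1.8.
Analysis of the JVM crash file:
Below is an excerpt from hs_err_pid29781.log: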
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f91a6d00daa, pid=29781, tid=140263367194368 // SIGSEGV is Unix signal 11; in most cases it is caused by an illegal memory access or an invalid pointer. Process ID: 29781
# JRE version: 6.0_38-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x496daa] HeapObjectDumper::do_object(oopDesc*)+0x4a // The code the JVM was executing when it crashed: HeapObjectDumper::do_object in libjvm.so. V marks a VM frame.
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#
--------------- T H R E A D ---------------
Current thread (0x00007f91a0074800): VMThread [stack: 0x00007f919c1d3000,0x00007f919c2d4000] [id=29791] // VMThread is an internal JVM thread
siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x0000000800000008
0x00007f91a6d00dba: 0f 85 b0 00 00 00 48 8b 35 19 93 73 00 80 3e 00
Register to memory mapping:
Stack: [0x00007f919c1d3000,0x00007f919c2d4000], sp=0x00007f919c2d2860, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x496daa] HeapObjectDumper::do_object(oopDesc*)+0x4a
V [libjvm.so+0x6e329f] MutableSpace::object_iterate(ObjectClosure*)+0x2f
V [libjvm.so+0x768071] PSYoungGen::object_iterate(ObjectClosure*)+0x21
V [libjvm.so+0x726c3e] ParallelScavengeHeap::object_iterate(ObjectClosure*)+0x1e
V [libjvm.so+0x7275fd] ParallelScavengeHeap::safe_object_iterate(ObjectClosure*)+0xd
V [libjvm.so+0x497b91] VM_HeapDumper::doit()+0x1d1
V [libjvm.so+0x87099a] VM_Operation::evaluate()+0x4a
V [libjvm.so+0x86ff62] VMThread::evaluate_operation(VM_Operation*)+0x82
V [libjvm.so+0x8701d8] VMThread::loop()+0x198
V [libjvm.so+0x86fcde] VMThread::run()+0x6e
V [libjvm.so+0x712e5f] java_start(Thread*)+0x13f
VM_Operation (0x00007f90956ac880): HeapDumper, mode: safepoint, requested by thread 0x00007f916401d800
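The key fields of the excerpt above can be pulled out mechanically, which helps when several hs_err files need to be compared. The following is a minimal sketch in Python. The function name `summarize` and the dictionary keys are made up for illustration, the regular expressions are assumptions fitted to the lines quoted in this post rather than a full hs_err parser, and `SAMPLE` repeats lines from the log above with spacing normalized and the annotations removed. It also recomputes the free stack space as sp minus the low end of the stack range, which agrees with the 1022k printed in the log.

```python
import re

# Lines copied from the hs_err excerpt above (spacing normalized, annotations removed).
SAMPLE = """\
# SIGSEGV (0xb) at pc=0x00007f91a6d00daa, pid=29781, tid=140263367194368
# Problematic frame:
# V  [libjvm.so+0x496daa]  HeapObjectDumper::do_object(oopDesc*)+0x4a
siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x0000000800000008
Stack: [0x00007f919c1d3000,0x00007f919c2d4000],  sp=0x00007f919c2d2860,  free space=1022k
"""

HEX = r"0x[0-9a-fA-F]+"


def summarize(text):
    """Extract signal, pid, problematic frame, fault address and stack usage."""
    info = {}

    # Header line: signal name, signal number, pid.
    m = re.search(rf"#\s+(SIG\w+)\s+\(({HEX})\)\s+at\s+pc={HEX},\s*pid=(\d+)", text)
    if m:
        info["signal"] = m.group(1)
        info["signal_number"] = int(m.group(2), 16)
        info["pid"] = int(m.group(3))

    # Problematic frame: frame type letter, library, and symbol.
    m = re.search(rf"#\s+([VvCJj])\s+\[(\S+?)\+{HEX}\]\s+(\S+)", text)
    if m:
        info["frame_type"] = m.group(1)
        info["library"] = m.group(2)
        info["symbol"] = m.group(3)

    # Address whose access raised the signal.
    m = re.search(rf"si_addr=({HEX})", text)
    if m:
        info["fault_address"] = int(m.group(1), 16)

    # Stack range and stack pointer; free space = sp - low end of the range.
    m = re.search(rf"Stack:\s*\[({HEX}),({HEX})\],\s*sp=({HEX})", text)
    if m:
        low, high, sp = (int(g, 16) for g in m.groups())
        info["stack_size_kb"] = (high - low) // 1024
        info["stack_free_kb"] = (sp - low) // 1024

    return info


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

With nearly all of the VMThread stack still free, stack exhaustion looks unlikely as the cause, which fits the diagnosis of an invalid address being dereferenced while walking the heap.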