Symptom: the server reboots for no apparent reason

Check the logs:

Aug  2 20:26:25 localhost kernel: EDAC MC1: 26 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3bc505 offset:0x9c0 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 channel_mask:4 rank:0)
Aug 2 20:26:25 localhost kernel: EDAC MC1: 29 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3ba088 offset:0x5c0 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 channel_mask:4 rank:0)
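
The fields in these EDAC messages can be extracted programmatically. The following is a minimal Python sketch, assuming the message layout matches the two lines above; the pattern and the function name parse_edac_ce are illustrative, and other kernels or EDAC drivers may format the message differently.

```python
import re

# Pattern modeled on the two log lines above (an assumption, not a kernel-defined format).
EDAC_CE_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) .*?socket:(?P<socket>\d+)"
)

def parse_edac_ce(line):
    """Return a dict of fields from one EDAC corrected-error line, or None."""
    m = EDAC_CE_RE.search(line)
    if m is None:
        return None
    # Every captured field is numeric except the DIMM label.
    return {k: (v if k == "label" else int(v)) for k, v in m.groupdict().items()}

sample = (
    "Aug  2 20:26:25 localhost kernel: EDAC MC1: 26 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3bc505 offset:0x9c0 "
    "grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 "
    "channel_mask:4 rank:0)"
)
print(parse_edac_ce(sample))
```

Both log lines point at the same module: memory controller 1, socket 1, channel 2, slot 0. The two counts, 26 and 29, add up to 55.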

Command to install edac-utils:

yum install -y libsysfs edac-utils

Check result: 55 errors are reported, all of them corrected errors on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0

[root@localhost ~]# edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#2_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
mc1: 0 Uncorrected Errors with no DIMM info
mc1: 0 Corrected Errors with no DIMM info
mc1: csrow0: 0 Uncorrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
[root@localhost ~]#
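
On a machine with many DIMMs, the one line that matters is easy to miss. The following is a minimal Python sketch that filters a saved copy of the edac-util -v report down to the DIMMs with a non-zero corrected-error count. It assumes the per-DIMM line layout shown above; the function name dimms_with_errors is illustrative.

```python
import re

# Matches per-DIMM lines such as
# "mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors"
DIMM_CE_RE = re.compile(
    r"^(?P<mc>mc\d+): (?P<csrow>csrow\d+): (?P<label>\S+): "
    r"(?P<count>\d+) Corrected Errors$"
)

def dimms_with_errors(report):
    """Return (mc, label, count) for every DIMM with a non-zero corrected-error count."""
    hits = []
    for line in report.splitlines():
        m = DIMM_CE_RE.match(line.strip())
        if m and int(m.group("count")) > 0:
            hits.append((m.group("mc"), m.group("label"), int(m.group("count"))))
    return hits

# A few lines copied from the report above.
report = """\
mc1: 0 Corrected Errors with no DIMM info
mc1: csrow0: 0 Uncorrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors
"""
print(dimms_with_errors(report))
```

For the report above this leaves a single entry, the DIMM on channel 2 of the second CPU, which is the same module named in the kernel log.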

The server front panel reports an error:

[Image: HP ProLiant DL380p Gen8 server memory fault handling]

Remove the memory module from slot 1 on the server and power it on again; the problem is resolved.

[Image 2: HP ProLiant DL380p Gen8 server memory fault handling]

Testing again with the software shows that the server is working normally.

[Image 3: HP ProLiant DL380p Gen8 server memory fault handling]
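
The same check can be repeated after the repair without edac-util by reading the counters the EDAC driver exposes in sysfs. The following is a minimal Python sketch, assuming the standard layout /sys/devices/system/edac/mc/mcN/ce_count and ue_count; the function name read_edac_counts is illustrative. The counters start from zero when the driver loads, so a zero reading right after a reboot is only meaningful once the server has run under load for a while.

```python
from pathlib import Path

def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (ce_count, ue_count)} for each mc* directory under root."""
    counts = {}
    for mc_dir in sorted(Path(root).glob("mc[0-9]*")):
        ce = int((mc_dir / "ce_count").read_text().strip())  # corrected errors
        ue = int((mc_dir / "ue_count").read_text().strip())  # uncorrected errors
        counts[mc_dir.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    # Prints nothing if the EDAC driver is not loaded on this machine.
    for name, (ce, ue) in read_edac_counts().items():
        status = "OK" if ce == 0 and ue == 0 else "CHECK DIMMs"
        print(f"{name}: corrected={ce} uncorrected={ue} -> {status}")
```

Run periodically, for example from cron, a script like this can flag a growing corrected-error count before it leads to another unexplained reboot.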