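One afternoon the application administrator phoned to say that the IC card AP02 host could no longer be logged in to, although it still answered ping and the business application was running normally. I hurried to the ECC machine room and confirmed it. After checking and testing, the problem turned out not to be limited to the ssh command: most 32-bit programs, such as resize, reported errors when executed.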

Troubleshooting:

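Use ps -efM to list the 64-bit processes and ps -ef to list all processes, then compare the two listings to work out which programs are 32-bit. (Luckily the business application is 64-bit Java, so it was not affected O(∩_∩)O!)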
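The comparison described above can be sketched as a small shell function. This is only an illustration, not the exact commands used during the incident: the function name list_32bit_procs and the /tmp file names are made up for this example, and it assumes both listings carry a header line with the PID in the second column, as in the usual ps -ef layout.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of the full listing whose PID does
# not appear in the 64-bit listing. Both arguments are files holding
# captured ps output (header line first, PID in column 2).
list_32bit_procs() {
    all_file=$1      # captured output of: ps -ef
    bit64_file=$2    # captured output of: ps -efM
    awk '
        # First file: remember every 64-bit PID, skipping the header.
        FILENAME == ARGV[1] { if (FNR > 1) is64[$2] = 1; next }
        # Second file: print non-header lines whose PID was not remembered.
        FNR > 1 && !($2 in is64)
    ' "$bit64_file" "$all_file"
}

# Possible use on the AIX host (assumed file names):
#   ps -ef  > /tmp/ps_all.txt
#   ps -efM > /tmp/ps_64.txt
#   list_32bit_procs /tmp/ps_all.txt /tmp/ps_64.txt
```

Because the two listings are taken at slightly different moments, a process that starts or exits in between may be misclassified, so the result is best treated as a rough guide.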

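I then suspected an AIX system bug and reported it to the on-site IBM engineer, who confirmed that it was a bug. (If anyone still manages a system at this level, be sure to upgrade the host. Important things are marked in red, so I won't say it three times.)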

The current level is 6100-05-02; the problem is fixed in 6100-05-03.
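To check whether a host is still at an affected level, something like the following could be used. It is a minimal sketch based only on the levels stated here (affected below 6100-05-03 within technology level 05). The function name needs_iz83852 is invented for this example, and other technology levels are deliberately not judged because they are covered by different APARs.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit status 0) when the level string is
# AIX 6100-05 with a service pack below 03.
needs_iz83852() {
    level=$1                 # e.g. "6100-05-02" or "6100-05-02-1034"
    old_ifs=$IFS
    IFS=-
    set -- $level            # split on "-": $1=release, $2=TL, $3=SP
    IFS=$old_ifs
    sp=${3#0}                # drop one leading zero: "02" -> "2"
    [ "$1" = "6100" ] && [ "$2" = "05" ] && [ "${sp:-0}" -lt 3 ]
}

# Possible use on the AIX host:
#   if needs_iz83852 "$(oslevel -s)"; then
#       echo "level is affected, plan the update"
#   fi
# The APAR itself can also be queried with: instfix -ik IZ83852
```

Note that on a host already hitting the bug, 32-bit commands may themselves fail to load, so these checks are better run before the symptoms appear.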

Bug details:

IZ83852: SYMBOL RESOLUTION ERROR AND NOT ENOUGH MEMORY FOR THE PROCESS APPLIES TO AIX 6100-05

A fix is available

Obtain the fix for this APAR.

APAR status

Closed as program error.

Error description

32 bit User processes start failing with the message below while trying to exec and load.

exec(): 0509-036 Cannot load program /bin/ps because of the following errors:

        0509-130 Symbol resolution failed for /usr/lib/libwlm.a(shr.o) because:

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-161 There is not enough memory for the  process.

        0509-026 System error: Error 0

exec(): 0509-036 Cannot load program lslpp because of the following errors:

        0509-130 Symbol resolution failed for /usr/lib/libinstall.a(shr.o) because:

        0509-160 There is not enough kernel memory. Try again later.

        0509-026 System error: Error 0

Local fix

Use named shlib area feature.

Problem summary

Possible memory leak more likely with WPARs.

Problem conclusion

Fix logic error that prevented memory from being freed.

Temporary fix

Comments

6100-02 - use AIX APAR IZ79109

6100-03 - use AIX APAR IZ83884

6100-04 - use AIX APAR IZ83931

6100-05 - use AIX APAR IZ83852

6100-06 - use AIX APAR IZ80674

APAR Information

APAR number: IZ83852

Reported component name: AIX 610 STD EDI

Reported component ID: 5765G6200

Reported release: 610

Status: CLOSED PER

PE: NoPE

HIPER: NoHIPER

Submitted date: 2010-08-30

Closed date: 2010-08-30

Last modified date: 2013-03-28

APAR is sysrouted FROM one or more of the following: IZ79109

APAR is sysrouted TO one or more of the following:

Fixed component name: AIX 610 STD EDI

Fixed component ID: 5765G6200

Applicable component levels

R610 PSY U837117   UP10/10/15 I 1000

PTF to Fileset Mapping

U837117 bos.mp64 6.1.5.4