我们在创建Oracle DB的时候,如果是在Linux系统的时候,会进行一些相关参数的设置。如shmmax,所在位置/proc/sys/kernel/shmmax,通常在/etc/sysctl.conf中配置这个参数值。
shmmax内核参数定义的是系统允许的单个共享内存段的最大值。做个试验:
kernel.shmmax = 17179869180 [oracle@021Y-SH-BKAP ~]$ ipcs -sa ------ Shared Memory Segments -------- key shmid owner perms bytes nattch status 0xfc8fb410 17498130 oracle 660 4096 0 ------ Semaphore Arrays -------- key semid owner perms nsems 0xbdab17f8 14581763 oracle 660 154 ------ Message Queues -------- key msqid owner perms used-bytes messages
我尝试修改shmmax大小:
kernel.shmmax = 8589934592 sysctl -p 重启数据库,通过ipcs命令查看,内存被分成多个共享段来满足SGA的设置需求。 [oracle@021Y-SH-BKAP ~]$ ipcs -sa ------ Shared Memory Segments -------- key shmid owner perms bytes nattch status 0x00000000 17563666 oracle 660 4096 0 0x00000000 17596456 oracle 660 4096 0 0x00000000 17629225 oracle 660 4096 0 0x00000000 17661994 oracle 660 4096 0 0x00000000 17727531 oracle 660 4096 0 0x00000000 17760300 oracle 660 4096 0 0x00000000 17793069 oracle 660 4096 0 0x00000000 17858606 oracle 660 4096 0 0x00000000 17891375 oracle 660 4096 0 0x00000000 17924144 oracle 660 4096 0 0x00000000 17989681 oracle 660 4096 0 0x00000000 18022450 oracle 660 4096 0 0x00000000 18055219 oracle 660 4096 0 0x00000000 18186292 oracle 660 4096 0 0x00000000 18219061 oracle 660 4096 0 0xfc8fb410 19267638 oracle 660 4096 0 ------ Semaphore Arrays -------- key semid owner perms nsems 0xbdab17f8 15237123 oracle 660 154 ------ Message Queues -------- key msqid owner perms used-bytes messages
在这里我尝试过将shmmax分配的非常小的情况,甚至影响到了使用sqlplus连接实例,也就是说sqlplus连接实例也需要占用一定的内存
ERROR: ORA-12547: TNS:lost contact Process J000 died, see its trace file kkjcre1p: unable to spawn jobq slave process Errors in file /DBBK/oracle/diag/rdbms/orcl/orcl/trace/orcl_cjq0_32725.trc: *** 2015-02-20 14:04:49.793 Process J000 is dead (pid=832 req_ver=6 cur_ver=6 state=KSOSP_SPAWNED).
告警提示J000进程已经挂掉了,逐步放大shmmax,虽然数据库可以起来,但观察告警日志,仍有报错:
Errors in file /DBBK/oracle/diag/rdbms/orcl/orcl/trace/orcl_j000_4353.trc (incident=119768): ORA-07445: exception encountered: core dump [lxgcvp()+5237] [SIGSEGV] [ADDR:0x7F74B376A000] [PC:0x84FB395] [Invalid permissions for mapped object] [] Incident details in: /DBBK/oracle/diag/rdbms/orcl/orcl/incident/incdir_119768/orcl_j000_4353_i119768.trc Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x70F26DE, kohdmpins()+78] [flags: 0x0, count: 2] Errors in file /DBBK/oracle/diag/rdbms/orcl/orcl/trace/orcl_j000_4353.trc (incident=119769): ORA-07445: exception encountered: core dump [kohdmpins()+78] [SIGSEGV] [ADDR:0x0] [PC:0x70F26DE] [SI_KERNEL(general_protection)] [] ORA-07445: exception encountered: core dump [lxgcvp()+5237] [SIGSEGV] [ADDR:0x7F74B376A000] [PC:0x84FB395] [Invalid permissions for mapped object] [] Incident details in: /DBBK/oracle/diag/rdbms/orcl/orcl/incident/incdir_119769/orcl_j000_4353_i119769.trcErrors in file /DBBK/oracle/diag/rdbms/orcl/orcl/trace/orcl_j000_2179.trc ----- Error Stack Dump ----- ----- Current SQL Statement for this session (sql_id=4jrfrtx4u6zcx) ---- 这边报错语句姑且不看,但通过网上了解Oracle有一个叫Error Stack的诊断,类似于10046,用法另开篇介绍。 这里的报错我打算也放到后续再分析。
针对某个进程还可以通过pmap系统命令查看该进程分配在哪些共享内存段中:
[oracle@021Y-SH-BKAP ~]$ ps -ef | grep dbw oracle 2779 1 0 14:20 ? 00:00:00 ora_dbw0_orcl [oracle@021Y-SH-BKAP ~]$ pmap 2779 2779: ora_dbw0_orcl 0000000000400000 155144K r-x-- /DBBK/oracle/product/11.2.0.1.0/bin/oracle 0000000009d81000 12400K rwx-- /DBBK/oracle/product/11.2.0.1.0/bin/oracle 000000000a99d000 284K rwx-- [ anon ] 000000000bfee000 536K rwx-- [ anon ] 0000000060000000 4K r-xs- /dev/shm/ora_orcl_19333174_0 0000000060001000 65532K rwxs- /dev/shm/ora_orcl_19333174_0 0000000064000000 65536K rwxs- /dev/shm/ora_orcl_19333174_1 0000000068000000 65536K rwxs- /dev/shm/ora_orcl_19333174_2 000000006c000000 65536K rwxs- /dev/shm/ora_orcl_19333174_3 0000000070000000 65536K rwxs- /dev/shm/ora_orcl_19333174_4 0000000074000000 65536K rwxs- /dev/shm/ora_orcl_19333174_5 0000000078000000 65536K rwxs- /dev/shm/ora_orcl_19333174_6 000000007c000000 65536K rwxs- /dev/shm/ora_orcl_19333174_7 0000000080000000 65536K rwxs- /dev/shm/ora_orcl_19333174_8 0000000084000000 65536K rwxs- /dev/shm/ora_orcl_19333174_9 0000000088000000 65536K rwxs- /dev/shm/ora_orcl_19333174_10 000000008c000000 65536K rwxs- /dev/shm/ora_orcl_19333174_11 0000000090000000 65536K rwxs- /dev/shm/ora_orcl_19333174_12 0000000094000000 65536K rwxs- /dev/shm/ora_orcl_19333174_13 0000000098000000 65536K rwxs- /dev/shm/ora_orcl_19333174_14 000000009c000000 65536K rwxs- /dev/shm/ora_orcl_19333174_15 00000000a0000000 65536K rwxs- /dev/shm/ora_orcl_19333174_16 00000000a4000000 65536K rwxs- /dev/shm/ora_orcl_19333174_17 00000000a8000000 65536K rwxs- /dev/shm/ora_orcl_19333174_18
把shmmax加大到足够在一个内存段中就容纳下SGA时,上述的问题将不在遇到。不过我这里遇到一个无法释放内存段的情况(我已经关闭数据库了。但内存段还是没有释放)
通过命令ipcrm手动释放,如果有多个可以通过脚本释放:
for i in `ipcs -sa | grep -v shmid | awk '{print $2}'` ; do ipcrm -m $i ; done