前面一篇文章讲述了oracle cpu_count过高引发了一个bug,导数oracle无法启动,详见连接

http://blog.csdn.net/xionglang7/article/details/9181791

这个是测试库的问题,但是过了两天生产库也出现同样的问题了,所以大家彻查了一下原因。

目前了解的情况是与CPU核数过高、数据库的并行参数(paraller_max_server)设置过高(256)等有关,ORACLE的算法在低版本有BUG(在cpu核数超过128、并行参数设置过高时容易触发)。

 Alert日志显示在进行一个insert语句时触发了该bug引起宕机。之后重启时由于并行恢复参数设置问题,同样触发了BUG,导致无法启动。经查询metalink,通过降低CPU核数绕过该bug,完成启动。目前metalink上未找到当前版本的补丁,后续将开展升级整改。

那么我们就来看看paraller_max_server这个东西是怎么计算的?

metlink上记录:

 

With 11.2.0.2 there is a new method to compute the default for PARALLEL_MAX_SERVERS.
 
In the Oracle Rdbms Reference Guide we find:
parallel_max_servers = PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5
In the formula, the value assigned to concurrent_parallel_users running at the default degree of parallelism on an 
instance is dependent on the memory management setting. 
 - If automatic memory management is disabled (manual mode), then the value of concurrent_parallel_users is 1. 
 - If PGA automatic memory management is enabled, then the value of concurrent_parallel_users is 2. 
 - If global memory management or SGA memory target is used in addition to PGA automatic memory management, 
   then the value of concurrent_parallel_users is 4.
The value is capped by processes -15 (this is true for versions prior 11.2.0.2 as well).
As example we have the following values
 
parallel_threads_per_cpu  = 2
cpu_count                 = 4
pga_aggregate_target      = 500M
sga_target                = 900M
processes                 = 150
 
parallel_max_servers = 2 * 4 * 4 * 5 = 160
parallel_max_servers = min( 150-15 , 160 ) = 135
So with these values we get a default of 135 for parallel_max_servers.
Note if the parallel_max_servers is reduced due to value of processes, then you see similar to the following in alert log (e.g. at instance start up):
Mon May 06 18:43:06 2013
Adjusting the default value of parameter parallel_max_servers
from 160 to 135 due to the value of parameter processes (150)
Starting ORACLE instance (normal) 
所以如果这个参数太高,并行的进程就太大了,导数数据库无法承受。

这个参数值可以通过下面的语句查询:

select * from dba_hist_parameter b where b.parameter_name='parallel_max_servers' order by b.snap_id desc

从上面的内容看出oracle并行进程不能设置的太高。