在一个测试数据库上创建表空间出现了ORA-569错误。由于环境比较复杂,首先简单描述一下数据库环境信息。这个测试环境安装的是Oracle 1106 for Solaris 10 sparc 64bit的RAC环境,使用ASM作为共享数据文件的存储机制。
在RAC环境的一个节点上还建立了一个单实例的数据库,并把这个数据库的数据文件也放到了ASM实例上。
在这个单实例数据库上添加新的表空间时报错,代码如下:
SQL> select file_namefrom dba_data_files;
FILE_NAME
------------------------------------------------------------------------------
+DATA/test/datafile/system.533.668281219
+DATA/test/datafile/sysaux.534.668281227
+DATA/test/datafile/undotbs1.535.668281229
+DATA/test/datafile/users.537.668281241
SQL> create tablespacetest datafile '+DATA/test/datafile/test01.dbf' size 4096m;
create tablespace testdatafile '+DATA/test/datafile/test01.dbf' size 4096m
*
第 1 行出现错误:
ORA-01119: 创建数据库文件'+DATA/test/datafile/test01.dbf' 时出错
ORA-17502: ksfdcre: 4未能创建文件+DATA/test/datafile/test01.dbf
ORA-00569: Failed toacquire global enqueue.
ORA-00569: Failed to acquire global enqueue.
这个错误很少见,查了一下Oracle的官方报错文档的描述:
ORA-00569: Failed toacquire global enqueue.
Cause: A prior erroroccurred on one of the instances in the cluster. Typically errors are caused byshared pool resource contention.
Action: Check for and resolve prior errorson all instances in the cluster. If there is shared
pool resource contention, increase the SHARED_POOL_SIZE,DML_ LOCKS, PROCESSES, TRANSACTIONS, CLUSTER_DATABASE_INSTANCES and PARALLEL_MAX_SERVERSinitialization parameters.
文档虽然对问题进行了描述,不过从对错误信息和问题描述中看不出导致问题的真正原因。
查询了一下Metalink,从中找到了一些关于ORA-569的错误说明。不过这些问题都和当前错误有很大的不同:大部分出现这个错误的同时都会伴随ORA-600错误和ORA-4031错误。
看来借助Metalink解决这个问题的希望落空,只能自己想办法了。
前面已经提到当前环境比较复杂,这个节点上启动了两个实例。一个是单实例的数据库,另一个是RAC数据库中的一个节点,而且两个数据库的数据文件都存储在相同的ASM磁盘组中。
问题都有两面性,环境复杂也有复杂的好处,现在有一个简单的方法可以确定到底是数据库产生的问题还是ASM实例导致的问题:只需要登录RAC实例,执行类似的添加表空间的操作,就可检查是否会出现相同的错误。
bash-3.00$ export ORACLE_SID=ractest1
bash-3.00$ sqlplus "/as sysdba"
SQL*Plus: Release 11.1.0.6.0- Production on 星期三 2月 18 17:12:422009
Copyright (c) 1982, 2007,Oracle. All rights reserved.
已连接到空闲例程。
SQL> startup
ORACLE 例程已经启动。
Total System Global Area 1603887104 bytes
Fixed Size 2095208 bytes
Variable Size 741722008 bytes
Database Buffers 855638016 bytes
Redo Buffers 4431872 bytes
数据库装载完毕。
数据库已经打开。
SQL> CREATE TABLESPACETEST DATAFILE '+DATA/ractest/datafile/test01.dbf' SIZE 4096M;
CREATE TABLESPACE TESTDATAFILE '+DATA/ractest/datafile/test01.dbf' SIZE 4096M
*
第 1 行出现错误:
ORA-01119: 创建数据库文件'+DATA/ractest/datafile/test01.dbf' 时出错
ORA-17502: ksfdcre: 4未能创建文件+DATA/ractest/datafile/test01.dbf
ORA-00569: Failed toacquire global enqueue.
ORA-00569: Failed to acquire global enqueue.
相同的错误产生了,说明问题多半与ASM实例的状态有关系。登录ASM实例进行简单的检查:
bash-3.00$ export ORACLE_SID=+ASM1
bash-3.00$ sqlplus "/as sysdba"
SQL*Plus: Release 11.1.0.6.0- Production on 星期三 2月 18 17:33:122009
Copyright (c) 1982, 2007,Oracle. All rights reserved.
SQL> set pages 100lines 120
SQL> select instance_name,status from v$instance;
INSTANCE_NAME STATUS
-------------------------------- ------------------------
+ASM1 STARTED
由于ASM实例可以用来检查的动态视图太少了,从现有的视图也看不到什么特别的地方。实在没有什么太好的查错办法,只能重启数据库和ASM实例,再次检查问题:
bash-3.00$ export ORACLE_SID=test
bash-3.00$ sqlplus "/as sysdba"
SQL> shutdown immediate
SQL> exit
bash-3.00$ export ORACLE_SID=ractest1
SQL> shutdown immediate
SQL> exit
bash-3.00$ export ORACLE_SID=+ASM1
bash-3.00$ sqlplus "/as sysdba"
SQL> shutdown immediate
^CORA-01013: user requestedcancel of current operation
SQL> CONN / AS SYSDBA
已连接。
SQL> shutdown abort
ASM 实例已关闭
SQL> startup
ASM 实例已启动
Total System Global Area 284008448 bytes
Fixed Size 2087944 bytes
Variable Size 256754680 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
bash-3.00$ export ORACLE_SID=test
bash-3.00$ sqlplus "/as sysdba"
SQL> startup
ORACLE 例程已经启动。
数据库装载完毕。
数据库已经打开。
SQL> create tablespacetest datafile '+DATA/test/datafile/test01.dbf' size 4096m;
create tablespace testdatafile '+DATA/test/datafile/test01.dbf' size 4096m
*
第 1 行出现错误:
ORA-01119: 创建数据库文件'+DATA/test/datafile/test01.dbf' 时出错
ORA-17502: ksfdcre: 4未能创建文件+DATA/test/datafile/test01.dbf
ORA-00569: Failed toacquire global enqueue.
ORA-00569: Failed to acquire global enqueue.
可以看到重启ASM实例后问题仍然出现。不过ASM实例也是在两个节点上同时运行的,莫非是在另一个节点的ASM实例出现了问题:
bash-3.00$ export ORACLE_SID=+ASM2
bash-3.00$ sqlplus "/as sysdba"
SQL*Plus: Release 11.1.0.6.0- Production on 星期四 2月 19 16:38:382009
Copyright (c) 1982, 2007,Oracle. All rights reserved.
SQL> set pages 100lines 120
SQL> select instance_name,status from v$instance;
INSTANCE_NAME STATUS
-------------------------------- ------------------------
+ASM2 STARTED
检查ASM实例未发现异常,尝试重启ASM实例:
bash-3.00$ srvctl stop instance-d ractest -i ractest2
bash-3.00$ srvctl stop asm -n ser2
bash-3.00$ srvctl start asm -n ser2
再次登录test数据库,执行CREATE TABLESPACE语句:
bash-3.00$ export ORACLE_SID=test
bash-3.00$ sqlplus "/as sysdba"
SQL> set pages 100 lines 120
SQL> create tablespace test datafile '+DATA/test/datafile/test01.dbf' size 4096m;
表空间已创建。
看来问题果然和ASM实例状态不正常有关。
检查ASM实例2的alert文件,发现在运行CREATE TABLESPACE语句对应的时间点出现了ORA-4031错误:
Wed Feb 18 15:50:04 2009
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2412):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2412/+ASM2_lmd0_3099_i2412.trc
Wed Feb 18 15:50:05 2009
Trace dumping is performingid=[cdmp_20090218155005]
WARNING: ran out of sharedpool for GES enqueue object.
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2413):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2413/+ASM2_lmd0_3099_i2413.trc
Trace dumping is performingid=[cdmp_20090218155013]
Wed Feb 18 17:17:23 2009
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2414):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2414/+ASM2_lmd0_3099_i2414.trc
WARNING: ran out of sharedpool for GES enqueue object.
Wed Feb 18 17:17:24 2009
Trace dumping is performingid=[cdmp_20090218171724]
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2415):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2415/+ASM2_lmd0_3099_i2415.trc
Trace dumping is performingid=[cdmp_20090218171732]
Wed Feb 18 17:18:02 2009
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2416):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2416/+ASM2_lmd0_3099_i2416.trc
WARNING: ran out of sharedpool for GES enqueue object.
Wed Feb 18 17:18:03 2009
Trace dumping is performingid=[cdmp_20090218171803]
Errors in file /data/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_3099.trc (incident=2417):
ORA-04031: unable toallocate 3512 bytes of shared memory ("shared pool","unknown object","sgaheap(1,0)","ges enqueues")
Incident details in:/data/oracle/diag/asm/+asm/+ASM2/incident/incdir_2417/+ASM2_lmd0_3099_i2417.trc
Trace dumping is performing id=[cdmp_20090218171811]
前面ORA-569报错再加上这个ORA-4031报错,现在已经和Metalink里面的问题描述一致了,而且这个ORA-4031报错信息也很明显,问题在于分配全局队列资源时出现了错误。
检查ASM实例的sga,发现:
SQL> show sga
Total System Global Area 284008448 bytes
Fixed Size 2087944 bytes
Variable Size 256754680 bytes
ASM Cache 25165824 bytes
对于同时支持多个数据库实例的ASM实例而言,200MB的SGA显然太小了,和大部分其他Oracle默认参数一样,默认的ASM实例参数也是偏小的。
前面的文中还描述过:由于ASM实例的PROCESS参数太小导致ASM实例无法登录的问题。因此要选择ASM作为产品数据库的存储方式,就要求ASM实例在建立时就要仔细地设置:很多的默认参数须要调整后才能满足正式环境的需要。
本篇摘取自《Oracle DBA手记》第02篇 诊断案例篇 - ASM案例分析与诊断(作者:杨廷琨)