OS: CentOS 7.9

Database: Oracle 11.2.0.4

Environment: single-instance database on ASM

Problem description: after GI was installed successfully, DBCA failed while creating the database with errors CRS-0259, PRCR-1006, and PRCR-1071, as shown below:

[Screenshot: DBCA error dialog reporting CRS-0259 / PRCR-1006 / PRCR-1071]

Check the DBCA trace log (typically under $ORACLE_BASE/cfgtoollogs/dbca/<DBNAME>/):

[oracle@histest orcl]$ tail -3000 trace.log | grep PRCR
[Thread-95] [ 2022-07-17 09:27:28.345 CST ] [HASIDBRegistrationStep.executeImpl:253] Exception while registering with HAS
PRCR-1006 : Failed to add resource ora.orcl.db for orcl
PRCR-1071 : Failed to register or update resource ora.orcl.db
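
To see the surrounding exception stack rather than just the PRCR lines, grepping with a little context helps (plain grep options, nothing Oracle-specific):

[oracle@histest orcl]$ grep -B 2 -A 5 'PRCR-1006' trace.log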

Oracle tracks this as Bug 11886915: CRS-0259 WHEN REGISTERING THE DATABASE WITH ORACLE RESTART.

Resolution:

[grid@histest ~]$ crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'histest'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'histest'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'histest'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'histest' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'histest' succeeded
CRS-2679: Attempting to clean 'ora.DATA.dg' on 'histest'
CRS-2681: Clean of 'ora.DATA.dg' on 'histest' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'histest'
CRS-2677: Stop of 'ora.asm' on 'histest' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'histest'
CRS-2677: Stop of 'ora.cssd' on 'histest' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'histest'
CRS-2677: Stop of 'ora.evmd' on 'histest' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'histest' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Restarting HAS then failed:

[grid@histest ~]$ crsctl start has
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
The /var/log/messages log showed:
Jul 17 09:46:08 histest su: (to grid) root on pts/1
Jul 17 09:46:08 histest dbus[980]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jul 17 09:46:08 histest dbus[980]: [system] Successfully activated service 'org.freedesktop.problems'
Jul 17 09:48:17 histest su: (to oracle) root on pts/2
Jul 17 09:49:54 histest root: exec /u01/app/grid/product/11.2.0/grid/perl/bin/perl -I/u01/app/grid/product/11.2.0/grid/perl/lib /u01/app/grid/product/11.2.0/grid/bin/crswrapexece.pl /u01/app/grid/product/11.2.0/grid/crs/install/s_crsconfig_histest_env.txt /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin "reboot"
Jul 17 09:50:01 histest systemd: Started Session 11 of user root.
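
Beyond /var/log/messages, the 11.2 ohasd trace usually carries more detail on why startup stalls; the path below assumes this environment's GRID_HOME and hostname:

[grid@histest ~]$ tail -50 /u01/app/grid/product/11.2.0/grid/log/histest/ohasd/ohasd.log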

The CRS-4124/CRS-4000 startup failure was worked through as follows.

1. Check the permissions on /u01/app/grid/product/11.2.0/grid/perl/bin/perl

Background: this error is known to occur when CRS/HAS is started manually after a system reboot (Doc ID 1624661.1). No reboot had happened here, but the note says the ownership of the perl binary under GRID_HOME can get changed for some reason; it should be owned by the grid user, not the oracle user.

[root@histest log]# ll /u01/app/grid/product/11.2.0/grid/perl/bin/perl
-rwxr-xr-x 1 grid oinstall 1424555 Jul 21 2011 /u01/app/grid/product/11.2.0/grid/perl/bin/perl

As shown above, the perl binary is owned by grid:oinstall (the note lists its permissions as 700 rather than the 755 seen here), so this file was ruled out as the cause.
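
Had the ownership actually drifted to oracle, the fix per Doc ID 1624661.1 would amount to simply restoring it (a sketch; run as root):

[root@histest log]# chown grid:oinstall /u01/app/grid/product/11.2.0/grid/perl/bin/perl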

2. Several online posts suggest fixing this by changing the ownership of /var/tmp/.oracle/npohasd.

[root@histest .oracle]# pwd
/var/tmp/.oracle
[root@histest .oracle]# ll
total 0
prw-r--r-- 1 grid oinstall 0 Jul 17 08:51 npohasd
[root@histest .oracle]# chown -R root:oinstall npohasd
[root@histest .oracle]# ll
total 0
prw-r--r-- 1 root oinstall 0 Jul 17 08:51 npohasd

Result: HAS still failed to start after the change, so the ownership was reverted to grid:oinstall.
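
For completeness, the revert is just the chown in reverse:

[root@histest .oracle]# chown grid:oinstall /var/tmp/.oracle/npohasd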

3. MOS Doc ID 1612325.1 turned out to describe a very similar scenario.

The note attributes the failure to:

Permission issue. Relinking the binaries and restarting the server brought init.ohasd up fine, but ohasd and the other daemons would not start and no sockets were created. When the OS runs S96ohasd, it waits for init.ohasd to write the pipe. What happened here is that init.ohasd was started, then all the socket files were removed manually; when ohasd is started again, it waits forever because those socket files are gone.

Solution:

Clear all sockets under /var/tmp/.oracle or /tmp/.oracle if any and then open two terminals of the same node, where stack is not coming up.
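
The note does not show the clearing step itself. A cautious sketch (moving the socket files aside rather than deleting them keeps the step reversible; check /tmp/.oracle as well if it exists):

[root@leo ~]# mkdir /var/tmp/.oracle_bak
[root@leo ~]# mv /var/tmp/.oracle/* /var/tmp/.oracle_bak/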

Session 1:

[root@leo ~]# /u01/app/grid/product/11.2.0/grid/bin/crsctl start has
CRS-4123: Oracle High Availability Services has been started.

Immediately after issuing start has in session 1, run the following in session 2; once HAS reports that it has started, press CTRL+C to terminate the dd command.

[root@leo .oracle]# dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
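
Why the dd trick helps: npohasd is a named pipe (note the leading p in the ls output earlier). Opening a FIFO for writing blocks until another process opens it for reading, and the handshake between init.ohasd and ohasd startup appears to stall exactly there; dd stands in as the reader, unblocking the writer so startup can continue. The file type is easy to confirm:

[root@leo .oracle]# file /var/tmp/.oracle/npohasd
/var/tmp/.oracle/npohasd: fifo (named pipe)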

Check the CRS stack status:

Initial state:

[grid@leo ~]$ ps -ef|grep d.bin
grid 2080 1 0 10:32 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid 2857 2799 0 10:49 pts/2 00:00:00 grep --color=auto d.bin

After HAS started:

[grid@leo ~]$ ps -ef|grep d.bin
grid 2080 1 0 10:32 ? 00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid 3124 1 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid 3139 1 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmd.bin
grid 3141 1 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid 3176 3139 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmlogger.bin -o /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.log
grid 3183 1 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid 3206 1 0 11:00 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
grid 3241 2799 0 11:00 pts/2 00:00:00 grep --color=auto d.bin

Confirm the cluster resource status:

[grid@leo ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE leo
ora.LISTENER.lsnr
ONLINE ONLINE leo
ora.asm
ONLINE ONLINE leo Started
ora.ons
OFFLINE OFFLINE leo
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
1 ONLINE ONLINE leo
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE leo
ora.orcl.db
1 ONLINE ONLINE leo Open
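
As an extra sanity check, crsctl check has should now report the stack online (CRS-4638 is the expected 11.2 message):

[grid@leo ~]$ crsctl check has
CRS-4638: Oracle High Availability Services is online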

Once the stack was back up, re-register the database.

[oracle@histest ~]$ srvctl add database -d orcl -o /u01/app/oracle/product/11.2.0/db_1
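
To confirm the registration took, srvctl can print back the stored configuration (a quick check, nothing more):

[oracle@histest ~]$ srvctl config database -d orcl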

DBCA was then rerun and the database was created with no further errors.

References:

https://blog.csdn.net/Evils798/article/details/8692898

http://blog.itpub.net/7728585/viewspace-1806208/