OSWatcher is the script-based tool that Oracle officially recommends for monitoring runtime state at the OS level. It is installed by default on Exadata.

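But how does Exadata start OSWatcher automatically after the system boots? And how do we change the OSWatcher parameters to adjust the monitoring and log retention policy?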

This article walks through the chain of script calls between system boot and a running OSWatcher, and shows how to modify the OSWatcher parameters.

1. First check the rc.local file, which references /etc/rc.d/rc.Oracle.Exadata

# vi /etc/rc.d/rc.local
-----------------------------------------
########### BEGIN DO NOT REMOVE Added by Oracle Exadata ###########
if [ -x /etc/rc.d/rc.Oracle.Exadata ]; then
  . /etc/rc.d/rc.Oracle.Exadata <<<<<<<<<<<<<<<<<<<<<< This script is run automatically when the OS starts
fi
########### END DO NOT REMOVE Added by Oracle Exadata ###########
-----------------------------------------

2. Look at rc.Oracle.Exadata and find the call to /opt/oracle.cellos/vldrun -all

# vi /etc/rc.d/rc.Oracle.Exadata
-----------------------------------------
# Perform validations step
/opt/oracle.cellos/vldrun -all <<<<<<<<<<<<<<<<<<<<<< This script is run automatically when the OS starts
-----------------------------------------
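
The two lookups in steps 1 and 2 can also be done without opening an editor. The sketch below is only an illustration: `file_references` and `check_boot_chain` are hypothetical helper names, and the default paths are the ones shown above. It checks that each file mentions the next link of the chain on a line that is not commented out.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Exadata or OSWatcher.

# file_references FILE TEXT
# Succeeds if FILE contains TEXT on a line that is not a comment.
file_references() {
    local file=$1 text=$2
    grep -v '^[[:space:]]*#' "$file" 2>/dev/null | grep -qF -- "$text"
}

# check_boot_chain [RC_LOCAL] [RC_EXADATA]
# Verifies rc.local -> rc.Oracle.Exadata -> vldrun, using the paths
# from the article as defaults.
check_boot_chain() {
    local rc_local=${1:-/etc/rc.d/rc.local}
    local rc_exadata=${2:-/etc/rc.d/rc.Oracle.Exadata}

    if ! file_references "$rc_local" "rc.Oracle.Exadata"; then
        echo "rc.Oracle.Exadata is not referenced in $rc_local"
        return 1
    fi
    if ! file_references "$rc_exadata" "/opt/oracle.cellos/vldrun"; then
        echo "vldrun is not called from $rc_exadata"
        return 1
    fi
    echo "boot chain OK"
}
```

Running `check_boot_chain` as root with no arguments uses the default paths; passing two file names lets you try it against copies of the files first.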

3. Check the current OSWatcher settings: a snapshot every 15 seconds, logs kept for 168 hours (7 days), bzip2 compression, and a maximum log size of 3 GB

# ps -ef | grep OSW
root 15962 1 0 04:00 pts/1 00:00:00 /bin/ksh ./OSWatcher.sh 15 168 bzip2 3
root 15994 15962 0 04:00 pts/1 00:00:00 /bin/ksh ./OSWatcherFM.sh 168 3
root 16272 9529 0 04:00 pts/1 00:00:00 grep OSW
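
To make the meaning of the four arguments explicit, here is a small sketch that labels them. The function names are hypothetical, and the interpretation of each position (interval, retention hours, compression tool, maximum size in GB) is the one described in this step, not something the script reports itself.

```shell
#!/bin/bash
# Hypothetical helpers, not part of OSWatcher.

# osw_args_from_ps
# Reads `ps -ef` output on stdin and prints the arguments that follow
# "OSWatcher.sh " on the first matching line. OSWatcherFM.sh and the
# grep line do not match, because the pattern requires ".sh" directly
# after "OSWatcher".
osw_args_from_ps() {
    sed -n 's/.*OSWatcher\.sh \(.*\)$/\1/p' | head -n 1
}

# describe_osw_args INTERVAL HOURS COMPRESSOR MAX_GB
# Prints the four positional arguments with labels.
describe_osw_args() {
    local interval=$1 hours=$2 compressor=$3 max_gb=$4
    echo "interval_seconds=$interval"
    echo "retention_hours=$hours"
    echo "retention_days=$((hours / 24))"
    echo "compression=$compressor"
    echo "max_log_size_gb=$max_gb"
}
```

On a live system the two can be combined as `describe_osw_args $(ps -ef | osw_args_from_ps)`. For the process shown above this reports a 15 second interval, 168 hours (7 days) of retention, bzip2, and 3 GB.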

4. The /opt/oracle.cellos/vldrun script calls the oswatcher script to start OSWatcher

# cd /opt/oracle.cellos/validations/init.d
# ls -al oswatcher
-r-xr-x--- 1 root root 5128 Aug 19 03:39 oswatcher
# chmod 750 oswatcher <<<<<<<<<<<<<<<<<<<<<<<< Change the permissions so that the script can be edited
# ls -al oswatcher
-rwxr-x--- 1 root root 5128 Aug 19 03:39 oswatcher
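
Before editing a script that runs at every boot, it can be worth keeping a copy of the original. This is an extra precaution that was not part of the procedure above. `backup_file` is a hypothetical helper, and placing the copy outside init.d is a cautious assumption so that the copy cannot be mistaken for a validation script.

```shell
#!/bin/bash
# Hypothetical helper, not part of Exadata or OSWatcher.

# backup_file FILE DEST_DIR
# Copies FILE into DEST_DIR under a timestamped name, preserving mode
# and timestamps (and ownership when run as root), then prints the
# path of the copy.
backup_file() {
    local file=$1 dest_dir=$2
    local copy
    copy="${dest_dir}/$(basename "$file").$(date +%Y%m%d%H%M%S).bak"
    cp -p -- "$file" "$copy" || return 1
    echo "$copy"
}
```

For example, `backup_file /opt/oracle.cellos/validations/init.d/oswatcher /root` would leave the copy in /root.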

5. Check the current settings in the oswatcher script and modify them (here the maximum log size is raised from 3 GB to 4 GB)

# vi oswatcher
----------------------------------------
fi
(umask 0037; nohup ./startOSW.sh 15 168 bzip2 4 >/var/log/cellos/start_oswatcher.log 2>&1 &)& <<<<<<<<<<<< Changing the arguments on this line changes how OSWatcher is started
# Dont direct logs to startosw.log. It grows too large and fast
# (nohup ./startOSW.sh 15 168 bzip2 3 >/dev/null 2>&1 &)&
popd >/dev/null
----------------------------------------
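
The same change can be scripted instead of made in vi. The following is a minimal sketch, assuming the active line has the form `nohup ./startOSW.sh <interval> <hours> <compressor> <size>` as in the listing above. `set_osw_max_size` is a hypothetical name. Commented-out lines are left untouched, and the file is rewritten in place so that its owner and permissions stay as they are (it must already be writable, see step 4).

```shell
#!/bin/bash
# Hypothetical helper, not part of Exadata or OSWatcher.

# set_osw_max_size FILE NEW_SIZE_GB
# Replaces the fourth argument of the uncommented startOSW.sh call.
set_osw_max_size() {
    local file=$1 new_size=$2 tmp status

    # Accept only a whole number so nothing else can end up in the script.
    case $new_size in
        ''|*[!0-9]*) echo "size must be a whole number of GB" >&2; return 1 ;;
    esac

    tmp=$(mktemp) || return 1
    # Skip comment lines, then swap the number that follows the
    # compressor name. The result is written back with cat so the
    # original file keeps its inode, owner and mode.
    sed -E '/^[[:space:]]*#/!s|(nohup \./startOSW\.sh [0-9]+ [0-9]+ [^ ]+ )[0-9]+|\1'"$new_size"'|' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

With this in place, `set_osw_max_size /opt/oracle.cellos/validations/init.d/oswatcher 4` would make the edit shown in this step.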

6. Stop OSWatcher

# /opt/oracle.oswatcher/osw/stopOSW.sh
# ps -ef | grep OSW
root 10528 9529 0 03:59 pts/1 00:00:00 grep OSW
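
In the output above the only remaining match is the grep command itself, which is easy to misread as a surviving process. As a sketch, the hypothetical helper below counts only real OSWatcher.sh and OSWatcherFM.sh lines in `ps -ef` output, so a stopped OSWatcher gives a plain 0.

```shell
#!/bin/bash
# Hypothetical helper, not part of OSWatcher.

# count_osw_procs
# Reads `ps -ef` output on stdin and prints how many lines run
# OSWatcher.sh or OSWatcherFM.sh. The pattern does not match its own
# text, so the grep process is never counted.
count_osw_procs() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable in scripts that run with "set -e".
    grep -cE 'OSWatcher(FM)?\.sh( |$)' || true
}
```

Usage: `ps -ef | count_osw_procs`.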
        
7. Start OSWatcher manually
# /opt/oracle.cellos/vldrun -script oswatcher
Logging started to /var/log/cellos/validations.log
Command line is ./validations/bin/vldrun.pl -quiet -script oswatcher
Run validation oswatcher - PASSED
The each boot completed with SUCCESS

8. Check the OSWatcher settings again; the maximum log size is now 4 GB

# ps -ef|grep OSW
root 109041 1 0 03:37 pts/1 00:00:00 /bin/ksh ./OSWatcher.sh 15 168 bzip2 4
root 109073 109041 0 03:37 pts/1 00:00:00 /bin/ksh ./OSWatcherFM.sh 168 4
root 109348 59019 0 03:37 pts/1 00:00:00 grep OSW

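For more details on OSWatcher, see the official documents below. (Note: Doc ID 1585389.1 now describes how to permanently change the OSWatcher settings on Exadata, but it did not include this information when I ran the test above.)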

OSWatcher does not Retain Data for the Full Retention Period Specified (Doc ID 1585389.1)

OSWatcher Black Box (Includes: [Video]) (Doc ID 301137.1)