RAC: "libskgxns.a does not exist" error when reconfiguring HAS
Feb 03 2015
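During a routine health check, ASM on node 2 of a customer's two-node RAC (version 11.2.0.1.0, running on AIX 6) was found to be down. The inspection staff tried to start and recover it without success. When I stepped in, I found that the root cause of ASM failing to start was that the clusterware itself could not start.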
SQL> startup nomount
ORA-01078: Message 1078 not found; No message file for product=RDBMS, facility=ORA
ORA-29701: Message 29701 not found; No message file for product=RDBMS, facility=ORA
SQL> exit
-bash-4.1# ./crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
-bash-4.1# ./crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with error(s).
CRS-4000: Command Stop failed, or completed with errors.
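Analysis showed that the hosts file on node 2 was wrong: the private IP was set to an address that did not exist. I corrected it based on node 1's hosts file and the IP addresses actually configured on the two hosts, but the problem persisted. I then tried to reconfigure HAS by rerunning roothas.pl, and ran into the "libskgxns.a does not exist" error.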
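The hosts-file comparison described above can be sketched roughly as follows. This is only an illustration, not what was run on site: the helper name lookup_host_ip and the host name yunsuan2-priv are assumptions (the post does not give the private host names), and it presumes a copy of node 1's hosts file is available locally.

```shell
#!/bin/sh
# Sketch: print the first address that a hosts-format file maps to a name.
# lookup_host_ip is a hypothetical helper, not an Oracle or AIX tool.
lookup_host_ip() {
    # $1 = hosts file, $2 = host name or alias to look up
    awk -v name="$2" '
        { sub(/#.*/, "") }   # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Example (hypothetical file and host names):
# a=$(lookup_host_ip /tmp/hosts.node1 yunsuan2-priv)
# b=$(lookup_host_ip /etc/hosts yunsuan2-priv)
# [ "$a" = "$b" ] || echo "private IP differs: node1=$a node2=$b"
```

The address it prints still has to be checked against what is actually configured on the interconnect interface; the script only shows whether the two hosts files agree.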
-bash-4.1# ./roothas.pl
2015-01-27 16:05:35: Checking for super user privileges
2015-01-27 16:05:35: User has super user privileges
2015-01-27 16:05:35: Parsing the host name
Using configuration parameter file: ./crsconfig_params
The oracle binary is currently linked with RAC enabled.
Please execute the following steps to relink oracle binary and rerun the command with RAC disabled:
cd <crshome>
setenv ORACLE_HOME pwd
cd rdbms/lib
make -f ins_rdbms.mk rac_off ioracle
Following the prompt above, I ran make -f ins_rdbms.mk rac_off ioracle, which failed.
-bash-4.1# pwd
/opt/app/11.2.0/grid
-bash-4.1# export ORACLE_HOME=/opt/app/11.2.0/grid
-bash-4.1# cd rdbms/lib/
-bash-4.1# make -f ins_rdbms.mk rac_off ioracle
rm -f /opt/app/11.2.0/grid/lib/libskgxp11.so
cp /opt/app/11.2.0/grid/lib//libskgxpg.so /opt/app/11.2.0/grid/lib/libskgxp11.so
rm -f /opt/app/11.2.0/grid/lib/libskgxn2.a
cp /opt/app/11.2.0/grid/lib//libskgxnr.a /opt/app/11.2.0/grid/lib/libskgxn2.a
rm -f /opt/app/11.2.0/grid/lib/libskgxn2.a
cp /opt/app/11.2.0/grid/lib//libskgxns.a /opt/app/11.2.0/grid/lib/libskgxn2.a
cp: /opt/app/11.2.0/grid/lib//libskgxns.a: A file or directory in the path name does not exist.
make: 1254-004 The error code from the last command is 1.
The file libskgxns.a is indeed missing from the /opt/app/11.2.0/grid/lib directory.
-bash-4.1$ ls -l /opt/app/11.2.0/grid/lib/libskgxns.a
ls: 0653-341 The file /opt/app/11.2.0/grid/lib/libskgxns.a does not exist.
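Rather than checking one file at a time, the inputs that the rac_off target copies out of the lib directory can be checked in one pass. The sketch below is an assumption-laden illustration: the three file names are taken only from the make output shown in this post (other versions or platforms may differ), and missing_relink_inputs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report which of the files copied by "make -f ins_rdbms.mk rac_off"
# are absent from <grid home>/lib. The list comes from the make output in
# this post; it is not an official or complete list.
missing_relink_inputs() {
    grid_home=$1
    for f in libskgxpg.so libskgxnr.a libskgxns.a; do
        # Print the name only when the file is not there
        [ -f "$grid_home/lib/$f" ] || echo "$f"
    done
}

# Example call, using the Grid home from this post:
# missing_relink_inputs /opt/app/11.2.0/grid
```

An empty result means all three files are in place; on the system described here it would have printed libskgxns.a.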
A search on MOS shows that this is a bug (Bug 9777859); see RAC Turned off and relink with missing libskgxns.a file (Doc ID 1290438.1). The fix is simply to copy the file of the same name from $GRID_HOME/rdbms/lib into $GRID_HOME/lib.
-bash-4.1$ cd /opt/app/11.2.0/grid/rdbms/lib
-bash-4.1$ cp libskgxns.a /opt/app/11.2.0/grid/lib
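The same copy can be wrapped in a small guard so that it does nothing when the file is already in place and fails loudly when the source copy is missing too. This is a minimal sketch under the directory layout shown in this post; restore_libskgxns is a hypothetical helper name, not something shipped by Oracle.

```shell
#!/bin/sh
# Sketch: copy libskgxns.a from <grid home>/rdbms/lib to <grid home>/lib,
# but only when the destination is missing.
restore_libskgxns() {
    grid_home=$1
    src="$grid_home/rdbms/lib/libskgxns.a"
    dst="$grid_home/lib/libskgxns.a"

    if [ -f "$dst" ]; then
        echo "present: $dst"
        return 0
    fi
    if [ ! -f "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    # -p keeps the mode and timestamps of the original archive
    cp -p "$src" "$dst" && echo "copied: $src -> $dst"
}

# Example call, using the Grid home from this post:
# restore_libskgxns /opt/app/11.2.0/grid
```

Run it as the owner of the Grid home so that the copied file ends up with the same ownership as its neighbours.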
After that, make -f ins_rdbms.mk rac_off ioracle completes successfully.
-bash-4.1# make -f ins_rdbms.mk rac_off ioracle
rm -f /opt/app/11.2.0/grid/lib/libskgxp11.so
cp /opt/app/11.2.0/grid/lib//libskgxpg.so /opt/app/11.2.0/grid/lib/libskgxp11.so
rm -f /opt/app/11.2.0/grid/lib/libskgxn2.a
cp /opt/app/11.2.0/grid/lib//libskgxnr.a /opt/app/11.2.0/grid/lib/libskgxn2.a
rm -f /opt/app/11.2.0/grid/lib/libskgxn2.a
cp /opt/app/11.2.0/grid/lib//libskgxns.a /opt/app/11.2.0/grid/lib/libskgxn2.a
/bin/ar -X64 d /opt/app/11.2.0/grid/rdbms/lib/libknlopt.a kcsm.o
/bin/ar -X64 cr /opt/app/11.2.0/grid/rdbms/lib/libknlopt.a /opt/app/11.2.0/grid/rdbms/lib/ksnkcs.o
Target "rac_off" is up to date.
chmod 755 /opt/app/11.2.0/grid/bin
- Linking Oracle
rm -f /opt/app/11.2.0/grid/rdbms/lib/oracle
... ...
Reconfigure HAS again.
-bash-4.1# ./roothas.pl
2015-01-27 16:42:23: Checking for super user privileges
2015-01-27 16:42:23: User has super user privileges
2015-01-27 16:42:23: Parsing the host name
Using configuration parameter file: ./crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'grid', privgrp 'dba'..
Operation successful.
CRS-4664: Node yunsuan2 successfully pinned.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
yunsuan2 2015/01/27 16:43:03 /opt/app/11.2.0/grid/cdata/yunsuan2/backup_20150127_164303.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server
HAS was reconfigured successfully, and the cluster can now be started.
-bash-4.1# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.cssd       ora.cssd.type  OFFLINE   OFFLINE
ora.diskmon    ora....on.type OFFLINE   OFFLINE