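Now log in as the grid user and check the state of the ASM disk groups:
Connect with sqlplus / as sysasm and query v$asm_diskgroup directly.
The OCR_VOTE1 disk group turns out not to be mounted on either ASM instance: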
SQL> select instance_name from v$instance;

INSTANCE_NAME
------------------------------------------------
+ASM2

SQL> select name, state, total_mb, free_mb from v$asm_diskgroup;

NAME                           STATE                               TOTAL_MB    FREE_MB
------------------------------ --------------------------------- ---------- ----------
DATA                           MOUNTED                               737280      88152
FRA_ARCHIVE                    MOUNTED                                10240       9287
OCR_VOTE1                      DISMOUNTED                                 0          0

The other node shows the same result.
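The same check can be scripted so it is easy to repeat on each node. The following is only a minimal sketch: the function names `unmounted_diskgroups` and `check_asm_diskgroups` are hypothetical and not part of the original procedure, and the wrapper assumes it runs as the grid user with `sqlplus` on the PATH and the environment pointing at the local ASM instance.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME STATE" rows on stdin and print the
# names of the disk groups whose state is anything other than MOUNTED.
unmounted_diskgroups() {
  awk 'NF >= 2 && $2 != "MOUNTED" { print $1 }'
}

# Hypothetical wrapper: query the local ASM instance and filter the rows.
# Assumes the grid user's environment (ORACLE_HOME, ORACLE_SID) is set.
# Headings and feedback are switched off so only data rows reach awk.
check_asm_diskgroups() {
  sqlplus -s / as sysasm <<'EOF' | unmounted_diskgroups
set heading off feedback off pagesize 0
select name, state from v$asm_diskgroup;
EOF
}
```

Calling `check_asm_diskgroups` on each node would be expected to list OCR_VOTE1 in the situation described here. The sketch does no error handling, so an ORA- error line from sqlplus would also pass through the filter.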
Mount state of the OCR-related disk group on the node:

SQL> select name, state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ---------------------------------
DATA                           MOUNTED
FRA_ARCHIVE                    MOUNTED
OCR_VOTE1                      DISMOUNTED

Next, check the core GI background processes:
# The CRS daemon is not running; the query returns no output
root@bjdb1:/>ps -ef|grep crsd.bin|grep -v grep

Node 2 gives the same result: the CRS daemon is not running there either.
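The process check above looks at crsd.bin only. As a rough sketch, the same idea can be extended to several GI daemons at once. The function name `gi_daemon_status` is hypothetical, and the daemon list (ohasd.bin, ocssd.bin, crsd.bin, evmd.bin) is an assumption about which processes are worth checking, not something taken from the original steps.

```shell
#!/bin/sh
# Hypothetical helper: read a process listing on stdin (for example the
# output of "ps -ef") and report, per daemon, whether its name appears.
gi_daemon_status() {
  listing=$(cat)
  # Assumed set of core GI daemons; adjust for the environment at hand.
  for d in ohasd.bin ocssd.bin crsd.bin evmd.bin; do
    if printf '%s\n' "$listing" | grep -F -q -- "$d"; then
      echo "$d running"
    else
      echo "$d NOT running"
    fi
  done
}

# Typical use on a cluster node:
#   ps -ef | gi_daemon_status
```

Because the function filters a captured listing instead of piping `ps` into `grep` directly, no `grep -v grep` step is needed.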
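Summary of the current state: the failure occurred at around 18:04 on October 3. All nodes lost access to the shared storage, which in turn caused OCR initialization to fail. The CRS process is currently not running.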
3. Resolving the problem
3.1 Try to mount the OCR disk group manually

SQL> alter diskgroup ocr_vote1 mount;

Diskgroup altered.

SQL> select name, state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ---------------------------------
DATA                           MOUNTED
FRA_ARCHIVE                    MOUNTED
OCR_VOTE1                      MOUNTED

3.2 Start CRS on node 1
At this point the CRS process is still not running.
Attempting a normal CRS start on node 1 fails:

root@bjdb1:/>crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

Attempting a normal CRS stop on node 1 also fails:

root@bjdb1:/>crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
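When the start and stop attempts are repeated across nodes, it can help to reduce the crsctl output to its message codes. The sketch below is illustrative only: `crs_codes` and `has_crs_code` are hypothetical helper names, and the code only parses text that has already been captured; it does not call crsctl itself.

```shell
#!/bin/sh
# Hypothetical helper: extract the distinct CRS-nnnn message codes from
# crsctl output read on stdin, one per line, sorted.
crs_codes() {
  awk '{
    line = $0
    while (match(line, /CRS-[0-9]+/)) {
      print substr(line, RSTART, RLENGTH)
      line = substr(line, RSTART + RLENGTH)
    }
  }' | sort -u
}

# Hypothetical helper: succeed if the given code occurs in the captured text.
# Usage: has_crs_code CRS-4640 "$captured_output"
has_crs_code() {
  printf '%s\n' "$2" | grep -F -q -- "$1"
}

# Typical use on a cluster node, run as root:
#   out=$(crsctl start crs 2>&1)
#   if has_crs_code CRS-4640 "$out"; then
#     echo "OHAS reports it is already active; the start was refused"
#   fi
```

Applied to the two transcripts above, this would single out CRS-4640 for the failed start and CRS-2796 for the failed stop, which are the messages that describe why each command was refused.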