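Oracle 11gR2 adds a command that checks a disk group while it is mounted normally, which makes maintaining ASM-based databases considerably easier.
It is particularly useful in the following situation:
1: Routine host maintenance that requires restarting the server and the database. Without a check beforehand, the ASM instance may fail to mount a disk group after the restart, causing a serious incident.
A damaged disk header is one example. If we run the check before restarting, we can find the problem, raise an alert in time, and take a data backup.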
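As a minimal sketch of how such a pre-restart check could be scripted, the helper below only builds the SQL text for a list of disk groups. The function name gen_check_sql is hypothetical and not part of any Oracle tool; the statement it emits is the same one used in the test in this article.

```shell
#!/usr/bin/env bash
# Sketch only: print one "alter diskgroup ... check;" statement per
# disk group name passed as an argument.
# gen_check_sql is a hypothetical helper name, not an Oracle utility.
gen_check_sql() {
    local dg
    for dg in "$@"; do
        printf 'alter diskgroup %s check;\n' "$dg"
    done
}
```

For example, `gen_check_sql dgtest` prints `alter diskgroup dgtest check;`. The output would then be fed to a SQL*Plus session connected to the ASM instance; how that session is opened depends on the environment and is left out here.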
The test is as follows.
Simulate a damaged disk header by zeroing the first 4 KB of the disk:
[grid@12cdb1 ~]$ dd if=/dev/zero of=/dev/sde1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000932074 s, 4.4 MB/s
[grid@12cdb1 ~]$
[grid@12cdb1 ~]$
[grid@12cdb1 ~]$ kfed read /dev/sde1 | more
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
000000000 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
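The kfed output above can also be inspected by a script instead of by eye. The following is a rough sketch that reads `kfed read` output on standard input and extracts the symbolic block type from the kfbh.type line. The helper names kfed_block_type and header_ok are hypothetical, and the sketch assumes that a healthy disk header block reports KFBTYP_DISKHEAD.

```shell
#!/usr/bin/env bash
# Sketch only: parse `kfed read` output supplied on stdin.
# kfed_block_type and header_ok are hypothetical helper names.

# Print the last field of the kfbh.type line, e.g. KFBTYP_INVALID.
kfed_block_type() {
    awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# Succeed only if the block type is KFBTYP_DISKHEAD
# (assumed here to be the type of an intact disk header).
header_ok() {
    [ "$(kfed_block_type)" = "KFBTYP_DISKHEAD" ]
}
```

A possible use would be `kfed read /dev/sde1 | header_ok || echo "disk header looks damaged"`, run as a user that is allowed to read the device.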
Run the check in the ASM instance:
SQL> alter diskgroup dgtest check;
Diskgroup altered.
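Now look at the ASM instance's alert log. Although the statement returned "Diskgroup altered.", the log tells a different story:
The "Could not read the header of disk" error shows that the disk header cannot be read. At this point we need to step in and deal with it.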
SQL> alter diskgroup dgtest check
Tue Jul 22 17:54:47 2014
ERROR: Could not read the header of disk DGTEST_0000 (0).
NOTE: process _user12084_+asm (12084) initiating offline of disk 0.3914034847 (DGTEST_0000) with mask 0x7e in group 2 (DGTEST) without client assisting
NOTE: initiating PST update: grp 2 (DGTEST), dsk = 0/0xe94b6e9f, mask = 0x6a, op = clear
Tue Jul 22 17:54:47 2014
GMON updating disk modes for group 2 at 24 for pid 23, osid 12084
ERROR: disk 0(DGTEST_0000) in group 2(DGTEST) cannot be offlined because the disk group has external redundancy.
Tue Jul 22 17:54:47 2014
ERROR: too many offline disks in PST (grp 2)
Tue Jul 22 17:54:47 2014
ERROR: Failed to offline disk DGTEST_0000 (0).
NOTE: starting check of diskgroup DGTEST
Tue Jul 22 17:54:48 2014
ASM Health Checker found 1 new failures
Tue Jul 22 17:54:48 2014
ASM Health Checker found 1 new failures
Tue Jul 22 17:54:48 2014
GMON checking disk 0 for group 2 at 25 for pid 23, osid 12084
After repairing the disk header, we run the check again.
This time no errors are reported:
SQL> alter diskgroup dgtest check
Tue Jul 22 18:07:50 2014
NOTE: starting check of diskgroup DGTEST
Tue Jul 22 18:07:51 2014
GMON checking disk 0 for group 2 at 26 for pid 23, osid 12084
Tue Jul 22 18:07:53 2014
SUCCESS: check of diskgroup DGTEST found no errors
Tue Jul 22 18:07:53 2014
SUCCESS: alter diskgroup dgtest check
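Because the statement reports "Diskgroup altered." in both runs, the outcome has to be read from the alert log. As a rough sketch, the helper below summarizes a log fragment supplied on standard input, relying only on the line prefixes visible in the two excerpts above. The name check_result and its output strings are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: summarize an ASM alert log fragment read from stdin.
# check_result is a hypothetical helper name.
check_result() {
    awk '
        /^ERROR:/                       { errors++ }  # count ERROR lines
        /^SUCCESS: check of diskgroup/  { ok = 1 }    # check finished cleanly
        END {
            if (errors > 0)  print "FAILED " errors
            else if (ok)     print "OK"
            else             print "UNKNOWN"
        }'
}
```

The fragment passed in should cover only the lines written by the check being evaluated, since older ERROR lines elsewhere in the log would otherwise be counted too.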