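Note that the trace directory may contain many arb0 trace files, so we need the OS process ID of the arb0 process that is actually doing the rebalance work. This information is written to the alert log when the ASM instance performs the rebalance operation. We can also use the operating system command pstack to trace the ARB0 process and see exactly what it is doing. As shown below, ASM is relocating extents (the key functions in the stack are kfgbRebalExecute - kfdaExecute - kffRelocate):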
[root@jyrac1 ~]# pstack 25416
#0 0x0000003aa88005f4 in ?? () from /usr/lib64/libaio.so.1
#1 0x0000000002bb9b11 in skgfrliopo ()
#2 0x0000000002bb9909 in skgfospo ()
#3 0x00000000086c595f in skgfrwat ()
#4 0x00000000085a4f79 in ksfdwtio ()
#5 0x000000000220b2a3 in ksfdwat_internal ()
#6 0x0000000003ee7f33 in kfk_reap_ufs_async_io ()
#7 0x0000000003ee7e7b in kfk_reap_ios_from_subsys ()
#8 0x0000000000aea0ac in kfk_reap_ios ()
#9 0x0000000003ee749e in kfk_io1 ()
#10 0x0000000003ee7044 in kfkRequest ()
#11 0x0000000003eed84a in kfk_transitIO ()
#12 0x0000000003e40e7a in kffRelocateWait ()
#13 0x0000000003e67d12 in kffRelocate ()
#14 0x0000000003ddd3fb in kfdaExecute ()
#15 0x0000000003ec075b in kfgbRebalExecute ()
#16 0x0000000003ead530 in kfgbDriver ()
#17 0x00000000021b37df in ksbabs ()
#18 0x0000000003ec4768 in kfgbRun ()
#19 0x00000000021b8553 in ksbrdp ()
#20 0x00000000023deff7 in opirip ()
#21 0x00000000016898bd in opidrv ()
#22 0x0000000001c6357f in sou2o ()
#23 0x00000000008523ca in opimai_real ()
#24 0x0000000001c6989d in ssthrdmain ()
#25 0x00000000008522c1 in main ()
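The two steps above (finding the OS process ID of ARB0 in the alert log, then looking for the rebalance frames in its stack) can be sketched as a pair of small shell helpers. This is only a sketch: it assumes the alert log records the process start as a line of the form "ARB0 started with pid=NN, OS id=NNNNN", and the function names arb0_os_pid and rebal_frames as well as the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print the OS process id of the most recently started ARB0 process.
# Reads ASM alert log text from stdin.
# ASSUMPTION: the alert log contains lines such as
#   "ARB0 started with pid=23, OS id=25416"
arb0_os_pid() {
    sed -n 's/.*ARB0 started with pid=[0-9]*, OS id=\([0-9]*\).*/\1/p' | tail -n 1
}

# Keep only the rebalance-related frames from pstack output on stdin.
# The function names are the ones highlighted in the stack above.
rebal_frames() {
    grep -E 'kfgbRebalExecute|kfdaExecute|kffRelocate'
}

# Usage (hypothetical path; adjust to your own diagnostic destination):
#   pid=$(arb0_os_pid < /path/to/alert_+ASM1.log)
#   pstack "$pid" | rebal_frames
```

Taking the last match rather than the first is deliberate: the alert log accumulates entries from earlier rebalance operations, and only the most recent ARB0 start is likely to correspond to a process that is still running.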
Compacting
In the following example we look at the compacting phase of the rebalance. I add the disk that was dropped above back to the disk group and set the rebalance power to 2:
17:26:48 SQL> alter diskgroup testdg add disk '/dev/raw/raw7' rebalance power 2;
Diskgroup altered.
ASM estimates that the rebalance will take 6 minutes:
16:07:13 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;
   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         1 REBAL RUN          10        489      53851       7920           6
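As a rough sanity check, the 6-minute figure is consistent with dividing the remaining work by the rate. The sketch below assumes that EST_WORK is the total number of allocation units to move, SOFAR is the number already moved, and EST_RATE is allocation units per minute. The helper name est_minutes is hypothetical, and the integer division is only an approximation of however ASM rounds the value internally.

```shell
#!/bin/sh
# Approximate EST_MINUTES as (EST_WORK - SOFAR) / EST_RATE.
# ASSUMPTION: this mirrors the column meanings described above;
# it is not taken from ASM internals.
# Arguments: est_work sofar est_rate
est_minutes() {
    echo $(( ($1 - $2) / $3 ))
}

# Values from the GV$ASM_OPERATION output above.
est_minutes 53851 489 7920
```

With these values the remaining work is 53362 allocation units, which at 7920 per minute comes to a little under 7 minutes, in line with the value shown.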
About 10 seconds later, EST_MINUTES drops to 0:
16:07:23 SQL> /