How to fix a hung explorer run on a Solaris server

While running explorer as part of a health check on production machines, I found that it hung on several servers. The output stopped here:
Nov 13 17:56:10 localhost[10419] sunray: RUNNING
Nov 13 17:56:10 localhost[10419] sunray: exited: SUNWuto not installed
Nov 13 17:56:10 localhost[10419] t3: RUNNING
Nov 13 17:56:10 localhost[10419] t3: exited: T3 and T4 not installed
Nov 13 17:56:10 localhost[10419] t3extended: RUNNING
Nov 13 17:56:10 localhost[10419] tape: RUNNING
Nov 13 17:56:11 localhost[10419] Tx000: RUNNING
Nov 13 17:58:06 localhost[10419] u4ft: RUNNING
Nov 13 17:58:06 localhost[10419] u4ft: exited: Not an FT1800 system
Nov 13 17:58:06 localhost[10419] var: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: exited: StorTools Diagnostics not installed
Nov 13 17:58:11 localhost[10419] vxfs: RUNNING
Then open a new terminal, log in to the machine, and use ps -ef | grep to find the processes that belong to the hung module:
root@localhost:~ # ps -ef | grep vxfs
root 5544 1 0 Sep 23 ? 0:00 /opt/VRTSvxfs/sbin/vxfsckd
root 4528 4460 0 18:11:22 pts/3 0:00 grep vxfs
root 28036 10419 0 17:58:12 pts/1 0:00 ksh -p /opt/SUNWexplo/tools/vxfs
root 2268 28036 1 18:06:36 pts/1 0:46 /usr/lib/fs/vxfs/fsadm -ED /search
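The same lookup can be done by following parent PIDs instead of grepping for a module name, which also catches children such as fsadm that do not carry the module name in a predictable way. The following is only a minimal sketch: the function name descendants and the PS_OUTPUT sample are made up for illustration, the sample is the ps output shown above, and 10419 is the explorer PID taken from the log.

```python
from collections import deque

def descendants(ps_text, root_pid):
    """Return (pid, ppid, rest_of_line) for every process below root_pid.

    Expects `ps -ef` style lines: UID PID PPID followed by the other columns.
    Lines whose second and third fields are not numeric (header, shell
    prompt) are skipped.
    """
    children = {}
    for line in ps_text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        pid, ppid = int(parts[1]), int(parts[2])
        children.setdefault(ppid, []).append((pid, ppid, parts[3]))

    found, seen, queue = [], {root_pid}, deque([root_pid])
    while queue:  # breadth-first walk down the process tree
        parent = queue.popleft()
        for proc in children.get(parent, []):
            if proc[0] not in seen:
                seen.add(proc[0])
                found.append(proc)
                queue.append(proc[0])
    return found

# Sample copied from the ps output above.
PS_OUTPUT = """\
root 5544 1 0 Sep 23 ? 0:00 /opt/VRTSvxfs/sbin/vxfsckd
root 4528 4460 0 18:11:22 pts/3 0:00 grep vxfs
root 28036 10419 0 17:58:12 pts/1 0:00 ksh -p /opt/SUNWexplo/tools/vxfs
root 2268 28036 1 18:06:36 pts/1 0:46 /usr/lib/fs/vxfs/fsadm -ED /search
"""

for pid, ppid, rest in descendants(PS_OUTPUT, 10419):
    print(pid, ppid, rest)
```

On this sample the walk finds the ksh wrapper for the vxfs module (PID 28036) and the fsadm process it started (PID 2268). On a live system the text would come from running ps -ef and capturing its output.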
After killing the vxfs module process (PID 28036), explorer ran to completion:
Nov 13 17:58:06 localhost[10419] var: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: exited: StorTools Diagnostics not installed
Nov 13 17:58:11 localhost[10419] vxfs: RUNNING
Nov 13 18:11:46 localhost[10419] vxvm: RUNNING
Nov 13 18:11:59 localhost[10419] xscfextended: RUNNING
Nov 13 18:11:59 localhost[10419] ilomsnapshot_finish: RUNNING
Nov 13 18:11:59 localhost[10419] explorer: data collection complete
Nov 13 18:12:22 localhost[10419] explorer: = = = stderr output from explorer = = =
Nov 13 18:12:22 localhost[10419] explorer: /opt/SUNWexplo/bin/explorer[361]: 28036 Killed
Nov 13 18:12:22 localhost[10419] explorer: removing previous explorers from /opt/SUNWexplo/output
Nov 13 18:12:22 localhost[10419] explorer: Explorer finished
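The hang occurred while collecting Veritas data: the stuck command was fsadm -ED against /search, so the problem is probably in that VxFS file system or the Veritas disk group under it. explorer could not collect the data and simply stopped there, which is in itself a useful way to locate a problem.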
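When the log is long, the module that stalled can be picked out by measuring how long each RUNNING entry waited before the next log line appeared. This is a rough sketch under a few assumptions: the names module_gaps and LOG are invented here, the log lines carry no year so a fixed one is supplied only to make the timestamps parseable, and the log is assumed not to cross a year boundary.

```python
import re
from datetime import datetime

# "Nov 13 17:58:11 localhost[10419] vxfs: RUNNING" -> timestamp, message
LINE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+(.*)$")
SUFFIX = ": RUNNING"

def module_gaps(log_text, year=2000):
    """Return [(module, seconds)] sorted by the longest wait first.

    The wait is the time between a module's RUNNING line and the next
    log line of any kind. `year` is an assumption, not read from the log.
    """
    entries = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        entries.append((when, m.group(2)))

    gaps = []
    for (when, msg), (next_when, _) in zip(entries, entries[1:]):
        if msg.endswith(SUFFIX):
            module = msg[:-len(SUFFIX)]
            gaps.append((module, int((next_when - when).total_seconds())))
    return sorted(gaps, key=lambda g: g[1], reverse=True)

# Excerpt copied from the explorer log above.
LOG = """\
Nov 13 17:58:06 localhost[10419] var: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: RUNNING
Nov 13 17:58:11 localhost[10419] vtsst: exited: StorTools Diagnostics not installed
Nov 13 17:58:11 localhost[10419] vxfs: RUNNING
Nov 13 18:11:46 localhost[10419] vxvm: RUNNING
Nov 13 18:11:59 localhost[10419] xscfextended: RUNNING
"""

for module, seconds in module_gaps(LOG):
    print(f"{module}: {seconds}s")
```

For this excerpt vxfs stands out at 815 seconds (17:58:11 to 18:11:46), while every other module moved on within 13 seconds. A module that is still hung has no following line at all, so on a live run the last RUNNING entry in the log is the one to check.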
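Solaris is quite mature in this respect: explorer collects the hardware and software data needed to pin down a server problem. As far as I know, Linux has no comparable collection tool, only the server check tools that individual hardware vendors develop for their own machines.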