A Misconfigured DG Broker Prevents a RAC Instance from Mounting

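Today I ran into a failure in my own lab. It looked simple at first, but it turned out to be quite interesting, and it is easy to overlook if you are not careful. I expect customers will hit the same thing in production.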

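Environment: Oracle 11.2.0.4 RAC (2-node primary + 2-node standby)
Background: This lab environment had no problems when it was first built. Later I ran various tests on it, one of which added new storage directories on the primary, so the standby's db_file_name_convert parameter had to be updated with the corresponding mappings.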

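I did not think much of changing one parameter, and the database restart at that time succeeded. But after the standby database was restarted again later, one of its two nodes started normally while the other would not come up, failing with the following errors: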

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  534462464 bytes
Fixed Size                  2254952 bytes
Variable Size             444598168 bytes
Database Buffers           83886080 bytes
Redo Buffers                3723264 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01677: standby file name convert parameters differ from other instance

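The error message is unambiguous: the convert parameter settings differ from those of the other instance. A check confirmed this.
The convert parameters on the instance that failed to start: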

SQL> show parameter convert

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string      +data1/jyzhao, +data/mynas
log_file_name_convert                string      +data1/jyzhao, +data/mynas, +f
                                                 ra1/jyzhao, +fra/mynas

The convert parameters on the instance that started normally:

SQL> show parameter convert

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string      +fra1/jyzhao, +fra/mynas, +dat
                                                 a1/jyzhao, +data/mynas
log_file_name_convert                string      +data1/jyzhao, +data/mynas, +f
                                                 ra1/jyzhao, +fra/mynas

Clearly, the two instances have different values for db_file_name_convert.

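My first reaction was to check whether the parameter files on the two sides were consistent.
They were: the contents are identical, and the db_file_name_convert value stored in the parameter file is: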

*.db_file_name_convert='+data1/jyzhao','+data/mynas'
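
One way to double-check what the spfile actually holds, without dumping it to a text file, is to query V$SPPARAMETER. The following is only a minimal sketch, assuming the instance was started from an spfile; the view should be readable even on an instance that is only in the STARTED (nomount) state, because it reads the spfile rather than the control file.

```sql
-- Sketch: list the convert parameters as recorded in the spfile.
-- A multi-valued parameter appears as one row per value, ordered by ORDINAL.
select sid, name, ordinal, value
  from v$spparameter
 where name in ('db_file_name_convert', 'log_file_name_convert')
   and isspecified = 'TRUE'
 order by name, sid, ordinal;
```

Comparing this result with the output of show parameter on each instance shows whether a running instance is still using a value that the spfile no longer contains.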

Some additional background: the standby RAC originally had *.db_file_name_convert='+data1/jyzhao','+data/mynas'
I later changed the parameter on the standby as follows:

SQL> alter system set db_file_name_convert = '+fra1/jyzhao', '+fra/mynas', '+data1/jyzhao', '+data/mynas' scope=spfile;

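Yet it now looked as though the change had never taken effect. Had I overlooked something when I made it last time?
I set the parameter to the correct value again and retested: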

SQL> alter system set db_file_name_convert = '+fra1/jyzhao', '+fra/mynas', '+data1/jyzhao', '+data/mynas' scope=spfile;

System altered.

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  534462464 bytes
Fixed Size                  2254952 bytes
Variable Size             444598168 bytes
Database Buffers           83886080 bytes
Redo Buffers                3723264 bytes
Database mounted.
Database opened.
SQL>
SQL> show parameter convert

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string      +fra1/jyzhao, +fra/mynas, +dat
                                                 a1/jyzhao, +data/mynas
log_file_name_convert                string      +data1/jyzhao, +data/mynas, +f
                                                 ra1/jyzhao, +fra/mynas

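As shown, after the second change this instance started normally and the parameter displayed the correct value.
So what happened last time? I shut the instance down and restarted it once more, and the problem reappeared: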

SQL>
SQL>
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  534462464 bytes
Fixed Size                  2254952 bytes
Variable Size             444598168 bytes
Database Buffers           83886080 bytes
Redo Buffers                3723264 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01677: standby file name convert parameters differ from other instance
SQL> show parameter convert

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string      +data1/jyzhao, +data/mynas
log_file_name_convert                string      +data1/jyzhao, +data/mynas, +f
                                                 ra1/jyzhao, +fra/mynas
SQL> select status from v$instance;

STATUS
------------
STARTED

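I finally found a clue in the database alert log: after the manual parameter change and a successful startup, the log shows the parameter being automatically set back to its original value.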

Wed Jan 03 05:47:06 2018
Starting Data Guard Broker (DMON)
Wed Jan 03 05:47:06 2018
INSV started with pid=41, OS id=26314
Physical standby database opened for read only access.
Completed: ALTER DATABASE OPEN
Wed Jan 03 05:47:10 2018
NSV0 started with pid=42, OS id=26321
Wed Jan 03 05:47:14 2018
RSM0 started with pid=43, OS id=26331
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='jyzhao1';
ALTER SYSTEM SET log_archive_format='%t_%s_%r.dbf' SCOPE=SPFILE SID='jyzhao1';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='+data1/jyzhao','+data/mynas' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='+data1/jyzhao','+data/mynas','+fra1/jyzhao','+fra/mynas' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='jyzhao' SCOPE=BOTH;
Wed Jan 03 05:47:53 2018
Decreasing number of real time LMS from 1 to 0

At this point the answer is clear.
Readers may want to think it through themselves: what does this alert log excerpt tell us?
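
One plausible reading of the excerpt: the ALTER SYSTEM statements that follow "Starting Data Guard Broker (DMON)" are issued by the Broker, which writes its own stored value of db_file_name_convert back to the spfile and so overwrites the manual change. Under that reading, the parameter has to be changed through the Broker rather than with ALTER SYSTEM. The DGMGRL session below is only a sketch. The database name 'mynas' is an assumption inferred from the ASM paths, so take the real name from the SHOW CONFIGURATION output. Lines starting with "--" are annotations for the reader, not commands to type.

```
-- Find the standby's name as the Broker knows it ('mynas' below is assumed).
DGMGRL> show configuration;

-- Inspect the value the Broker currently stores for the standby.
DGMGRL> show database 'mynas' 'DbFileNameConvert';

-- Update the Broker property so that it matches the intended mapping.
DGMGRL> edit database 'mynas' set property 'DbFileNameConvert'='+fra1/jyzhao, +fra/mynas, +data1/jyzhao, +data/mynas';

-- Confirm the stored value.
DGMGRL> show database 'mynas' 'DbFileNameConvert';
```

Because db_file_name_convert is a static parameter, the new value should only take effect on each standby instance after that instance is restarted.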
