
Why an 11g standby could not enable ADG: a root-cause analysis (r7 notes, day 62)

Today I ran into a slightly odd problem, but odd symptoms always have a real cause behind them.

While checking an environment this afternoon, I found a standby database sitting in MOUNT state. This is an 11gR2 database, so not using ADG (Active Data Guard) is a real waste, and it should not have been left that way.

So I tried opening the database, and the state stayed at READ ONLY, whereas with ADG it should be READ ONLY WITH APPLY.

I then used DG Broker to set the state to READ-ONLY. The Data Guard Broker log on the standby showed:

    Standby Database: stestdb3, Enabled Physical Standby (0x02010000)
    08/14/2014 16:03:28
    version check on database stestdb3 detected stale metadata, requesting update from primary database
    Creating process RSM0
    12/29/2015 16:28:11
    Command EDIT DATABASE stestdb3 SET STATE = READ-ONLY completed
    Read-Only state no longer supported
    12/29/2015 16:29:10

Nothing here pointed clearly at a cause. Checking the configuration in DG Broker reported the following error:

    DGMGRL> show configuration;
    Configuration - testdb
      Protection Mode: MaxPerformance
      Databases:
        testdbbak93 - Primary database
        stestdb3    - Physical standby database
          Error: ORA-16766: Redo Apply is stopped
    Fast-Start Failover: DISABLED
    Configuration Status:
    ERROR

The standby's alert log around that time showed:

    Data Guard Broker initializing...
    Data Guard Broker initialization complete
    Tue Dec 29 16:47:15 2015
    SMON: enabling cache recovery
    No Resource Manager plan active
    Physical standby database opened for read only access.
    Completed: alter database open
    Tue Dec 29 16:47:16 2015
    idle dispatcher 'D000' terminated, pid = (18, 1)
    Tue Dec 29 16:51:40 2015
    Primary database is in MAXIMUM PERFORMANCE mode
    RFS[3]: Assigned to RFS process 3596
    RFS[3]: Selected log 7 for thread 1 sequence 72606 dbid -1549369665 branch 746558785
    Tue Dec 29 16:51:41 2015
    RFS[4]: Assigned to RFS process 3590
    RFS[4]: Selected log 8 for thread 1 sequence 72605 dbid -1549369665 branch 746558785
    Tue Dec 29 16:51:42 2015
    Archived Log entry 69432 added for thread 1 sequence 72605 ID 0xa829ec3b dest 2:

The log makes it clear that MRP had indeed not started; only RFS was receiving archived logs.

I then used DG Broker to set the standby to the ONLINE state and checked the configuration again. This time the check came back clean:

    DGMGRL> show configuration;
    Configuration - testdb
      Protection Mode: MaxPerformance
      Databases:
        testdbbak93 - Primary database
        stestdb3    - Physical standby database
    Fast-Start Failover: DISABLED
    Configuration Status:
    SUCCESS
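
The observation that the alert log contains RFS lines but no MRP lines can also be checked mechanically. The following is a minimal sketch, not part of the original troubleshooting: the function name `dataguard_processes` is hypothetical, and the patterns only assume that RFS lines start with `RFS[n]:` and MRP lines start with `MRP0`, as in the excerpts quoted in this post.

```python
import re


def dataguard_processes(alert_log: str) -> dict:
    """Report whether RFS and MRP lines appear in an alert log excerpt.

    Hypothetical helper: it only pattern-matches the line prefixes seen in
    this post's excerpts and does not query the database.
    """
    lines = alert_log.splitlines()
    return {
        "rfs": any(re.match(r"\s*RFS\[\d+\]:", line) for line in lines),
        "mrp": any(re.match(r"\s*MRP\d+", line) for line in lines),
    }


# A few lines copied from the alert log excerpt quoted earlier.
excerpt = """\
Physical standby database opened for read only access.
Completed: alter database open
RFS[3]: Assigned to RFS process 3596
RFS[3]: Selected log 7 for thread 1 sequence 72606 dbid -1549369665 branch 746558785
"""

status = dataguard_processes(excerpt)
print(status)
```

An excerpt with `rfs` true and `mrp` false matches the situation described here: redo is arriving but nothing is applying it. On a live system the same question is usually answered by querying the standby itself rather than by scanning logs.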

Overall, this did not behave like an 11g database.

I tried again, opening the database manually, and the standby was still READ ONLY. The problem persisted after a restart.

For a problem like this the best approach is still to read the logs. The standby had last been restarted a year earlier, and fortunately the alert log from that time was still there. The startup back then showed no other errors.

What caught my eye was the compatible parameter, because its value stands out in an 11g database, and it made me curious.

A search on MOS with that question in mind turned up several related notes. It looked like another legacy issue, and there was a matching bug description:

    ACTIVE DATAGUARD (ADG) NOT POSSIBLE WITH COMPATIBLE < 11.1.0.0.0 (Doc ID 1363396.1)
    BUG:13032521 - ADG PHYSICAL STANDBY GOES TO MOUNT STATE INSTEAD OF READ ONLY WITH APPLY

With the problem more or less located, I checked the parameter on both the primary and the standby. Both were at 10.2.0.5.0:

    SQL> show parameter compa
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    compatible                           string      10.2.0.5.0

The workaround in the bug description is to set compatible on the standby to 11.1.0.7 or higher. Changing this parameter requires an instance restart, so the impact is not trivial, and the primary could not be restarted at that point.

    SQL> alter system set compatible='11.2.0.3.0';
    alter system set compatible='11.2.0.3.0'
    *
    ERROR at line 1:
    ORA-02095: specified initialization parameter cannot be modified

So I changed it on the standby first to see whether that alone would work.

    SQL> alter system set compatible='11.2.0.3.0' scope=spfile;
    System altered.

On restart, the standby's alert log contained this output:

    Tue Dec 29 17:25:26 2015
    Spfile /U01/app/oracle/product/11.2.3/db_1/dbs/spfiletestdb.ora is in old pre-11 format and compatible >= 11.0.0; converting to new H.A.R.D. compliant format.
    Completed: alter database mount

But after setting the standby to ONLINE again, the open mode was still plain READ ONLY rather than READ ONLY WITH APPLY:

    SQL> select open_mode from v$database;
    OPEN_MODE
    --------------------
    READ ONLY

It seemed that changing the standby alone was not enough and that the primary would have to be changed to match.

The alert log also contained the following, showing that MRP failed to start:

    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
    Attempt to start background Managed Standby Recovery process (testdb)
    Tue Dec 29 17:57:03 2015
    MRP0 started with pid=29, OS id=17740
    MRP0: Background Managed Standby Recovery process started (testdb)
    started logmerger process
    Tue Dec 29 17:57:08 2015
    Managed Standby Recovery starting Real Time Apply
    Parallel Media Recovery started with 16 slaves
    Waiting for all non-current ORLs to be archived...
    All non-current ORLs have been archived.
    Media Recovery Log /U01/app/oracle/fra/StestDB3/archivelog/2015_12_29/o1_mf_1_72606_c84n0xml_.arc
    Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
    Errors with log /U01/app/oracle/fra/StestDB3/archivelog/2015_12_29/o1_mf_1_72606_c84n0xml_.arc
    MRP0: Background Media Recovery terminated with error 38800
    Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_17745.trc:
    ORA-38800: Cannot start Redo Apply on the open physical standby database
    Managed Standby Recovery not using Real Time Apply
    Recovery interrupted!
    MRP0: Background Media Recovery process shutdown (testdb)

Clearly this parameter change has a significant effect. I decided to get the standby back to a normal state first and revisit the issue once a primary restart had been arranged, so I began restoring the original setting by putting compatible back to 10.2.0.5.0. The restart then started failing:

    SQL> alter database mount;
    alter database mount
    *
    ERROR at line 1:
    ORA-00201: control file version 11.2.0.3.0 incompatible with ORACLE version 10.2.0.5.0
    ORA-00202: control file: '/U01/app/oracle/oradata/testdb/control01.ctl'

This still looked recoverable: generate a standby control file on the primary, copy it over, and the mount succeeds.

Primary:

    SQL> alter database create standby controlfile as '/tmp/std1.ctl';
    Database altered.

Standby:

    SQL> alter database mount standby database;
    Database altered.
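
The rule in the MOS note title quoted above (no ADG with COMPATIBLE < 11.1.0.0.0) amounts to a version comparison. As a rough sketch only: the helper names `parse_version` and `adg_possible` are made up for illustration, and the threshold is taken from the note title as quoted in this post (the bug's workaround text mentions 11.1.0.7 or higher, so treat the exact cut-off as an assumption to verify against the note).

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '10.2.0.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Threshold from the MOS note title quoted in this post (an assumption here,
# not something checked against a database).
ADG_MIN_COMPATIBLE = parse_version("11.1.0.0.0")


def adg_possible(compatible: str) -> bool:
    """Return True if the compatible value meets the assumed ADG threshold."""
    return parse_version(compatible) >= ADG_MIN_COMPATIBLE


# The value found on both primary and standby, and the value tried later.
for value in ("10.2.0.5.0", "11.2.0.3.0"):
    print(value, adg_possible(value))
```

Comparing tuples of integers avoids the trap of comparing version strings lexically, where "10.2" would sort after "11.1" only by accident of the first character. As the rest of this post shows, meeting the threshold on the standby alone was not sufficient in practice.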

But the standby's alert log now showed that things had become more complicated. The datafile headers had already been modified and were no longer in sync:

    ALTER DATABASE RECOVER managed standby database disconnect from session
    Attempt to start background Managed Standby Recovery process (testdb)
    Tue Dec 29 18:28:13 2015
    MRP0 started with pid=30, OS id=24283
    MRP0: Background Managed Standby Recovery process started (testdb)
    started logmerger process
    Tue Dec 29 18:28:18 2015
    Managed Standby Recovery not using Real Time Apply
    Read of datafile '/U01/app/oracle/oradata/testdb/system01.dbf' (fno 1) header failed with ORA-01130
    Rereading datafile 1 header failed with ORA-01130
    MRP0: Background Media Recovery terminated with error 1110
    Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_24288.trc:
    ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
    ORA-01122: database file 1 failed verification check
    ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
    ORA-01130: database file version 11.2.0.3.0 incompatible with ORACLE version 10.2.0.5.0
    Slave exiting with ORA-1110 exception
    Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_24288.trc:
    ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
    ORA-01122: database file 1 failed verification check
    ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
    ORA-01130: database file version 11.2.0.3.0 incompatible with ORACLE version 10.2.0.5.0
    Recovery Slave PR00 previously exited with exception 1110
    MRP0: Background Media Recovery process shutdown (testdb)
    Completed: ALTER DATABASE RECOVER managed standby database disconnect from session

The corresponding trace file:

    *** 2015-12-29 18:28:18.495 4320 krsh.c
    Managed Standby Recovery not using Real Time Apply
    Read of datafile '/U01/app/oracle/oradata/testdb/system01.dbf' (fno 1) header failed with ORA-01130
    Rereading datafile 1 header failed with ORA-01130
    V10 STYLE FILE HEADER:
    Compatibility Vsn = 186647296=0xb200300
    Db ID=2745597631=0xa3a67ebf, Db Name='testDB'
    Activation ID=0=0x0
    Control Seq=1=0x1, File size=147200=0x23f00
    File Number=1, Blksiz=8192, File Type=3 DATA
    Tablespace #0 - SYSTEM rel_fn:1

In this situation, restoring the standby's 11g control file and restarting the primary should in principle resolve the problem. But restarting the primary means coordinating a maintenance window, so it cannot happen right away, and disaster recovery coverage is critical in the meantime: if the primary fails, the impact would be serious. The last resort, reluctantly, was to rebuild the standby.

The rebuild can of course use the 11g active duplication method:

    rman target sys@xxxxx auxiliary sys@xxxx nocatalog
    RMAN> duplicate target database for standby from active database nofilenamecheck;

And that was the end of it: the standby was rebuilt successfully. It looked like a lot of effort for nothing, and I was left with decidedly mixed feelings.
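
As a footnote to the trace dump above: the line `Compatibility Vsn = 186647296=0xb200300` appears to encode the same 11.2.0.3.0 version that ORA-01130 reports. The sketch below decodes it under an assumed bit layout (8 bits, 4 bits, 8 bits, 4 bits, 8 bits, from most to least significant). That layout is inferred from this one example matching the error message, not taken from the post or confirmed against documentation, and `decode_compat_vsn` is a hypothetical name.

```python
def decode_compat_vsn(vsn: int) -> str:
    """Decode a datafile header 'Compatibility Vsn' into dotted form.

    Assumed layout (inferred, not confirmed): 8 + 4 + 8 + 4 + 8 bits,
    one field per component of the five-part version number.
    """
    parts = (
        (vsn >> 24) & 0xFF,  # first component, e.g. 11
        (vsn >> 20) & 0x0F,  # second component, e.g. 2
        (vsn >> 12) & 0xFF,  # third component
        (vsn >> 8) & 0x0F,   # fourth component, e.g. 3
        vsn & 0xFF,          # fifth component
    )
    return ".".join(str(part) for part in parts)


# Value taken from the trace file quoted in this post.
header_version = decode_compat_vsn(186647296)
print(header_version)
```

Under that assumption the header value decodes to the version named in the ORA-01130 message, which is consistent with the datafile headers having been advanced once the standby ran with the higher compatible setting.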