11g RAC: Abnormal MMON Process on Node Two
阿新 • Published: 2018-09-20
Early this morning I noticed that the DB time monitoring metric for our core system had been sitting flat at a single value, which did not look right.
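Our monitoring script relies on the SNAP_ID values in the dba_hist_snapshot view, so I started generating an AWR report to check whether the SNAP_IDs looked abnormal: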
21220 19 Sep 2018 09:00 1
21221 19 Sep 2018 10:00 1
21222 19 Sep 2018 11:00 1
21223 19 Sep 2018 12:00 1
21224 19 Sep 2018 13:00 1
21225 19 Sep 2018 14:00 1
21226 19 Sep 2018 15:00 1
21227 19 Sep 2018 16:00 1
21228 19 Sep 2018 17:00 1
21229 19 Sep 2018 18:00 1
21230 19 Sep 2018 19:00 1

Specify the Begin and End Snapshot Ids
Enter value for begin_snap:

The listing ends at 19:00 on 19 Sep, so no hourly snapshots appear to have been taken since then. Last night the system did see some CBC (cache buffers chains) related waits, but it recovered quickly. So what happened? Is the database archive destination full, or did one of the mm* processes go down? I tried creating a snapshot manually, and that worked:

[oracle@bapdb2 trace]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Thu Sep 20 10:40:33 2018

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

10:40:33 SYS@bapdb2(bapdb2)> set line 300 pages 1000
10:40:35 SYS@bapdb2(bapdb2)> BEGIN
10:40:37   2  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
10:40:37   3  END;
10:40:37   4  /

PL/SQL procedure successfully completed.

The archive disk group also has plenty of free space, so a full archive destination can be ruled out as the cause of the process failure:

10:43:57 SYS@b2(db2)> select group_number,block_size,name,allocation_unit_size,state,type,total_mb,free_mb,offline_disks from v$asm_diskgroup;

GROUP_NUMBER BLOCK_SIZE NAME                           ALLOCATION_UNIT_SIZE STATE       TYPE     TOTAL_MB    FREE_MB OFFLINE_DISKS
------------ ---------- ------------------------------ -------------------- ----------- ------ ---------- ---------- -------------
           1       4096 SAS_ARCH                                    1048576 CONNECTED   EXTERN    1024000     617921             0

Processes on node one:

[oracle@db1 ~]$ ps -ef |grep mm
grid       6634      1  0  2017 ?        00:33:47 asm_mman_+ASM1
grid       6648      1  0  2017 ?        01:52:06 asm_mmon_+ASM1
grid       6650      1  0  2017 ?      2-00:53:46 asm_mmnl_+ASM1
oracle     8610      1  0  2017 ?        00:33:56 ora_mman_db1
oracle     8650      1  0  2017 ?      3-11:28:35 ora_mmon_db1
oracle     8655      1  1  2017 ?      4-07:20:56 ora_mmnl_db1

Processes on node two:

[oracle@bapdb2 ~]$ ps -ef |grep mm
oracle    54354  53982  0 11:09 pts/1    00:00:00 grep mm
grid     105256      1  0  2017 ?        00:23:52 asm_mman_+ASM2
grid     105295      1  0  2017 ?        01:15:06 asm_mmon_+ASM2
grid     105312      1  0  2017 ?      1-03:49:26 asm_mmnl_+ASM2
oracle   106889      1  0  2017 ?        00:28:00 ora_mman_db2
oracle   106927      1  0  2017 ?      3-04:47:42 ora_mmnl_db2

So the MMON process on node two is down: there is no ora_mmon_db2 in the list. Searching the alert log for its startup records:

Tue Sep 19 03:49:00 2017
MMON started with pid=36, OS id=8650
Tue Sep 19 03:49:00 2017
MMNL started with pid=37, OS id=8655
Tue Sep 19 04:01:47 2017
MMON started with pid=36, OS id=106923
Tue Sep 19 04:01:47 2017
MMNL started with pid=37, OS id=106927

The process with OS id 106923 is indeed gone. I have dealt with a similar situation before: the MMON-related processes can be restarted directly on node two by enabling and then disabling restricted session:

SQL> alter system enable restricted session;

System altered.

SQL> alter system disable restricted session;

System altered.

The alert log confirms the restart:

Thu Sep 20 11:10:28 2018
Stopping background process MMNL
Starting background process MMON
Starting background process MMNL
Thu Sep 20 11:10:29 2018
MMON started with pid=37, OS id=55936
Thu Sep 20 11:10:29 2018
MMNL started with pid=236, OS id=55938
ALTER SYSTEM enable restricted session;
minact-scn: Inst 2 is a slave inc#:16 mmon proc-id:55936 status:0x2
minact-scn status: grec-scn:0x0026.4dcf0d36 gmin-scn:0x0026.4dcf0d36 gcalc-scn:0x0026.4dcf1208
Thu Sep 20 11:11:05 2018
ALTER SYSTEM disable restricted session;
Thu Sep 20 11:13:25 2018
LGWR: Standby redo logfile selected for thread 2 sequence 154126 for destination LOG_ARCHIVE_DEST_3

Checking the processes again shows that both have started normally:

11:10:29 SYS@db2(xxxdb2)> !ps -ef |grep mm
oracle    55936      1  0 11:10 ?        00:00:00 ora_mmon_db2
oracle    55938      1  0 11:10 ?        00:00:00 ora_mmnl_db2
grid     105256      1  0  2017 ?        00:23:52 asm_mman_+ASM2
grid     105295      1  0  2017 ?        01:15:06 asm_mmon_+ASM2
grid     105312      1  0  2017 ?      1-03:49:26 asm_mmnl_+ASM2
oracle   106889      1  0  2017 ?        00:28:00 ora_mman_db2

I then went through the trace file of the old MMON process and found the following at the very end:

*** 2018-09-19 18:46:41.432
minact-scn slave-status: grec-scn:0x0026.4db016c0 gmin-scn:0x0026.4db016c0 gcalc-scn:0x0026.4db0273c
minact-scn slave-status: grec-scn:0x0026.4dbdde59 gmin-scn:0x0026.4dbdde59 gcalc-scn:0x0026.4dbdf492
*** 2018-09-19 18:56:44.302
minact-scn slave-status: grec-scn:0x0026.4dca45db gmin-scn:0x0026.4dca45db gcalc-scn:0x0026.4dca5990
*** 2018-09-19 19:01:37.026
error 28 detected in background process
OPIRIP: Uncaught error 447.
Error stack:
ORA-00447: fatal error in background process
ORA-00028: your session has been killed

My guess is that this is related to the issue described in:
Fixed Objects Statistics (GATHER_FIXED_OBJECTS_STATS) Considerations (Doc ID 798257.1)
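
The manual check used in this post (running ps -ef | grep mm on each node and spotting the missing ora_mmon_ entry) could be automated so that a dead MMON is noticed before the AWR snapshots go stale. The following is only a minimal sketch: the function name missing_bg_procs is hypothetical, and it assumes background processes are named ora_<name>_<sid>, as in the ps output shown in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Oracle tooling): reads `ps -ef`
# style text on stdin and prints the expected background process names
# that are missing for the given instance SID.
# Assumption: processes are named ora_<name>_<sid>, as in this post.
missing_bg_procs() {
    sid="$1"
    shift
    ps_text=$(cat)
    missing=""
    for name in "$@"; do
        # The process name is the last field on the line, so anchor at end of line.
        if ! printf '%s\n' "$ps_text" | grep -q "ora_${name}_${sid}\$"; then
            missing="${missing}${missing:+ }${name}"
        fi
    done
    printf '%s\n' "$missing"
}

# Usage on a live host (assumes the instance SID is db2):
#   ps -ef | missing_bg_procs db2 mman mmon mmnl
```

An empty result means every listed process was found. Fed the node-two listing from before the restart, the helper would report mmon. Wiring it into cron or an alerting system is left out here because that depends on the site's monitoring setup.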