[Case] Oracle 11g RAC: after restarting node 2 (rac2), RAC could not provide service normally

Oracle: 11.2.0.4

Linux: RHEL 6.8

2 nodes: rac1, rac2

Hostnames: ze02db01, ze02db02

Incident review:

On node 2 (ze02db02), I stopped instance rac2 and then started it again:

/u01/app/11.2.0/grid/bin/srvctl stop instance -d orcl

/u01/app/11.2.0/grid/bin/srvctl start instance -d orcl
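The two commands above are recorded with only -d orcl. In 11.2, srvctl stop instance and srvctl start instance normally also expect the target instance (-i) or node (-n). The following is a minimal bash sketch of the same stop/start sequence, assuming the instance name is orcl2 (taken from the SQL*Plus prompt later in this post). The function name restart_instance and the DRY_RUN variable are hypothetical, not part of any Oracle tool.

```shell
GRID_BIN=/u01/app/11.2.0/grid/bin

# Hypothetical helper: stop, then start, one RAC instance through srvctl.
# With DRY_RUN=1 the commands are only printed, not executed.
restart_instance() {
    local db="$1" inst="$2" action
    for action in stop start; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$GRID_BIN/srvctl $action instance -d $db -i $inst"
        else
            "$GRID_BIN/srvctl" "$action" instance -d "$db" -i "$inst" || return 1
        fi
    done
}
```

Running DRY_RUN=1 restart_instance orcl orcl2 first prints the two commands so they can be reviewed before anything touches the cluster.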

At this point, on node 1 (ze02db01), the CRS status of the database looked abnormal:

ora.orcl.db
1 ONLINE ONLINE ze02db01 Open
2 ONLINE ONLINE ze02db02 starting ...
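A status listing like this can be checked mechanically. The sketch below assumes the text comes from crsctl stat res -t (the post does not say which command produced it) and that each instance row has the columns number, target, state, server, state details. The function name instances_not_open is hypothetical.

```shell
# Reads `crsctl stat res -t`-style text on stdin and prints
# "<server> <state details>" for every instance row of an ora.*.db
# resource whose state details are not exactly "Open".
instances_not_open() {
    awk '
        # A resource header line: remember whether it is a database resource.
        /^ora\./ { in_db = ($1 ~ /\.db$/); next }
        # A numbered instance row below a database resource.
        in_db && $1 ~ /^[0-9]+$/ && $5 != "Open" {
            details = $5
            for (i = 6; i <= NF; i++) details = details " " $i
            print $4, details
        }
    '
}
```

For example, /u01/app/11.2.0/grid/bin/crsctl stat res -t | instances_not_open would list only the instances that have not reached the Open state.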

Then, on node 2 (ze02db02), I started the instance manually:

$ sqlplus / as sysdba

>startup

The CRS status of the database was still abnormal, so from node 1 I tried to restart the instance on node 2:

/u01/app/11.2.0/grid/bin/srvctl stop instance -d orcl

At this point, RAC could no longer provide service to clients:

[/u01/app/11.2.0/grid/bin/orarootagent.bin(9878)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsregistrymount" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/ze02db01/agent/crsd/orarootagent_root/orarootagent_root.log"
2020-07-17 12:12:29.432: [/u01/app/11.2.0/grid/bin/orarootagent.bin(9878)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsregistrymount" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/ze02db01/agent/crsd/orarootagent_root/orarootagent_root.log"
2020-07-17 12:12:29.636: [/u01/app/11.2.0/grid/bin/orarootagent.bin(9878)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsregistrymount" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/ze02db01/agent/crsd/orarootagent_root/orarootagent_root.log"
2020-07-17 12:12:29.839: [/u01/app/11.2.0/grid/bin/orarootagent.bin(9878)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsregistrymount" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/ze02db01/agent/crsd/orarootagent_root/orarootagent_root.log"
2020-07-17 12:12:30.043: [/u01/app/11.2.0/grid/bin/orarootagent.bin(9878)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsregistrymount" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/ze02db01/agent/crsd/orarootagent_root/orarootagent_root.log"
2020-07-17 18:09:01.504: [crsd(9762)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.orcl.db'. Details at (:CRSPE00111:) {2:9343:228} in /u01/app/11.2.0/grid/log/ze02db01/crsd/crsd.log.
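Because the same CRS-5016 message repeats many times, it is easy to overlook the single CRS-2757 timeout on ora.orcl.db. A small sketch for summarizing such a log by message code follows. The function name count_crs_codes is hypothetical, and the log path in the usage note is my assumption of the usual 11.2 location, not something recorded in this post.

```shell
# Reads clusterware log text on stdin and prints "<code> <count>"
# for each distinct CRS-nnnn message code, sorted by code.
count_crs_codes() {
    grep -oE 'CRS-[0-9]+' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

A possible invocation, assuming the node 1 clusterware alert log sits at /u01/app/11.2.0/grid/log/ze02db01/alertze02db01.log, is count_crs_codes < /u01/app/11.2.0/grid/log/ze02db01/alertze02db01.log. Rare codes then stand out next to the repeated ones.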
Node 1 alert log information:
2020-07-17 18:50:45.700: [UiServer][2427864832]{1:2032:53986} Sending message to PE. ctx= 0x7f270000b850, Client PID: 12554
2020-07-17 18:50:45.700: [   CRSPE][2429966080]{1:2032:53986} Cmd : 0x7f270c113cf0 : flags: FORCE_TAG | HOST_TAG | QUEUE_TAG
2020-07-17 18:50:45.700: [   CRSPE][2429966080]{1:2032:53986} Processing PE command id=119569. Description: [Start Resource : 0x7f270c113cf0]
2020-07-17 18:50:45.702: [   CRSPE][2429966080]{1:2032:53986} Filtering duplicate ops: server [ze02db02] state [ONLINE]
2020-07-17 18:50:45.702: [   CRSPE][2429966080]{1:2032:53986} Op 0x7f270c00db10 has 16 WOs
2020-07-17 18:50:45.702: [   CRSPE][2429966080]{1:2032:53986} ICE has queued an operation. Details: Operation [START of [ora.orcl.db 2 1] on [ze02db02] : local=0, unplanned=00x7f270c00db10] cannot run cause it needs W lock for: WO for Placement Path RI:[ora.orcl.db 2 1] server [ze02db02] target states [ONLINE INTERMEDIATE ], locked by op [START of [ora.orcl.db 2 1] on [ze02db02] : local=0, unplanned=00x7f270c0df540]. Owner: CRS-2682: It is locked by 'grid' for command 'Start Resource' issued from 'ze02db02'
2020-07-17 18:50:49.490: [   CRSPE][2429966080]{2:9343:273} Processing PE command id=323. Description: [Stat Resource : 0x7f270c00d8a0]
2020-07-17 18:50:51.506: [   CRSPE][2429966080]{2:9343:274} Processing PE command id=324. Description: [Stat Resource : 0x7f270c145e00]
2020-07-17 18:50:52.410: [   CRSPE][2429966080]{2:9343:275} Processing PE command id=325. Description: [Stat Resource : 0x7f270c145e00]
2020-07-17 18:50:53.070: [   CRSPE][2429966080]{2:9343:276} Processing PE command id=326. Description: [Stat Resource : 0x7f270c145e00]
2020-07-17 18:51:35.517: [UiServer][2425763584] CS(0x7f270400a270)set Properties ( grid,0x7f273c0dac90)
2020-07-17 18:51:35.527: [UiServer][2427864832]{1:2032:53987} Sending message to PE. ctx= 0x7f270000ac30, Client PID: 9882
2020-07-17 18:51:35.528: [   CRSPE][2429966080]{1:2032:53987} Processing PE command id=119570. Description: [Stat Resource : 0x7f270c00d8a0]
2020-07-17 18:51:35.528: [   CRSPE][2429966080]{1:2032:53987} Expression Filter : ((NAME == ora.scan1.vip) AND (LAST_SERVER == ze02db01))
2020-07-17 18:51:35.529: [UiServer][2427864832]{1:2032:53987} Done for ctx=0x7f270000ac30
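The key line in this log is the CRS-2682 message: the new start request is queued because an earlier 'Start Resource' command issued from ze02db02 still holds the lock on the resource. A minimal sketch for pulling the lock owner out of crsd.log follows. The function name lock_owners is hypothetical, and the pattern assumes the message wording shown above.

```shell
# Reads crsd.log text on stdin and prints "user|command|host"
# for every CRS-2682 lock message found.
lock_owners() {
    sed -n "s/.*CRS-2682: It is locked by '\([^']*\)' for command '\([^']*\)' issued from '\([^']*\)'.*/\1|\2|\3/p"
}
```

For example, lock_owners < /u01/app/11.2.0/grid/log/ze02db01/crsd/crsd.log | sort | uniq -c shows which user, command, and host hold locks and how often each was reported.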

At this point, I logged in to SQL*Plus and shut the instance down:

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Mon Jul 20 16:13:46 2020

Copyright (c) 1982, 2013, Oracle. All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SYS@orcl2> shutdown immediate

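After that, the RAC resources ran on node 1 and service was available again.

Final finding:

At the OS level, the multipath software could not read the shared disks, as seen with multipath -ll. I restarted multipathd and reloaded udev: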

service multipathd restart
start_udev
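Before and after restarting multipathd, it helps to count how many paths are reported as failed. The sketch below assumes the usual RHEL 6 multipath -ll layout, where each path line contains an H:C:T:L address followed by states such as "active ready running" or "failed faulty running". The actual output from this incident was not recorded, and the function name count_failed_paths is hypothetical.

```shell
# Reads `multipath -ll` output on stdin and prints the number of
# path lines (those with an H:C:T:L address) marked failed or faulty.
count_failed_paths() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -cE '[0-9]+:[0-9]+:[0-9]+:[0-9]+ .*(failed|faulty)' || true
}
```

Running multipath -ll | count_failed_paths as root should print 0 once all paths to the shared disks are healthy again.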

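With multipath healthy again, I restarted CRS, the instance, the nodeapps, and the listener on node 2. The RAC CRS status was still abnormal, and a lock problem occurred.

In the end, rebooting the node 2 server brought RAC back to normal.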