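Recovering lost ASM disk partitions - 惜分飛
阿新 • • Published: 2018-11-26
A friend reported that after setting up an active-active (dual-active) configuration on their xx storage, they rebooted the hosts and found that GI (Grid Infrastructure) would not start. Analysis showed that the partition information on every disk from that storage had been lost, so ASMLib could not discover the disks (partitions were being used as the ASM disks).
The errors looked like the following (lost disk partitions):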
-- Partial output of fdisk -l
Disk /dev/mapper/datahds1: 1099.5 GB, 1099511627776 bytes
255 heads, 63 sectors/track, 133674 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
-- Output of ls -l /dev/mapper/ shows no partition entries
lrwxrwxrwx 1 root root 7 May 6 03:44 datahds1 -> ../dm-1
lrwxrwxrwx 1 root root 7 May 6 03:26 datahds2 -> ../dm-3
lrwxrwxrwx 1 root root 7 May 6 03:26 datahds3 -> ../dm-8
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds1 -> ../dm-0
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds2 -> ../dm-2
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds3 -> ../dm-4
The ASM alert log shows:
SUCCESS: diskgroup DATADG was mounted
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
SUCCESS: diskgroup OCRHDS was mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
Checking the system log:
May 6 02:23:27 db2 kernel: sdb: unknown partition table
May 6 02:23:27 db2 kernel: sde: unknown partition table
May 6 02:23:27 db2 kernel: sdc: unknown partition table
May 6 02:23:27 db2 kernel: sdf: unknown partition table
May 6 02:23:27 db2 kernel: sdd: unknown partition table
May 6 02:23:27 db2 kernel: sdj:Dev sdj: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdi: sdi1
May 6 02:23:27 db2 kernel: sdk: sdk1
May 6 02:23:27 db2 kernel: sdg: unknown partition table
May 6 02:23:27 db2 kernel: sdl: sdl1
May 6 02:23:27 db2 kernel: sdm:Dev sdm: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdo:Dev sdo: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdn:Dev sdn: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdp:Dev sdp: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sds:Dev sds: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdh:
May 6 02:23:27 db2 kernel: sdt: sdt1
May 6 02:23:27 db2 kernel: sdv:Dev sdv: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdq:Dev sdq: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sd 1:0:1:9: [sdr] Very big device. Trying to use READ CAPACITY(16).
May 6 02:23:27 db2 kernel: sdr:Dev sdr: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sd 2:0:0:9: [sdab] Very big device. Trying to use READ CAPACITY(16).
May 6 02:23:27 db2 kernel: sdab: unknown partition table
May 6 02:23:27 db2 kernel: sdac: unknown partition table
May 6 02:23:27 db2 kernel: sdw: sdw1
May 6 02:23:27 db2 kernel: sdu:Dev sdu: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdx: sdx1
May 6 02:23:27 db2 kernel: sdy: sdy1
May 6 02:23:27 db2 kernel: sdaa: sdaa1
May 6 02:23:27 db2 kernel: sdz: sdz1
May 6 02:23:27 db2 kernel: sdae: unknown partition table
May 6 02:23:27 db2 kernel: sdaf: unknown partition table
May 6 02:23:27 db2 kernel: sdag: unknown partition table
May 6 02:23:27 db2 kernel: sdai:
May 6 02:23:27 db2 kernel: sdah: unknown partition table
May 6 02:23:27 db2 kernel: sdad: unknown partition table
May 6 02:23:28 db2 mcelog: failed to prefill DIMM database from DMI data
The error here is fairly obvious: "unknown partition table". The disks' partition information is damaged, and fdisk cannot find any partitions.
partprobe does not help either:
[[email protected] oracle]# partprobe /dev/mapper/ocrhds3
[[email protected] oracle]#
[[email protected] oracle]# ls -l /dev/mapper/ocrhds3*
lrwxrwxrwx 1 root root 7 May 6 07:30 /dev/mapper/ocrhds3 -> ../dm-4
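With this many devices in the kernel log, it can help to reduce it to a plain list of the disks that were flagged. The following is only a sketch: the function name `list_unpartitioned` is made up for illustration, and it assumes the log lines have the same shape as the excerpt shown earlier (already joined onto single lines).

```shell
# list_unpartitioned LOGFILE
# Print, sorted and one per line, the sd* devices for which the kernel
# reported "unknown partition table" or "unable to read RDB block".
# Assumption: messages look like the syslog excerpt in this article.
list_unpartitioned() {
    sed -n \
        -e 's/.*kernel: *\(sd[a-z]*\): unknown partition table.*/\1/p' \
        -e 's/.*kernel: *sd[a-z]*:Dev \(sd[a-z]*\): unable to read RDB block.*/\1/p' \
        "$1" | sort -u
}

# Example call (the log path is an assumption; it varies by distribution):
#   list_unpartitioned /var/log/messages
```

Devices that still show a partition, such as `sdi: sdi1`, are deliberately not listed.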
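From the information above, the partition table itself has most likely been destroyed. All we can do now is hope that the actual data inside the partitions is undamaged.
Examining the actual partition data on disk: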
[[email protected] ~]$ dd if=/dev/mapper/datahds1 of=/tmp/datahds1.dd bs=1024k count=50
[[email protected] ~]$ dd if=/tmp/datahds1.dd of=/tmp/xff01.dd bs=32256 skip=1
[[email protected] ~]$ kfed read /tmp/xff01.dd | more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]: 1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]: 826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATADG_0000 ; 0x028: length=11
kfdhdb.grpname: DATADG ; 0x048: length=6
kfdhdb.fgname: DATADG_0000 ; 0x068: length=11
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33050696 ; 0x0a8: HOUR=0x8 DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.crestmp.lo: 3813740544 ; 0x0ac: USEC=0x0 MSEC=0x44 SECS=0x35 MINS=0x38
kfdhdb.mntstmp.hi: 33050701 ; 0x0b0: HOUR=0xd DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.mntstmp.lo: 411385856 ; 0x0b4: USEC=0x0 MSEC=0x150 SECS=0x8 MINS=0x6
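From the analysis above, we can tentatively conclude that the data inside the partitions is very likely intact: the ASM disk header is good, and since overwrites generally proceed from the start of a device toward the end, an intact header means the blocks after it are very unlikely to have been overwritten.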
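The same header check can be approximated without kfed by looking for the `ORCLDISK` marker, which begins the `kfdhdb.driver.provstr` field right after the 32-byte `kfbh` block header shown above. The sketch below rests on assumptions: the function names are made up for illustration, and the two candidate offsets are simply common first-partition starts (sector 63, i.e. 63 * 512 = 32256 bytes, matching the 63 sectors/track geometry that fdisk reported, and sector 2048). It only reads from the file it is given.

```shell
# asm_header_at FILE OFFSET
# Succeeds if the 8 bytes at OFFSET+32 are "ORCLDISK", i.e. the start of
# kfdhdb.driver.provstr, which follows the 32-byte kfbh block header.
asm_header_at() {
    tag=$(dd if="$1" bs=1 skip=$(( $2 + 32 )) count=8 2>/dev/null | tr -d '\000')
    [ "$tag" = "ORCLDISK" ]
}

# find_asm_offset FILE
# Try assumed candidate start sectors for the first partition and print
# the byte offset of the first one that carries an ASM disk header.
find_asm_offset() {
    for sector in 63 2048; do
        offset=$(( sector * 512 ))
        if asm_header_at "$1" "$offset"; then
            echo "$offset"
            return 0
        fi
    done
    return 1
}

# Example call against the 50 MB image taken earlier:
#   find_asm_offset /tmp/datahds1.dd
```

A hit only shows that the first block looks like an ASM header; kfed remains the tool for inspecting the individual fields.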
We prepared new disks and used dd to copy each partition's contents directly onto the new devices:
dd if=/dev/mapper/ocrhds1 of=/dev/mapper/ocrhdsnew1 skip=1 bs=32256
dd if=/dev/mapper/ocrhds2 of=/dev/mapper/ocrhdsnew2 skip=1 bs=32256
dd if=/dev/mapper/ocrhds3 of=/dev/mapper/ocrhdsnew3 skip=1 bs=32256
dd if=/dev/mapper/datahds1 of=/dev/mapper/datahdsnew1 skip=1 bs=32256
dd if=/dev/mapper/datahds2 of=/dev/mapper/datahdsnew2 skip=1 bs=32256
dd if=/dev/mapper/datahds3 of=/dev/mapper/datahdsnew3 skip=1 bs=32256
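Since the six copies differ only in the device name, the commands can be generated rather than typed by hand. This is a dry-run sketch under stated assumptions: `print_copy_cmds` is a hypothetical helper, the target name is derived by inserting `new` before the trailing digits as in the commands above, and the offset is assumed to be 63 sectors (63 * 512 = 32256 bytes). It prints the commands and runs nothing, so they can be reviewed before anything is written to a device.

```shell
# print_copy_cmds OFFSET NAME...
# Dry run: print (do not execute) one dd command per /dev/mapper source.
# The target is the source name with "new" inserted before its trailing
# digits, e.g. ocrhds1 -> ocrhdsnew1 (naming taken from this article).
print_copy_cmds() {
    offset=$1
    shift
    for name in "$@"; do
        target=$(printf '%s\n' "$name" | sed 's/\([0-9][0-9]*\)$/new\1/')
        echo "dd if=/dev/mapper/$name of=/dev/mapper/$target skip=1 bs=$offset"
    done
}

print_copy_cmds 32256 ocrhds1 ocrhds2 ocrhds3 datahds1 datahds2 datahds3
```

Using the offset as the block size with `skip=1` is the same trick as in the commands above: dd skips exactly one block of that size, then copies the rest.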
Rescan the disks with ASMLib:
[[email protected] disks]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "HDSOCR3"
Instantiating disk "HDSDATA2"
Instantiating disk "HDSDATA1"
Instantiating disk "HDSDATA3"
Instantiating disk "HDSOCR1"
Instantiating disk "HDSOCR2"
[[email protected] disks]# ls -ltr
total 0
brw-rw---- 1 grid asmadmin 8, 160 May 6 13:49 HDSOCR3
brw-rw---- 1 grid asmadmin 8, 192 May 6 13:49 HDSDATA2
brw-rw---- 1 grid asmadmin 8, 176 May 6 13:49 HDSDATA1
brw-rw---- 1 grid asmadmin 8, 208 May 6 13:49 HDSDATA3
brw-rw---- 1 grid asmadmin 8, 128 May 6 13:49 HDSOCR1
brw-rw---- 1 grid asmadmin 8, 144 May 6 13:49 HDSOCR2
Verify the copied partition with kfed:
[[email protected] tmp]# /oracle/app/11.2.0/grid_1/bin/kfed read /dev/oracleasm/disks/HDSDATA1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]: 1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]: 826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATADG_0000 ; 0x028: length=11
kfdhdb.grpname: DATADG ; 0x048: length=6
kfdhdb.fgname: DATADG_0000 ; 0x068: length=11
kfdhdb.capname: ; 0x088: length=0
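kfed confirms that the header block arrived intact. As an extra sanity check, a larger prefix of the copy can be compared against the source at its offset. The sketch below is an assumption-laden illustration, not part of the original procedure: `same_prefix` is a made-up name, and it compares checksums of only the first NBYTES, so it cannot prove the whole device matches.

```shell
# same_prefix SRC DST OFFSET NBYTES
# Succeeds if the NBYTES that follow OFFSET in SRC have the same checksum
# as the first NBYTES of DST. OFFSET must be greater than 0 because it is
# used as the dd block size (the same bs=OFFSET skip=1 idiom as the copy).
same_prefix() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    src_sum=$(dd if="$1" bs="$3" skip=1 2>/dev/null | head -c "$4" | cksum)
    dst_sum=$(head -c "$4" "$2" | cksum)
    [ "$src_sum" = "$dst_sum" ]
}

# Example call (compares the first 1 MiB after a 32256-byte offset):
#   same_prefix /dev/mapper/datahds1 /dev/mapper/datahdsnew1 32256 1048576
```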
ASM and the database now start normally:
[[email protected] ~]$ asmcmd
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 3145710 2378034 0 2378034 0 N DATADG/
MOUNTED NORMAL N 512 4096 1048576 15342 14416 5114 4651 0 Y OCRHDS/
ASMCMD>
[[email protected] ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat May 6 13:54:21 2017
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 3.6077E+10 bytes
Fixed Size 2260648 bytes
Variable Size 7247757656 bytes
Database Buffers 2.8723E+10 bytes
Redo Buffers 104382464 bytes
Database mounted.
Database opened.
SQL>
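With the recovery above, the lost ASM disk partitions were restored with zero data loss.
If you run into a similar situation and cannot resolve it, contact us for professional Oracle database recovery support.
Phone: 13429648788  QQ: 107644445  E-Mail: [email protected]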