
PostgreSQL data recovery tool: pg_filedump


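0. Overview

Any database can lose data after a failure, and then the data has to be recovered. With a backup you simply restore from it; without one, or when the data files themselves are damaged, things get harder.

In PostgreSQL, if only ordinary data files are damaged, you can set zero_damaged_pages=on so that reads skip the damaged blocks, then copy the readable data into a new table. But what if the metadata is damaged as well and the database can no longer start? In that case the data has to be read directly from the data files with a tool, as DUL and ODU do for Oracle.

For PostgreSQL, pg_filedump can play a similar role.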

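1. Installation

Installation is simple: clone the source, then build and install it. The build typically locates PostgreSQL through pg_config, so the PostgreSQL binaries should be on the PATH.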

git clone git://git.postgresql.org/git/pg_filedump.git
cd pg_filedump
make
make install

Usage:

Usage: pg_filedump [-abcdfhikxy] [-R startblock [endblock]] [-D attrlist] [-S blocksize] [-s segsize] [-n segnumber] file

Display formatted contents of a PostgreSQL heap/index/control file
Defaults are: relative addressing, range of the entire file, block
               size as listed on block 0 in the file

The following options are valid for heap and index files:
  -a  Display absolute addresses when formatting (Block header
      information is always block relative)
  -b  Display binary block images within a range (Option will turn
      off all formatting options)
  -d  Display formatted block content dump (Option will turn off
      all other formatting options)
  -D  Decode tuples using given comma separated list of types
      Supported types:
        bigint bigserial bool char charN date float float4 float8 int
        json macaddr name oid real serial smallint smallserial text
        time timestamp timetz uuid varchar varcharN xid xml
      ~ ignores all attributes left in a tuple
  -f  Display formatted block content dump along with interpretation
  -h  Display this information
  -i  Display interpreted item details
  -k  Verify block checksums
  -o  Do not dump old values.
  -R  Display specific block ranges within the file (Blocks are
      indexed from 0)
        [startblock]: block to start at
        [endblock]: block to end at
      A startblock without an endblock will format the single block
  -s  Force segment size to [segsize]
  -t  Dump TOAST files
  -v  Ouput additional information about TOAST relations
  -n  Force segment number to [segnumber]
  -S  Force block size to [blocksize]
  -x  Force interpreted formatting of block items as index items
  -y  Force interpreted formatting of block items as heap items

The following options are valid for control files:
  -c  Interpret the file listed as a control file
  -f  Display formatted content dump along with interpretation
  -S  Force block size to [blocksize]

Report bugs to <[email protected]>
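
The -D option takes the column types in column order, written with pg_filedump's own type names. As a rough sketch, the list can be built from the type names PostgreSQL reports for a table. The supported names are copied from the usage text; the alias table and the function name build_attrlist are assumptions made for illustration and are not part of pg_filedump.

```python
# Type names accepted by -D, copied from the usage text.
SUPPORTED = set("""bigint bigserial bool char charN date float float4 float8 int
json macaddr name oid real serial smallint smallserial text time timestamp
timetz uuid varchar varcharN xid xml""".split())

# Hypothetical aliases from catalog-style type names to -D names (assumption).
ALIASES = {
    "integer": "int",
    "boolean": "bool",
    "double precision": "float8",
    "character varying": "varchar",
    "timestamp without time zone": "timestamp",
}

def build_attrlist(column_types):
    """Return a comma-separated -D argument, or raise on an unsupported type."""
    names = []
    for col_type in column_types:
        cleaned = col_type.strip()
        name = ALIASES.get(cleaned.lower(), cleaned)
        if name not in SUPPORTED:
            raise ValueError(f"type {col_type!r} is not in the -D supported list")
        names.append(name)
    return ",".join(names)

print(build_attrlist(["integer", "text", "timestamp without time zone"]))
```

A type missing from the supported list raises an error instead of being guessed, since a wrong type would shift the decoding of every column after it.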

2. Usage test

2.1. Create a table and insert test data

bill@bill=>create table t_dump(id int,info text,crt_time timestamp);
CREATE TABLE

bill@bill=>insert into t_dump select generate_series(1,10),md5(random()::text),clock_timestamp();
INSERT 0 10

bill@bill=>select * from t_dump ;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  1 | 9af995e95f321e3521fcb6d41208af40 | 2021-02-03 13:26:23.502727
  2 | 2544880c1a22986487e563a6c89f377b | 2021-02-03 13:26:23.502932
  3 | de61423aaf82b8a1bbb49dc3d7809863 | 2021-02-03 13:26:23.502945
  4 | 398af8893872a1860e08ac424ecce885 | 2021-02-03 13:26:23.502951
  5 | 374e4a32688ec70a46fae44fda9e4ed8 | 2021-02-03 13:26:23.502958
  6 | bc8911e89c5be9329abf29cf68f5b4ce | 2021-02-03 13:26:23.502963
  7 | 136bfa992d70eb33b3cdd9e53376261b | 2021-02-03 13:26:23.50297
  8 | ffaa31f1a5ae272727c53ba37dd77706 | 2021-02-03 13:26:23.502975
  9 | e6fe762a144d51e15ea6bdefe2362242 | 2021-02-03 13:26:23.502982
 10 | 8669af9ca762b99e307f9ba2de6d77d2 | 2021-02-03 13:26:23.502988
(10 rows)

2.2. Find the file that backs the table

bill@bill=>select pg_relation_filepath('t_dump');
 pg_relation_filepath
----------------------
 base/16385/25316
(1 row)

-- run a checkpoint to make sure the data is flushed to disk
bill@bill=>checkpoint ;
CHECKPOINT
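
pg_relation_filepath returns a path relative to the data directory. PostgreSQL splits a large relation into segment files named after the first file with .1, .2 and so on appended, and each segment has to be dumped on its own. The sketch below lists those files in order; the function name segment_files and the data directory passed to it are hypothetical.

```python
from pathlib import Path

def segment_files(data_dir, rel_path):
    """Return the base file and its numbered segments (N, N.1, N.2, ...) in order."""
    base = Path(data_dir) / rel_path
    found = []
    if base.exists():
        found.append(base)
    n = 1
    # Stop at the first missing segment number; forks such as N_fsm are ignored.
    while (seg := base.with_name(f"{base.name}.{n}")).exists():
        found.append(seg)
        n += 1
    return found
```

For the table in this article the call would look like segment_files("/path/to/pgdata", "base/16385/25316"), where the data directory is a placeholder.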

2.3. Read the data file with pg_filedump

pg13@cnndr4pptliot-> pg_filedump 25316

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: None
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      64 (0x0040)
 Block: Size 8192  Version    4            Upper    7472 (0x1d30)
 LSN:  logid      0 recoff 0x7595d5b8      Special  8192 (0x2000)
 Items:   10                      Free Space: 7408
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 64

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL


*** End of File Encountered. Last Block Read: 0 ***
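
The header numbers in this dump can be cross-checked with a little arithmetic. The sketch below assumes the standard page layout: a 24-byte page header followed by one 4-byte line pointer per item. It also assumes that every item length is already a multiple of 8 and that the page has no special space, which holds for this dump (Special equals the block size). The function name check_page is made up for illustration.

```python
PAGE_HEADER_BYTES = 24   # fixed page header size in the standard page layout
ITEM_POINTER_BYTES = 4   # one line pointer per item

def check_page(block_size, lower, upper, free_space, item_lengths):
    """Check Lower, Upper and Free Space against the item count and lengths."""
    n_items = len(item_lengths)
    return {
        # Lower marks the end of the line pointer array.
        "lower_ok": lower == PAGE_HEADER_BYTES + ITEM_POINTER_BYTES * n_items,
        # Tuples are packed from the end of the page towards the front.
        "upper_ok": upper == block_size - sum(item_lengths),
        # Free space is the gap between the two.
        "free_ok": free_space == upper - lower,
    }

# Values from the dump: 10 items of 72 bytes each.
print(check_page(8192, 64, 7472, 7408, [72] * 10))
```

All three checks hold for this block: 24 + 10 * 4 = 64, 8192 - 10 * 72 = 7472, and 7472 - 64 = 7408.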

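The raw dump, however, does not show the row contents in a readable form. To decode them, pass the column types in order with the -D option (here -D int,text,timestamp):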

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: -D int,text,timestamp
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      64 (0x0040)
 Block: Size 8192  Version    4            Upper    7472 (0x1d30)
 LSN:  logid      0 recoff 0x7595d5b8      Special  8192 (0x2000)
 Items:   10                      Free Space: 7408
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 64

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
COPY: 2 2544880c1a22986487e563a6c89f377b        2021-02-03 13:26:23.502932
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
COPY: 3 de61423aaf82b8a1bbb49dc3d7809863        2021-02-03 13:26:23.502945
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
COPY: 4 398af8893872a1860e08ac424ecce885        2021-02-03 13:26:23.502951
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
COPY: 5 374e4a32688ec70a46fae44fda9e4ed8        2021-02-03 13:26:23.502958
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
COPY: 6 bc8911e89c5be9329abf29cf68f5b4ce        2021-02-03 13:26:23.502963
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
COPY: 7 136bfa992d70eb33b3cdd9e53376261b        2021-02-03 13:26:23.502970
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
COPY: 8 ffaa31f1a5ae272727c53ba37dd77706        2021-02-03 13:26:23.502975
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
COPY: 9 e6fe762a144d51e15ea6bdefe2362242        2021-02-03 13:26:23.502982
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL
COPY: 10        8669af9ca762b99e307f9ba2de6d77d2        2021-02-03 13:26:23.502988


*** End of File Encountered. Last Block Read: 0 ***

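The COPY: lines now show the actual data stored in the table.

That alone is not enough, though. If the table contains dead tuples, the dump shows them as well, and we cannot tell which rows are the current ones. For example, update one row: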
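
To load the decoded rows elsewhere, the COPY: lines can be filtered out of the dump. This is a minimal sketch that assumes the fields are separated by tabs (the listing here shows them as runs of whitespace) and that no field contains a tab or newline. The function name copy_rows is made up for illustration.

```python
def copy_rows(dump_text, sep="\t"):
    """Collect the fields of every 'COPY: ' line in pg_filedump -D output."""
    prefix = "COPY: "
    rows = []
    for line in dump_text.splitlines():
        if line.startswith(prefix):
            rows.append(line[len(prefix):].split(sep))
    return rows

# Two items from the dump, with tabs assumed between the fields.
sample = (
    " Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL\n"
    "COPY: 1\t9af995e95f321e3521fcb6d41208af40\t2021-02-03 13:26:23.502727\n"
    " Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL\n"
    "COPY: 2\t2544880c1a22986487e563a6c89f377b\t2021-02-03 13:26:23.502932\n"
)
for row in copy_rows(sample):
    print(row)
```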

bill@bill=>update t_dump set info = 'bill' where id = 1;
UPDATE 1

bill@bill=>checkpoint;
CHECKPOINT

bill@bill=>select * from t_dump;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  2 | 2544880c1a22986487e563a6c89f377b | 2021-02-03 13:26:23.502932
  3 | de61423aaf82b8a1bbb49dc3d7809863 | 2021-02-03 13:26:23.502945
  4 | 398af8893872a1860e08ac424ecce885 | 2021-02-03 13:26:23.502951
  5 | 374e4a32688ec70a46fae44fda9e4ed8 | 2021-02-03 13:26:23.502958
  6 | bc8911e89c5be9329abf29cf68f5b4ce | 2021-02-03 13:26:23.502963
  7 | 136bfa992d70eb33b3cdd9e53376261b | 2021-02-03 13:26:23.50297
  8 | ffaa31f1a5ae272727c53ba37dd77706 | 2021-02-03 13:26:23.502975
  9 | e6fe762a144d51e15ea6bdefe2362242 | 2021-02-03 13:26:23.502982
 10 | 8669af9ca762b99e307f9ba2de6d77d2 | 2021-02-03 13:26:23.502988
  1 | bill                             | 2021-02-03 13:26:23.502727
(10 rows)

Dump the file again:

pg13@cnndr4pptliot-> pg_filedump -D int,text,timestamp  25316

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: -D int,text,timestamp
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      68 (0x0044)
 Block: Size 8192  Version    4            Upper    7424 (0x1d00)
 LSN:  logid      0 recoff 0x7595db70      Special  8192 (0x2000)
 Items:   11                      Free Space: 7356
 Checksum: 0x0000  Prune XID: 0x000008d0  Flags: 0x0000 ()
 Length (including item array): 68

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
COPY: 2 2544880c1a22986487e563a6c89f377b        2021-02-03 13:26:23.502932
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
COPY: 3 de61423aaf82b8a1bbb49dc3d7809863        2021-02-03 13:26:23.502945
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
COPY: 4 398af8893872a1860e08ac424ecce885        2021-02-03 13:26:23.502951
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
COPY: 5 374e4a32688ec70a46fae44fda9e4ed8        2021-02-03 13:26:23.502958
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
COPY: 6 bc8911e89c5be9329abf29cf68f5b4ce        2021-02-03 13:26:23.502963
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
COPY: 7 136bfa992d70eb33b3cdd9e53376261b        2021-02-03 13:26:23.502970
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
COPY: 8 ffaa31f1a5ae272727c53ba37dd77706        2021-02-03 13:26:23.502975
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
COPY: 9 e6fe762a144d51e15ea6bdefe2362242        2021-02-03 13:26:23.502982
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL
COPY: 10        8669af9ca762b99e307f9ba2de6d77d2        2021-02-03 13:26:23.502988
 Item  11 -- Length:   48  Offset: 7424 (0x1d00)  Flags: NORMAL
COPY: 1 bill    2021-02-03 13:26:23.502727

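The dump contains 11 records, but the table actually has only 10 rows. The extra record is a dead tuple: the old version of the updated row. To find out which tuples are dead, we need more detailed output, which the -i and -f options provide.

pg13@cnndr4pptliot-> pg_filedump -D int,text,timestamp  -i -f 25316|less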

...

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
  XMIN: 2255  XMAX: 2256  CID|XVAC: 0
  Block Id: 0  linp Index: 11   Attributes: 3   Size: 24
  infomask: 0x0502 (HASVARWIDTH|XMIN_COMMITTED|XMAX_COMMITTED|HOT_UPDATED)

  1fb8: cf080000 d0080000 00000000 00000000  ................
  1fc8: 0b000340 02051800 01000000 43396166  ...@........C9af
  1fd8: 39393565 39356633 32316533 35323166  995e95f321e3521f
  1fe8: 63623664 34313230 38616634 30000000  cb6d41208af40...
  1ff8: 87a9524d 6d5d0200                    ..RMm]..

COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727

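Item 1 has XMAX 2256, which means a later transaction updated this row version. The XMAX_COMMITTED and HOT_UPDATED flags in its infomask show that the update committed, so Item 1 is a dead tuple. Its linp Index of 11 points at Item 11, the new version of the row.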
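
The same check can be applied to every item in the -i output. The sketch below is a heuristic, not a full visibility check: it treats an item as dead when XMAX is non-zero and the infomask lists XMAX_COMMITTED. That flag is only a hint bit, so a tuple reported as live here may still be dead. The function name classify_items is made up, and Item 2 in the sample is invented for illustration rather than taken from the dump in this article.

```python
import re

ITEM_RE = re.compile(r"^\s*Item\s+(\d+)\s+--")
XID_RE = re.compile(r"XMIN:\s*(\d+)\s+XMAX:\s*(\d+)")
MASK_RE = re.compile(r"infomask:\s*0x[0-9a-fA-F]+\s*\(([^)]*)\)")

def classify_items(dump_text):
    """Map item number -> 'dead' or 'live' from XMAX and the infomask flags."""
    result = {}
    item, xmax = None, 0
    for line in dump_text.splitlines():
        m = ITEM_RE.match(line)
        if m:
            item, xmax = int(m.group(1)), 0
            continue
        m = XID_RE.search(line)
        if m:
            xmax = int(m.group(2))
            continue
        m = MASK_RE.search(line)
        if m and item is not None:
            flags = set(m.group(1).split("|"))
            dead = xmax != 0 and "XMAX_COMMITTED" in flags
            result[item] = "dead" if dead else "live"
    return result

# Item 1 is copied from the dump; Item 2 is a made-up live tuple (assumption).
SAMPLE = """\
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
  XMIN: 2255  XMAX: 2256  CID|XVAC: 0
  Block Id: 0  linp Index: 11   Attributes: 3   Size: 24
  infomask: 0x0502 (HASVARWIDTH|XMIN_COMMITTED|XMAX_COMMITTED|HOT_UPDATED)
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
  XMIN: 2255  XMAX: 0  CID|XVAC: 0
  Block Id: 0  linp Index: 2   Attributes: 3   Size: 24
  infomask: 0x0902 (HASVARWIDTH|XMIN_COMMITTED|XMAX_INVALID)
"""
print(classify_items(SAMPLE))
```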

For the internal structure of a data block, see:
https://www.postgresql.org/docs/13/storage-page-layout.html
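
As a worked example of that layout, the 72 bytes of Item 1 from the hex dump can be decoded by hand. This sketch assumes a little-endian machine, the heap tuple header field order described in the page layout documentation, and this table's exact column list (int, a short text value with a 1-byte length header, timestamp, no NULLs). It is not a general decoder, and the name decode_item is made up for illustration.

```python
import struct
from datetime import datetime, timedelta

# Bytes of Item 1, copied from the hex dump (page offsets 0x1fb8 to 0x1fff).
RAW = bytes.fromhex(
    "cf080000d00800000000000000000000"
    "0b000340020518000100000043396166"
    "39393565393566333231653335323166"
    "63623664343132303861663430000000"
    "87a9524d6d5d0200"
)

PG_EPOCH = datetime(2000, 1, 1)  # PostgreSQL timestamps count from this date

def decode_item(raw):
    # Header: xmin, xmax, cid/xvac, ctid (block hi, block lo, offset),
    # infomask2, infomask, header length.
    (xmin, xmax, _cid, blk_hi, blk_lo, ctid_off,
     infomask2, infomask, hoff) = struct.unpack_from("<IIIHHHHHB", raw, 0)
    # Column 1: 4-byte int right after the header.
    (row_id,) = struct.unpack_from("<i", raw, hoff)
    # Column 2: short text; the header byte >> 1 is the length including itself.
    text_len = raw[hoff + 4] >> 1
    info = raw[hoff + 5: hoff + 4 + text_len].decode("ascii")
    # Column 3: 8-byte timestamp in microseconds, stored in the last 8 bytes.
    (usec,) = struct.unpack_from("<q", raw, len(raw) - 8)
    return {
        "xmin": xmin,
        "xmax": xmax,
        "ctid": ((blk_hi << 16) | blk_lo, ctid_off),
        "natts": infomask2 & 0x07FF,
        "hot_updated": bool(infomask2 & 0x4000),
        "xmax_committed": bool(infomask & 0x0400),
        "hoff": hoff,
        "id": row_id,
        "info": info,
        "crt_time": PG_EPOCH + timedelta(microseconds=usec),
    }

print(decode_item(RAW))
```

The decoded values agree with what pg_filedump printed for Item 1: XMIN 2255, XMAX 2256, a ctid pointing at item 11, 3 attributes, a 24-byte header, and the row 1, 9af995e95f321e3521fcb6d41208af40, 2021-02-03 13:26:23.502727.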

Related PostgreSQL source header files:

#include "access/gin_private.h"  
#include "access/gist.h"  
#include "access/hash.h"  
#include "access/htup.h"  
#include "access/htup_details.h"  
#include "access/itup.h"  
#include "access/nbtree.h"  
#include "access/spgist_private.h"  
#include "catalog/pg_control.h"  
#include "storage/bufpage.h"