
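postgresql/lightdb logical backup and restore best practices

  lt_dump exports DDL and data separately. Data can also be exported as INSERT statements, but the default is COPY mode, which gives the best performance and the highest compression ratio. Both serial and parallel export are supported; parallel export is concurrent at the object level, so with a LightDB deployment it is very fast and can use all available resources.

  The options are as follows: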

[zjh@hs-10-20-30-193 lt_dump_data_only]$ lt_dump --help
lt_dump dumps a database as a text file or to other formats.

Usage:
  lt_dump [OPTION]... [DBNAME]

General options:
  -f, --file=FILENAME          output file or directory name
  -F, --format=c|d|t|p         output file format (custom, directory, tar,
                               plain text (default))
  -j, --jobs=NUM               use this many parallel jobs to dump
  -v, --verbose                verbose mode
  -V, --version                output version information, then exit
  -Z, --compress=0-9           compression level for compressed formats
  --lock-wait-timeout=TIMEOUT  fail after waiting TIMEOUT for a table lock
  --no-sync                    do not wait for changes to be written safely to disk
  -?, --help                   show this help, then exit

Options controlling the output content:
  -a, --data-only              dump only the data, not the schema
  -b, --blobs                  include large objects in dump
  -B, --no-blobs               exclude large objects in dump
  -c, --clean                  clean (drop) database objects before recreating
  -C, --create                 include commands to create database in dump
  -E, --encoding=ENCODING      dump the data in encoding ENCODING
  -n, --schema=PATTERN         dump the specified schema(s) only
  -N, --exclude-schema=PATTERN do NOT dump the specified schema(s)
  -O, --no-owner               skip restoration of object ownership in
                               plain-text format
  -s, --schema-only            dump only the schema, no data
  -S, --superuser=NAME         superuser user name to use in plain-text format
  -t, --table=PATTERN          dump the specified table(s) only
  -T, --exclude-table=PATTERN  do NOT dump the specified table(s)
  -x, --no-privileges          do not dump privileges (grant/revoke)
  --binary-upgrade             for use by upgrade utilities only
  --column-inserts             dump data as INSERT commands with column names
  --disable-dollar-quoting     disable dollar quoting, use SQL standard quoting
  --disable-triggers           disable triggers during data-only restore
  --enable-row-security        enable row security (dump only content user has
                               access to)
  --exclude-table-data=PATTERN do NOT dump data for the specified table(s)
  --extra-float-digits=NUM     override default setting for extra_float_digits
  --if-exists                  use IF EXISTS when dropping objects
  --include-foreign-data=PATTERN
                               include data of foreign tables on foreign
                               servers matching PATTERN
  --inserts                    dump data as INSERT commands, rather than COPY
  --load-via-partition-root    load partitions via the root table
  --no-comments                do not dump comments
  --no-publications            do not dump publications
  --no-security-labels         do not dump security label assignments
  --no-subscriptions           do not dump subscriptions
  --no-synchronized-snapshots  do not use synchronized snapshots in parallel jobs
  --no-tablespaces             do not dump tablespace assignments
  --no-unlogged-table-data     do not dump unlogged table data
  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands
  --quote-all-identifiers      quote all identifiers, even if not key words
  --rows-per-insert=NROWS      number of rows per INSERT; implies --inserts
  --section=SECTION            dump named section (pre-data, data, or post-data)
  --serializable-deferrable    wait until the dump can run without anomalies
  --snapshot=SNAPSHOT          use given snapshot for the dump
  --strict-names               require table and/or schema include patterns to
                               match at least one entity each
  --use-set-session-authorization
                               use SET SESSION AUTHORIZATION commands instead of
                               ALTER OWNER commands to set ownership

Connection options:
  -d, --dbname=DBNAME      database to dump
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before dump

If no database name is supplied, then the PGDATABASE environment
variable value is used.

Report bugs to <https://github.com/hslightdb>.
LightDB home page: <https://www.hs.net/lightdb>

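  By default, lt_dump exports the definitions and data of all objects (including subscriptions, publications, unlogged tables, RLS, tablespaces, and so on). The default plain-text format is not compressed; the custom and directory formats are gzip-compressed by default, and on highly repetitive data the compression ratio can reach 50:1.

  In parallel mode, the worker processes share a synchronized snapshot by default, which is what the --no-synchronized-snapshots option in the help text turns off. Only when that option is given does each worker export under its own snapshot, and the dump is then not globally consistent. Even in that case the problem is usually minor, since little business traffic runs during a backup.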

General usage

Exporting DDL and data together

lt_dump -p25432 -f lt_dump_data.dat --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

lt_dump -p25432 -j 8 -F directory -f lt_dump_data --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

 

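Parallelism is per object, so if one table is extremely large, parallel export gains little: the job working on that table determines the total time.

The exported file also consists of three parts:

CREATE TABLE statements

COPY ... FROM stdin; (the data)

Creation of indexes, foreign keys, primary keys, and so on.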

Exporting DDL and data separately

  lt_dump -p25432 -s -f lt_dump_schema_only_data.dat --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

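  -- The database usually contains other objects as well, plus some system extensions, so using the -C and -c options is not recommended: the database should not be dropped and recreated, otherwise third-party extensions are easily lost. Because table structures may have changed after lt_dump ran, the tables should be recreated at import time. Open-source PG does not support table-level recreate (LightDB will support a --recreate-table option in version 22.2, so that the exported DDL includes drop table if exists statements).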

  lt_dump -p25432 -j 8 -a -F directory -f lt_dump_data_only --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

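  -- Exports the data itself in parallel. In parallel mode the format must be directory, because the unit of parallelism is the file (it is effectively a set of parallel COPY TO commands). Each data file is gzip-compressed by default: if the build (which can be checked with lt_config) included the LIBZ macro, the custom and directory formats use zlib's default compression level 6.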
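  Since the jobs run at file granularity, the value passed to -j is usually chosen from the number of CPUs available. The following is a minimal sketch of that choice, not part of lt_dump itself: pick_jobs is a hypothetical helper introduced here, the cap of 8 simply mirrors the -j 8 used above, and the final line only prints the resulting command instead of running it.

```shell
#!/bin/sh
# pick_jobs (hypothetical helper): print min(online CPUs, cap), at least 1.
pick_jobs() {
    max="${1:-8}"
    # getconf may be missing on minimal systems; fall back to 1 CPU.
    cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    # Guard against empty or non-numeric output.
    case "$cpus" in
        ''|*[!0-9]*) cpus=1 ;;
    esac
    if [ "$cpus" -gt "$max" ]; then
        echo "$max"
    else
        echo "$cpus"
    fi
}

# Print (do not execute) the data-only export with the chosen job count.
echo lt_dump -p25432 -j "$(pick_jobs 8)" -a -F directory -f lt_dump_data_only \
    --no-publications --no-subscriptions --no-unlogged-table-data -n public postgres
```

  Using more jobs than there are tables in the schema brings no benefit, so on small schemas a lower cap is reasonable.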

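Efficient usage (split into base table structure and objects, data, and post-data DDL)

  The efficient and transparent approach is to split the lt_dump export into three scripts: DDL, data, and post-data DDL.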

  lt_dump -p25432 --section=pre-data -s -f lt_dump_schema_only_predata.dat --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

  lt_dump -p25432 --section=data -j 8 -a -F directory -f lt_dump_data --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public

  lt_dump -p25432 --section=post-data -s -f lt_dump_schema_only_postdata.dat --no-publications --no-subscriptions --no-unlogged-table-data postgres -n public
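  The three exports can be wrapped in one script so they always run in the same order. This is a sketch under stated assumptions: the LT_* variables, the run helper, and the dry-run switch are hypothetical names introduced here, and by default the script only prints the commands rather than running lt_dump.

```shell
#!/bin/sh
# Hypothetical settings; override them through the environment.
LT_PORT="${LT_PORT:-25432}"
LT_DB="${LT_DB:-postgres}"
LT_SCHEMA="${LT_SCHEMA:-public}"
LT_JOBS="${LT_JOBS:-8}"
LT_OUT="${LT_OUT:-.}"
LT_DRY_RUN="${LT_DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Options shared by all three exports (intentionally word-split below).
COMMON="--no-publications --no-subscriptions --no-unlogged-table-data"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$LT_DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Export pre-data DDL, then data in parallel, then post-data DDL.
# The chain stops at the first failing step.
dump_all_sections() {
    run lt_dump -p "$LT_PORT" --section=pre-data -s \
        -f "$LT_OUT/lt_dump_schema_only_predata.dat" $COMMON -n "$LT_SCHEMA" "$LT_DB" &&
    run lt_dump -p "$LT_PORT" --section=data -j "$LT_JOBS" -a -F directory \
        -f "$LT_OUT/lt_dump_data" $COMMON -n "$LT_SCHEMA" "$LT_DB" &&
    run lt_dump -p "$LT_PORT" --section=post-data -s \
        -f "$LT_OUT/lt_dump_schema_only_postdata.dat" $COMMON -n "$LT_SCHEMA" "$LT_DB"
}

dump_all_sections
```

  Running it with LT_DRY_RUN=0 executes the commands; only the data section uses -j, because the two DDL sections are written as single plain-text files.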

Data import

  Performance tuning before import:

work_mem = 32MB
shared_buffers = 4GB
maintenance_work_mem = 2GB
full_page_writes = off
autovacuum = off
wal_buffers = -1
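
  lt_restore -v -p25432 -d postgres -s -e --no-publications --no-subscriptions --disable-triggers -n public lt_dump_schema_only_predata.dat

  lt_restore -v -p25432 -d postgres -e -j 8 -a --no-publications --no-subscriptions -n public lt_dump_data

  lt_restore -v -p25432 -d postgres -s -e --no-publications --no-subscriptions --disable-triggers -n public lt_dump_schema_only_postdata.dat

  -- Assuming lt_restore follows pg_restore: the input archive is the last argument (-f names an output file), -d selects the target database, -l only lists the archive's table of contents without restoring anything, and -s cannot be combined with -a. lt_restore reads only archive formats (custom, directory, tar), so the pre-data and post-data dumps must be exported with -F c for the first and third commands to work; plain-text dumps are loaded with the SQL client instead.

  Reference: https://www.postgresql.org/docs/current/performance-tips.html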
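  An alternative way to get the same three-phase import is to restore one directory-format archive in three passes. The sketch below assumes that lt_restore accepts the same --section, -d, and -j options as pg_restore; lt_dump_all is a hypothetical archive produced by a full lt_dump -F directory export, and the LT_* variables and helper functions are names introduced here. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical settings; override them through the environment.
LT_PORT="${LT_PORT:-25432}"
LT_DB="${LT_DB:-postgres}"
LT_JOBS="${LT_JOBS:-8}"
LT_ARCHIVE="${LT_ARCHIVE:-lt_dump_all}"   # directory-format dump (assumed)
LT_DRY_RUN="${LT_DRY_RUN:-1}"             # 1 = print commands only

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$LT_DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_section SECTION JOBS: restore one section of the archive.
restore_section() {
    run lt_restore -v -e -p "$LT_PORT" -d "$LT_DB" -j "$2" --section="$1" \
        --no-publications --no-subscriptions -n public "$LT_ARCHIVE"
}

# Tables first, then data in parallel, then indexes and constraints
# in parallel. The chain stops at the first failing phase.
restore_all_sections() {
    restore_section pre-data 1 &&
    restore_section data "$LT_JOBS" &&
    restore_section post-data "$LT_JOBS"
}

restore_all_sections
```

  Loading the data before the post-data section means the rows are copied into tables that have no indexes or foreign keys yet, which is the main reason the split import is faster.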