KingbaseES R6 Cluster Failover Test Case with priority Set to 0

阿新 • Published: 2022-01-25

KingbaseES, repmgr, PostgreSQL

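Case description:

In a one-primary, multi-standby architecture, one standby must be configured so that it can never be elected primary during a primary/standby switchover. repmgr chooses the new primary as follows: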

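Tips:
repmgr evaluates candidate standby nodes in this order: LSN ----> Priority ----> Node_ID.
The standby with the largest LSN is chosen as the candidate first. If the LSNs are equal, the nodes are compared by priority, which is set as a parameter in the configuration file. If the priorities are also equal, the node IDs are compared and the smaller ID wins. During primary election, a standby with a higher weight has higher priority for promotion; if the weight is 0, that standby is never promoted to primary.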
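The ordering described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (LSN, then priority, then node ID, with priority 0 excluded), not repmgr's actual implementation. The `Standby` class, the integer LSN representation, and the node names `node_a` and `node_b` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Standby:
    node_id: int
    name: str
    lsn: int       # last received WAL position as an integer (hypothetical representation)
    priority: int  # value of "priority" in repmgr.conf


def pick_candidate(standbys: List[Standby]) -> Optional[Standby]:
    """Return the standby that wins under the ordering described in the text."""
    # A standby with priority 0 is never a promotion candidate.
    eligible = [s for s in standbys if s.priority > 0]
    if not eligible:
        return None
    # Largest LSN first, then highest priority, then smallest node ID.
    return min(eligible, key=lambda s: (-s.lsn, -s.priority, s.node_id))


nodes = [
    Standby(node_id=2, name="node_a", lsn=0x3000060, priority=100),
    Standby(node_id=3, name="node243", lsn=0x3000060, priority=0),
    Standby(node_id=4, name="node_b", lsn=0x3000060, priority=100),
]
winner = pick_candidate(nodes)
print(winner.name if winner else "no eligible candidate")
```

With equal LSNs and equal priorities, the tie falls through to the node ID, so `node_a` is chosen here. If `node243` were the only standby, the function would return `None`, which matches the behaviour tested in this article.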

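In a repmgr cluster, the default priority of both the primary and the standbys is 100.

The priority parameter in repmgr.conf:
priority=60

This is the weight. During primary election, a standby with a higher weight has higher priority for promotion; if the weight is 0, that standby is never promoted to primary.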
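To check which priority a given configuration file yields, a small helper like the one below can be used. It is a minimal sketch: the default of 100 comes from the text above, while the parsing rules (`#` starts a comment, optional single quotes around the value, last assignment wins) are assumptions about the file format rather than repmgr's own parser.

```python
import re

DEFAULT_PRIORITY = 100  # default stated in the article


def effective_priority(conf_text: str) -> int:
    """Return the priority found in repmgr.conf-style text, or the default."""
    value = DEFAULT_PRIORITY
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        m = re.match(r"priority\s*=\s*'?(\d+)'?$", line)
        if m:
            value = int(m.group(1))  # assumption: the last assignment wins
    return value


print(effective_priority("on_bmj=off\npriority=0\n"))
print(effective_priority("on_bmj=off\n#priority=0\n"))
```

The first call reports 0 and the second reports 100, because a commented-out setting falls back to the default.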

Modify the priority parameter:

Test case:

1. Cluster node status before the switchover

[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248 | primary | * running |          | default  | 100      | 10       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 3  | node243 | standby |   running | node248  | default  | 0        | 10       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2. Modify the priority parameter in repmgr.conf

[kingbase@node1 etc]$ vi repmgr.conf
on_bmj=off
.......
priority=0

Re-register the standby:

[kingbase@node3 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node243" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node243" (ID: 3) successfully registered

In the cluster node status, the standby's priority is now shown as 0.

3. Reboot the primary host to test failover
[root@node1 ~]# reboot
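The check can also be scripted by parsing the table printed by `repmgr cluster show`. The sketch below assumes the pipe-delimited layout seen in this article and reads the text from a string. The helper name `parse_cluster_show` is hypothetical, and the connection strings in the sample are shortened for brevity.

```python
from typing import Dict, List


def parse_cluster_show(output: str) -> List[Dict[str, str]]:
    """Turn the pipe-delimited table into one dict per node."""
    rows: List[Dict[str, str]] = []
    header = None
    for line in output.splitlines():
        if "|" not in line:
            continue  # skips the "----+----" rule, warnings and blank lines
        cells = [c.strip() for c in line.split("|")]
        if header is None:
            header = cells  # first table line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


sample = """\
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+-------------------
 1  | node248 | primary | * running |          | default  | 100      | 10       | host=192.168.7.248 port=54321
 3  | node243 | standby |   running | node248  | default  | 0        | 10       | host=192.168.7.243 port=54321
"""
rows = parse_cluster_show(sample)
non_candidates = [r["Name"] for r in rows
                  if r["Role"] == "standby" and r["Priority"] == "0"]
print(non_candidates)
```

For the sample above, the only standby listed with priority 0 is `node243`.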

4. Check the cluster node status on the standby

=== As shown below, no switchover occurred; the primary is simply reported as unreachable ===

[kingbase@node3 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status        | Upstream  | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+---------------+-----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248 | primary | ? unreachable |           | default  | 100      | ?        | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 3  | node243 | standby |   running     | ? node248 | default  | 0        | 10       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

WARNING: following issues were detected
  - unable to connect to node "node248" (ID: 1)
  - node "node248" (ID: 1) is registered as an active primary but is unreachable
  - unable to connect to node "node243" (ID: 3)'s upstream node "node248" (ID: 1)
  - unable to determine if node "node243" (ID: 3) is attached to its upstream node "node248" (ID: 1)

5. After the primary host has rebooted, start the database service

[kingbase@node1 bin]$ ./sys_ctl start -D ../data
waiting for server to start....2021-03-01 12:03:41.870 CST [5257] LOG:  sepapower extension initialized
2021-03-01 12:03:41.893 CST [5257] LOG:  starting KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2021-03-01 12:03:41.893 CST [5257] LOG:  listening on IPv4 address "0.0.0.0", port 54321
2021-03-01 12:03:41.893 CST [5257] LOG:  listening on IPv6 address "::", port 54321
2021-03-01 12:03:42.030 CST [5257] LOG:  listening on Unix socket "/tmp/.s.KINGBASE.54321"
2021-03-01 12:03:42.454 CST [5257] LOG:  redirecting log output to logging collector process
2021-03-01 12:03:42.454 CST [5257] HINT:  Future log output will appear in directory "sys_log".
. done
server started

6. Cluster node status returns to normal

[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248 | primary | * running |          | default  | 100      | 10       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 3  | node243 | standby |   running | node248  | default  | 0        | 10       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

7. Check the repmgr.log on the standby

=== The log reports that the standby has priority=0 and cannot be promoted to primary ===

[2021-03-01 12:43:30] [NOTICE] starting monitoring of node "node243" (ID: 3)
[2021-03-01 12:43:30] [INFO] "connection_check_type" set to "ping"
[2021-03-01 12:43:30] [INFO] monitoring connection to upstream node "node248" (ID: 1)
[2021-03-01 12:43:30] [NOTICE] wal catched_up state changed to 1
[2021-03-01 12:45:00] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3"
[2021-03-01 12:45:00] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:45:00] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:45:00] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:05] [INFO] checking state of node 1, 1 of 3 attempts
[2021-03-01 12:45:15] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:15] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:15] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:20] [INFO] checking state of node 1, 2 of 3 attempts
[2021-03-01 12:45:30] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:30] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:30] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:35] [INFO] checking state of node 1, 3 of 3 attempts
[2021-03-01 12:45:45] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:45] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:45] [WARNING] unable to reconnect to node 1 after 3 attempts
[2021-03-01 12:45:45] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:45:45] [INFO] sleeping 5 seconds
[2021-03-01 12:45:50] [NOTICE] killing WAL receiver with PID 14273
[2021-03-01 12:45:51] [INFO] WAL receiver with pid 14273 killed
[2021-03-01 12:45:52] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:45:52] [INFO] WAL receiver disconnected on all 0 sibling nodes
[2021-03-01 12:45:52] [NOTICE] this node's priority is 0 so will not be considered as an automatic promotion candidate
[2021-03-01 12:45:52] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:45:52] [INFO] follower node awaiting notification from a candidate node
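On a busy node the relevant notice is easy to miss, so it can help to filter the log for it. The following is a minimal sketch that assumes the `[timestamp] [LEVEL] message` layout shown above and works on log text already read into a string; the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Matches lines such as: [2021-03-01 12:45:52] [NOTICE] message text
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$")


def find_priority_zero_notices(log_text: str) -> List[Tuple[str, str]]:
    """Return (timestamp, message) pairs for 'priority is 0' log entries."""
    hits = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        if m and "priority is 0" in m.group("msg"):
            hits.append((m.group("ts"), m.group("msg")))
    return hits


log_sample = """\
[2021-03-01 12:45:52] [INFO] WAL receiver disconnected on all 0 sibling nodes
[2021-03-01 12:45:52] [NOTICE] this node's priority is 0 so will not be considered as an automatic promotion candidate
[2021-03-01 12:45:52] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:45:52] [INFO] follower node awaiting notification from a candidate node
"""
for ts, msg in find_priority_zero_notices(log_sample):
    print(ts, msg)
```

Run against the excerpt above, this prints the single NOTICE line logged at 12:45:52.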

8. Restore the standby's priority to the default

1) Comment out or delete the priority setting

2) Re-register the standby

[kingbase@node3 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node243" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node243" (ID: 3) successfully registered

3) Check the cluster node status

[kingbase@node3 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248 | primary | * running |          | default  | 100      | 10       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 3  | node243 | standby |   running | node248  | default  | 100      | 10       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
