KingbaseES R6 cluster failover test case: standby with priority 0
阿新 • Published: 2022-01-25
KingbaseES, repmgr, PostgreSQL
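Case description:
In a one-primary, multi-standby architecture, one standby must be configured so that it cannot be elected as the new primary during a failover. repmgr chooses the new primary as follows: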
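Tips:
repmgr elects the candidate standby by comparing, in this order: LSN ----> Priority ----> Node_ID.
The standby with the highest LSN is chosen first. If the LSNs are equal, the Priority values are compared; this priority is set as a parameter in the configuration file. If the priorities are also equal, the Node IDs are compared and the smaller one wins. During the election, a standby with a higher weight is preferred for promotion; a standby whose weight is 0 is never selected as an automatic promotion candidate.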
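The ordering described above can be sketched in a few lines of Python. This is only an illustration of the comparison rule, not repmgr's actual implementation: the `Standby` class, `parse_lsn` and `pick_candidate` are hypothetical names, and the LSN values in the usage note are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Standby:
    """Hypothetical record for one standby taking part in the election."""
    node_id: int
    name: str
    lsn: str        # textual LSN such as "0/3000060"
    priority: int   # value of priority in repmgr.conf


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN ("hi/lo", both hexadecimal) to an integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def pick_candidate(standbys: Iterable[Standby]) -> Optional[Standby]:
    """Apply the order LSN -> Priority -> Node_ID.

    Standbys with priority 0 are excluded before any comparison.
    Returns None when no standby is eligible.
    """
    eligible = [s for s in standbys if s.priority > 0]
    if not eligible:
        return None
    # Highest LSN first, then highest priority, then smallest node ID.
    return min(eligible, key=lambda s: (-parse_lsn(s.lsn), -s.priority, s.node_id))
```

With a single standby whose priority is 0, as in the test case below, `pick_candidate([Standby(3, "node243", "0/3000060", 0)])` returns `None`, which corresponds to the cluster staying without a new primary.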
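In a repmgr cluster, the primary and every standby have a default priority of 100.
The priority parameter in repmgr.conf:
priority=60
Weight: when a new primary is elected, a standby with a higher weight is preferred for promotion; if the weight is 0, the standby is never promoted automatically.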
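To check which value a node will actually use, the setting can be read back from the configuration text. The helper below is a simplified sketch, not repmgr's own parser: `read_priority` is a hypothetical name, and it assumes plain `key=value` lines, `#` comments, and that the last occurrence of the key wins. The fallback of 100 is the default stated above.

```python
def read_priority(conf_text: str, default: int = 100) -> int:
    """Return the priority set in repmgr.conf text, or the default.

    Assumptions (not taken from repmgr itself): one key=value pair per
    line, '#' starts a comment, and the last occurrence of the key wins.
    """
    value = default
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "priority":
            value = int(val.strip().strip("'\""))  # tolerate quoted values
    return value
```

For example, `read_priority("on_bmj=off\npriority=0\n")` gives 0, while a file in which the line is commented out gives 100.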
Test case: modifying the priority parameter
1. Cluster node status before the switchover
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 10 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node243 | standby | running | node248 | default | 0 | 10 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2. Modify the priority parameter in repmgr.conf
[kingbase@node1 etc]$ vi repmgr.conf
on_bmj=off
.......
priority=0
Re-register the standby:
[kingbase@node3 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node243" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node243" (ID: 3) successfully registered
In the cluster node status, the standby's priority is now shown as 0.
3. Reboot the primary host to test failover
[root@node1 ~]#reboot
4. Check the cluster node status on the standby
=== As shown below, no switchover took place; the primary is only reported as unreachable ===
[kingbase@node3 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+---------------+-----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | ? unreachable | | default | 100 | ? | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node243 | standby | running | ? node248 | default | 0 | 10 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
WARNING: following issues were detected
- unable to connect to node "node248" (ID: 1)
- node "node248" (ID: 1) is registered as an active primary but is unreachable
- unable to connect to node "node243" (ID: 3)'s upstream node "node248" (ID: 1)
- unable to determine if node "node243" (ID: 3) is attached to its upstream node "node248" (ID: 1)
5. After the primary host has rebooted, start the database service
[kingbase@node1 bin]$ ./sys_ctl start -D ../data
waiting for server to start....2021-03-01 12:03:41.870 CST [5257] LOG: sepapower extension initialized
2021-03-01 12:03:41.893 CST [5257] LOG: starting KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2021-03-01 12:03:41.893 CST [5257] LOG: listening on IPv4 address "0.0.0.0", port 54321
2021-03-01 12:03:41.893 CST [5257] LOG: listening on IPv6 address "::", port 54321
2021-03-01 12:03:42.030 CST [5257] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54321"
2021-03-01 12:03:42.454 CST [5257] LOG: redirecting log output to logging collector process
2021-03-01 12:03:42.454 CST [5257] HINT: Future log output will appear in directory "sys_log".
. done
server started
6. The cluster node status returns to normal
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 10 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node243 | standby | running | node248 | default | 0 | 10 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
7. Check repmgr.log on the standby
=== The log reports that the standby has priority=0 and cannot be promoted to primary ===
[2021-03-01 12:43:30] [NOTICE] starting monitoring of node "node243" (ID: 3)
[2021-03-01 12:43:30] [INFO] "connection_check_type" set to "ping"
[2021-03-01 12:43:30] [INFO] monitoring connection to upstream node "node248" (ID: 1)
[2021-03-01 12:43:30] [NOTICE] wal catched_up state changed to 1
[2021-03-01 12:45:00] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3"
[2021-03-01 12:45:00] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:45:00] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:45:00] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:05] [INFO] checking state of node 1, 1 of 3 attempts
[2021-03-01 12:45:15] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:15] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:15] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:20] [INFO] checking state of node 1, 2 of 3 attempts
[2021-03-01 12:45:30] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:30] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:30] [INFO] sleeping 5 seconds until next reconnection attempt
[2021-03-01 12:45:35] [INFO] checking state of node 1, 3 of 3 attempts
[2021-03-01 12:45:45] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr"
[2021-03-01 12:45:45] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:45:45] [WARNING] unable to reconnect to node 1 after 3 attempts
[2021-03-01 12:45:45] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:45:45] [INFO] sleeping 5 seconds
[2021-03-01 12:45:50] [NOTICE] killing WAL receiver with PID 14273
[2021-03-01 12:45:51] [INFO] WAL receiver with pid 14273 killed
[2021-03-01 12:45:52] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:45:52] [INFO] WAL receiver disconnected on all 0 sibling nodes
[2021-03-01 12:45:52] [NOTICE] this node's priority is 0 so will not be considered as an automatic promotion candidate
[2021-03-01 12:45:52] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:45:52] [INFO] follower node awaiting notification from a candidate node
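In a long repmgr.log the decisive line is easy to miss. The following sketch scans log lines for the priority-0 notice shown above. It assumes the `[timestamp] [LEVEL] message` layout seen in this log; `priority_zero_notices` is a hypothetical helper name, and the match on the phrase "priority is 0" relies only on the wording of the notice in this excerpt.

```python
import re

# Layout assumed from the log excerpt: [timestamp] [LEVEL] message
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$")


def priority_zero_notices(lines):
    """Return (timestamp, message) pairs for NOTICE lines about priority 0."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m["level"] == "NOTICE" and "priority is 0" in m["msg"]:
            hits.append((m["ts"], m["msg"]))
    return hits
```

It could be used as `priority_zero_notices(open("repmgr.log", encoding="utf-8"))`, where the log path is a placeholder for the actual location of the file on the standby.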
8. Restore the standby's priority to the default
1) Comment out or delete the priority setting
2) Re-register the standby
[kingbase@node3 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node243" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node243" (ID: 3) successfully registered
3) Check the cluster node status
[kingbase@node3 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 10 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node243 | standby | running | node248 | default | 100 | 10 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
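The Priority column of `repmgr cluster show` can also be checked by a script instead of by eye. The sketch below parses the pipe-delimited table into dictionaries. It assumes the column names shown in the output above; `parse_cluster_show` and `non_promotable` are hypothetical helper names, not part of repmgr.

```python
def parse_cluster_show(text):
    """Parse the table printed by `repmgr cluster show` into a list of dicts.

    Lines without '|' (shell prompt, separator row, warnings) are skipped;
    the first line containing '|' is treated as the header.
    """
    rows, header = [], None
    for line in text.splitlines():
        if "|" not in line:
            continue
        cells = [c.strip() for c in line.split("|")]
        if header is None:
            header = cells
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def non_promotable(rows):
    """Names of standbys whose Priority column is 0."""
    return [r["Name"] for r in rows
            if r["Role"] == "standby" and r["Priority"] == "0"]
```

Run against the output of step 4, `non_promotable` would list node243; against the output above, after the priority was restored to 100, it would return an empty list.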