KingbaseES V8R6: Manually Building a Primary/Standby Streaming Replication Cluster (Case Study)
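Case description:
KingbaseES V8R6 is usually deployed quickly with the graphical tool. In production, however, some servers have no graphical environment enabled, so the KingbaseES V8R6 cluster has to be deployed manually from the character (command-line) interface. This document records the command-line deployment steps used in such a production environment.
1) This case was completed on general-purpose machines; it can serve as a reference for dedicated (special-purpose) machines.
2) The KingbaseES R6 cluster software package must be installed first.
3) This case is mainly intended for environments where graphical deployment is unavailable or has failed.
4) On general-purpose machines, almost all operations are performed by the kingbase user.
5) Before deploying the R6 cluster with the one-step script, prepare the system environment: SSH trust relationships, firewall, SELinux configuration, process resource limits, user creation, IP allocation, and so on.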
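The preparation items above can be checked before the script is run. The following is a minimal sketch, not part of the Kingbase tooling: the helper names `check`, `negate` and `preflight` are hypothetical, and it only covers the node-local items (user, firewall, SELinux). It assumes `systemctl` and `getenforce` are available; when they are missing, the corresponding check reports OK because nothing is found active.

```shell
#!/bin/bash
# Pre-flight sketch (hypothetical helpers, not part of the Kingbase tooling).

# check <description> <command...>: runs the command quietly and prints OK/FAIL.
check() {
    local desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "[OK]   $desc"
    else
        echo "[FAIL] $desc"
        return 1
    fi
}

# negate <command...>: succeeds when the command fails
# (used for "must NOT be active" checks).
negate() {
    ! "$@"
}

# preflight: runs the node-local checks; returns non-zero if any of them failed.
preflight() {
    local rc=0
    check "user kingbase exists" id kingbase || rc=1
    check "firewalld is not active" negate systemctl is-active --quiet firewalld || rc=1
    # getenforce prints Enforcing, Permissive or Disabled.
    check "SELinux is not enforcing" negate sh -c '[ "$(getenforce)" = "Enforcing" ]' || rc=1
    return $rc
}
```

Calling `preflight` on every node prints one line per check; SSH trust and IP allocation still have to be verified separately.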
I. System environment
1.1 Cluster architecture
1.2 Database version
KingbaseES_V008R006C003B0062_Aarch64
1.3 System CPU architecture (Kunpeng 920)
[root@ECOLABAPP37 ~]# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
NUMA node(s): 8
Vendor ID: HiSilicon
Model: 0
Model name: Kunpeng-920
Stepping: 0x1
CPU max MHz: 3000.0000
CPU min MHz: 200.0000
......
1.4 System memory
[root@ECOLABAPP37 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:         522103       18400      501575          63        2127      501458
Swap:         65535           0       65535
1.5 Network interface information
nm-bond: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.248.52.* netmask 255.255.240.0 broadcast 10.248.63.255
inet6 fe80::1728:3b0b:9694:6c2c prefixlen 64 scopeid 0x20<link>
ether 00:07:45:c2:d1:20 txqueuelen 1000 (Ethernet)
RX packets 83667032 bytes 5305257118 (4.9 GiB)
RX errors 0 dropped 16629 overruns 0 frame 0
TX packets 513509 bytes 44561399 (42.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
1.6 Kernel information
[root@ECOLABAPP37 ~]# uname -a
Linux ECOLABAPP37 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
II. Configure the system environment (all nodes)
2.1 Create the kingbase user
[root@ECOLABAPP37 ~]# id kingbase
uid=1002(kingbase) gid=1002(kingbase) groups=1002(kingbase)
2.2 Disable the host firewall
[root@ECOLABAPP37 Scripts]# systemctl stop firewalld
[root@ECOLABAPP37 Scripts]# systemctl disable firewalld
[root@ECOLABAPP38 ~]# systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
……
2.3 Configure SELinux
[kingbase@node3 ~]$ cat /etc/sysconfig/selinux |grep -v ^#|grep -v ^$
SELINUXTYPE=targeted
SELINUX=disabled
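III. Build the cluster with the deployment script
3.1 Prepare the deployment environment
As the kingbase user, create a directory under the home directory:
[kingbase@ECOLABAPP37 ~]$ mkdir R6_install
Place the deployment scripts, the configuration file, and the database license.dat file in this directory.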
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一鍵部署叢集指令碼操作手冊.docx
1) View and edit the cluster configuration file (adjust it to the system environment)
[kingbase@node3 ~]$ cat install.conf |grep -v ^#|grep -v ^$
on_bmj=0
all_ip=(10.248.52.165 10.248.52.166)
install_dir="/home/kingbase/cluster"
zip_package="/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip"
license_file=(license.dat)
db_user="system" # the user name of database
db_password="123456" # the password of database
db_port="54321" # the port of database, defaults is 54321
db_mode="oracle" # database mode: pg, oracle
db_auth="scram-sha-256" # database authority: scram-sha-256, md5, default is scram-sha-256
trusted_servers="10.248.48.1"
virtual_ip="10.248.52.174/20"
net_device=(nm-bond nm-bond)
ipaddr_path="/sbin"
arping_path="/usr/sbin"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
reconnect_attempts="6" # the number of retries in the event of an error
reconnect_interval="10" # retry interval
recovery="manual" # the way of cluster recovery: automatic/manual
ssh_port="22" # the port of ssh, default is 22
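The deployment script prints [CONFIG_CHECK] lines stating that the number of net_device entries (and of licenses) must equal the length of all_ip or be 1. The sketch below applies the same rule to install.conf before the script is run. The helper names `count_items` and `validate_conf` are hypothetical, the file is parsed as text rather than sourced, and mapping the license rule to the `license_file` array is an assumption.

```shell
#!/bin/bash
# Consistency sketch for install.conf (hypothetical helpers).

# count_items <name> <file>: number of entries in a line such as name=(a b c).
# Prints 0 when the line is missing.
count_items() {
    sed -n "s/^$1=(\([^)]*\)).*/\1/p" "$2" | head -n 1 | wc -w | tr -d ' '
}

# validate_conf <file>: each per-node array must have 1 entry or one per host.
validate_conf() {
    local conf=$1 hosts n name
    [ -r "$conf" ] || { echo "cannot read $conf"; return 1; }
    hosts=$(count_items all_ip "$conf")
    [ "$hosts" -ge 1 ] || { echo "all_ip is empty or missing"; return 1; }
    for name in net_device license_file; do
        n=$(count_items "$name" "$conf")
        if [ "$n" -ne 1 ] && [ "$n" -ne "$hosts" ]; then
            echo "$name has $n entries, expected 1 or $hosts"
            return 1
        fi
    done
    echo "install.conf looks consistent for $hosts host(s)"
}
```

With the configuration shown above, `validate_conf install.conf` would count two hosts, two net_device entries and one license_file entry, which satisfies the rule.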
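2) Configure SSH trust between the hosts (either manually or with the script below; manual configuration is recommended)
Note: passwordless SSH must work between the kingbase users, between the root users, and between the kingbase and root users. After configuring it, verify every trust relationship.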
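One way to verify the trust relationships is to attempt every login non-interactively. This is a minimal sketch with hypothetical helper names (`trust_targets`, `verify_trust`); it relies on the OpenSSH options `BatchMode=yes`, which makes ssh fail instead of prompting for a password, and `ConnectTimeout`.

```shell
#!/bin/bash
# SSH trust verification sketch (hypothetical helpers).

# trust_targets <hosts...>: prints every user@host login that must work
# without a password, for the kingbase and root accounts.
trust_targets() {
    local host user
    for host in "$@"; do
        for user in kingbase root; do
            echo "$user@$host"
        done
    done
}

# verify_trust <port> <hosts...>: tries each login from the current account.
verify_trust() {
    local port=$1 rc=0 target
    shift
    for target in $(trust_targets "$@"); do
        if ssh -p "$port" -o BatchMode=yes -o ConnectTimeout=5 "$target" true 2>/dev/null; then
            echo "[OK]   $target"
        else
            echo "[FAIL] $target"
            rc=1
        fi
    done
    return $rc
}
```

Running `verify_trust 22 <node1-ip> <node2-ip>` once as kingbase and once as root on each node covers the kingbase-to-kingbase, root-to-root and cross-user combinations.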
View the script (excerpt):
[kingbase@ECOLABAPP37 R6_install]$ cat trust_cluster.sh
#!/bin/bash
# you should change two parameters: general_user and all_ip
# general_user is the general user which you want to config SSH password free
# all_ip is the devices that you want to config SSH password free
shell_folder=$(dirname $(readlink -f "$0"))
install_conf="${shell_folder}/install.conf"
primary_host=""
curren_user=`whoami`
......
for ips in ${all_ip[@]}
do
ssh -p ${ssh_port} root@$ips "cp -r /root/.ssh /home/$general_user/"
ssh -p ${ssh_port} root@$ips "chmod 700 /home/$general_user/.ssh/"
ssh -p ${ssh_port} root@$ips "chown -R $general_user:$general_user /home/$general_user/.ssh/"
done
3) View the cluster deployment script (excerpt)
[kingbase@ECOLABAPP37 R6_install]$ cat V8R6_cluster_install.sh
#!/bin/bash
shell_folder=$(dirname $(readlink -f "$0"))
install_conf=""
#all_ip=(192.168.28.10 192.168.28.11)
all_ip=()
#install_dir="/home/kingbase/tmp_kingbase"
install_dir=""
#zip_package="${shell_folder}/db.zip"
zip_package=""
#license_path="${shell_folder}"
license_path="${shell_folder}"
#BMJ Kingbase install path
soft_dir="/opt/Kingbase/ES/V8/Server"
......
# start up the cluster
echo "[INSTALL] start up the whole cluster ..."
execute_command ${execute_user} ${all_ip[-2]} "${sys_bindir}/sys_monitor.sh start"
[ $? -ne 0 ] && exit 1
echo "[INSTALL] start up the whole cluster ... OK"
}
main
exit 0
3.2 Run the deployment script
Note: the license.dat file must also be placed in the current directory; the deployment fails without it.
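Because a missing license.dat only shows up late in the deployment, it can be worth checking for the file first. A minimal sketch, with the hypothetical helper name `require_license`:

```shell
#!/bin/bash
# require_license [dir]: fails early if license.dat is missing or empty
# in the given directory (default: current directory).
require_license() {
    local dir=${1:-.}
    if [ -s "$dir/license.dat" ]; then
        echo "license.dat found in $dir"
    else
        echo "license.dat missing or empty in $dir" >&2
        return 1
    fi
}
```

For example, `require_license . && sh V8R6_cluster_install.sh` starts the deployment only when the file is present.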
Directory holding the files for the manual cluster deployment:
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一鍵部署叢集指令碼操作手冊.docx
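Run the deployment script:
Use the output log to identify any failure during deployment. Reading the full log alongside the graphical deployment tool also gives a better understanding of how repmgr cluster deployment works.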
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/home/kingbase/cluster/kingbase/bin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start
[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
[INSTALL] start up the database on "10.248.52.*" ... OK
[INSTALL] create the database "esrep" and user "esrep" for repmgr ...
CREATE DATABASE
CREATE ROLE
[INSTALL] create the database "esrep" and user "esrep" for repmgr ... OK
[INSTALL] register the primary on "10.248.52.*" ...
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: PING 10.248.52.* (10.248.52.*) 56(84) bytes of data.
--- 10.248.52.* ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1005ms
WARNING: ping host"10.248.52.*" failed
DETAIL: average RTT value is not greater than zero
INFO: loadvip result: 1, arping result: 1
NOTICE: node (ID: 1) acquire the virtual ip 10.248.52.* success
NOTICE: primary node record (ID: 1) registered
[INSTALL] register the primary on "10.248.52.*" ... OK
[INSTALL] clone and start up the standby ...
clone the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/repmgr -h 10.248.52.* -U esrep -d esrep -p 54321 standby clone
NOTICE: destination directory "/home/kingbase/cluster/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=10.248.52.* user=esrep port=54321 dbname=esrep
DETAIL: current installation size is 64 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/home/kingbase/cluster/kingbase/data"...
NOTICE: starting backup (using sys_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
/home/kingbase/cluster/kingbase/bin/sys_basebackup -l "repmgr base backup" -D /home/kingbase/cluster/kingbase/data -h 10.248.52.* -p 54321 -U esrep -X stream -S repmgr_slot_2
NOTICE: standby clone (using sys_basebackup) complete
NOTICE: you can now start your Kingbase server
HINT: for example: sys_ctl -D /home/kingbase/cluster/kingbase/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
clone the standby on "10.248.52.*" ... OK
start up the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
start up the standby on "10.248.52.*" ... OK
register the standby on "10.248.52.*" ...
INFO: connecting to local node "node2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
INFO: standby registration complete
NOTICE: standby node "node2" (ID: 2) successfully registered
[INSTALL] register the standby on "10.248.52.*" ... OK
[INSTALL] start up the whole cluster ...
2021-04-19 17:31:58 Ready to start all DB ...
2021-04-19 17:31:58 begin to start DB on "[10.248.52.*]".
2021-04-19 17:31:59 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:00 DB on "[10.248.52.*]" start success.
2021-04-19 17:32:00 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:02 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:05 begin to start DB on "[10.248.52.*]".
2021-04-19 17:32:05 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:06 DB on "[10.248.52.*]" start success.
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 1 | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2021-04-19 17:32:06 The primary DB is started.
2021-04-19 17:32:12 Success to load virtual ip [10.248.52.*] on primary host [10.248.52.*].
2021-04-19 17:32:12 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:14 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 17:32:17] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 17:32:17] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"
2021-04-19 17:32:17 repmgrd on "[10.248.52.*]" start success.
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 15:15:45] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 15:15:45] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"
2021-04-19 17:32:18 repmgrd on "[10.248.52.*]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 62956 | no | n/a
2 | node2 | standby | running | node1 | running | 25769 | no | 0 second(s) ago
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
2021-04-19 17:32:22 Done.
[INSTALL] start up the whole cluster ... OK
=== The output above shows that the manual cluster deployment succeeded ===
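The deployment log is long, so a simple filter can help locate the lines that deserve attention. The sketch below assumes the script output was saved to a file (for example with `tee`); the helper name `scan_log` and the keyword list are assumptions, not part of the deployment script.

```shell
#!/bin/bash
# scan_log <file>: prints lines (with line numbers) that usually deserve a look.
# The keyword list is a guess based on the messages seen in this document.
scan_log() {
    grep -n -i -E 'warning|error|fatal|failed|could not|stopped waiting' "$1"
}
```

A possible usage is `sh V8R6_cluster_install.sh 2>&1 | tee deploy.log` followed by `scan_log deploy.log`. A match does not necessarily mean the deployment failed: the successful run above also contains WARNING lines during node registration.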
IV. Check the cluster status after deployment
4.1 Database service status (primary)
[kingbase@ECOLABAPP37 ~]$ ps -ef |grep kingbase
kingbase 62335 1 0 17:31 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/kingbase -D /home/kingbase/cluster/kingbase/data
kingbase 62336 62335 0 17:31 ? 00:00:00 kingbase: logger
kingbase 62338 62335 0 17:31 ? 00:00:00 kingbase: checkpointer
kingbase 62339 62335 0 17:31 ? 00:00:00 kingbase: background writer
kingbase 62340 62335 0 17:31 ? 00:00:00 kingbase: walwriter
kingbase 62341 62335 0 17:31 ? 00:00:00 kingbase: autovacuum launcher
kingbase 62342 62335 0 17:31 ? 00:00:00 kingbase: archiver last was 000000010000000000000002.00000028.backup
kingbase 62343 62335 0 17:31 ? 00:00:00 kingbase: stats collector
kingbase 62344 62335 0 17:31 ? 00:00:00 kingbase: ksh writer
kingbase 62345 62335 0 17:31 ? 00:00:00 kingbase: ksh collector
kingbase 62346 62335 0 17:31 ? 00:00:00 kingbase: sys_kwr collector
kingbase 62347 62335 0 17:31 ? 00:00:00 kingbase: logical replication launcher
kingbase 62426 62335 0 17:31 ? 00:00:00 kingbase: walsender esrep 10.248.52.*(52926) streaming 0/300B810
kingbase 62954 62335 0 17:32 ? 00:00:00 kingbase: esrep esrep 10.248.52.*(47290) idle
kingbase 62956 1 0 17:32 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase 62966 62335 0 17:32 ? 00:00:00 kingbase: esrep esrep 10.248.52.*(52934) idle
kingbase 63178 1 0 17:32 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase 63822 63178 0 17:35 ? 00:00:00 ping -q -c3 -w2 10.248.48.*
4.2 Primary/standby streaming replication status
[kingbase@ECOLABAPP37 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# select * from sys_stat_replication;
 pid   | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start                 | backend_xmin | state     | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 62426 | 16385    | esrep   | node2            | 10.248.52.* |                 | 52926       | 2021-04-19 17:31:57.986053+08 |              | streaming | 0/300B810 | 0/300B810 | 0/300B810 | 0/300B810  |           |           |            | 1             | quorum     | 2021-04-19 15:19:35.941223+08
(1 row)
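In the row above, sent_lsn and replay_lsn are both 0/300B810, which means the standby has replayed everything the primary has sent. The difference between two LSNs can be expressed in bytes; the sketch below assumes KingbaseES uses the PostgreSQL LSN notation (two hexadecimal numbers separated by a slash, the first being the high 32 bits). The helper names `lsn_to_bytes` and `lsn_lag` are hypothetical.

```shell
#!/bin/bash
# lsn_to_bytes <lsn>: converts an LSN such as 0/300B810 to a byte position.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
lsn_to_bytes() {
    local hi=${1%/*} lo=${1#*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# lsn_lag <sent_lsn> <replay_lsn>: bytes sent by the primary but not yet replayed.
lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_lag 0/300B810 0/300B810   # prints 0
```

A lag that stays at or near zero while the state column shows streaming indicates that the standby is keeping up.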
4.3 Cluster node status
[kingbase@ECOLABAPP37 ~]$ repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+--------
1 | node1 | primary | * running | | default | 100 | 1 | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 1 | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
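The table printed by `repmgr cluster show` can also be checked automatically. The following sketch reads that output on standard input and succeeds only when there is exactly one primary and every node reports `running`. The helper name `cluster_ok` is hypothetical, and the parsing assumes the column layout shown above (ID, Name, Role, Status, ...).

```shell
#!/bin/bash
# cluster_ok: reads "repmgr cluster show" output on stdin; succeeds when there
# is exactly one primary and every node has the status "running" or "* running".
cluster_ok() {
    awk -F'|' '
        $1 ~ /^ *[0-9]+ *$/ {                       # data rows start with a numeric ID
            nodes++
            if ($3 ~ /primary/) primaries++
            if ($4 !~ /^ *\*? *running *$/) bad++   # anything else counts as unhealthy
        }
        END {
            printf "nodes=%d primaries=%d not_running=%d\n", nodes, primaries, bad
            exit !(nodes > 0 && primaries == 1 && bad == 0)
        }'
}
```

A possible usage is `repmgr cluster show | cluster_ok`, for instance from a monitoring job that raises an alert on a non-zero exit status.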
4.4 Test primary/standby replication
DDL and DML operations on the primary:
test=# create database prod;
CREATE DATABASE
test=# \c prod;
You are now connected to database "prod" as user "system".
prod=# create table t1 (id int);
CREATE TABLE
prod=# insert into t1 values (10),(20),(30);
INSERT 0 3
prod=# select * from t1;
id
----
10
20
30
(3 rows)
View the replicated data on the standby:
[kingbase@ECOLABAPP38 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# \c prod
You are now connected to database "prod" as user "system".
prod=# select * from t1;
id
----
10
20
30
(3 rows)
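V. Deployment failure case
Symptom:
The license.dat file had not been placed in the directory that holds the cluster deployment script. Running the script then failed because license.dat could not be accessed. After license.dat was copied into that directory, the deployment succeeded.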
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一鍵部署叢集指令碼操作手冊.docx
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/usr/sbin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start
[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... stopped waiting
sys_ctl: could not start server
Examine the log output.
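= Note: the license.dat file must also be placed in the current directory; the error above is caused by the missing license.dat =
When troubleshooting, run the following command manually and then check the error log: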
/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingba
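After a failed start, the file passed to `-l` (in the log above, /home/kingbase/cluster/kingbase/logfile) is the first place to look. The sketch below prints the most recent lines that look like errors; the helper name `last_errors` and the keyword list are assumptions, since the exact wording of the Kingbase license error is not shown in this document.

```shell
#!/bin/bash
# last_errors <logfile> [count]: prints the most recent lines that mention
# FATAL, ERROR or license (case-insensitive). The keyword list is an assumption.
last_errors() {
    local file=$1 count=${2:-20}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    grep -i -E 'fatal|error|license' "$file" | tail -n "$count"
}
```

For example, `last_errors /home/kingbase/cluster/kingbase/logfile 10` shows the ten most recent matching lines.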
KINGBASE Research Institute