Automatic failover for PostgreSQL streaming replication with corosync + pacemaker
阿新 · Published 2017-09-07
I. Environment

$ cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)

node1: 192.168.111.128
node2: 192.168.111.129
vip-master: 192.168.111.228
vip-slave: 192.168.111.229

Set the hostnames:

hostnamectl set-hostname postgres128
hostnamectl set-hostname postgres129

[root@postgres128 ~]# vi /etc/hosts
192.168.111.128 postgres128
192.168.111.129 postgres129

II. Configure the Linux cluster environment

1. Install Pacemaker and Corosync

Run on all nodes:

[root@postgres128 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python

2. Disable the firewall

Run on all nodes:

[root@postgres128 ~]# systemctl disable firewalld.service
[root@postgres128 ~]# systemctl stop firewalld.service

3. Enable pcs

Run on all nodes:

[root@postgres128 ~]# systemctl start pcsd.service
[root@postgres128 ~]# systemctl enable pcsd.service
[root@postgres128 ~]# echo hacluster | sudo passwd hacluster --stdin

4. Authenticate the cluster nodes

Run on any one node; node1 is used here:

[root@postgres128 ~]# pcs cluster auth -u hacluster -p hacluster 192.168.111.128 192.168.111.129

5. Synchronize the configuration

Run on node1:

[root@postgres128 ~]# pcs cluster setup --last_man_standing=1 --name pgcluster 192.168.111.128 192.168.111.129
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services...
Removing all cluster configuration files...
192.168.111.128: Succeeded
192.168.111.129: Succeeded

6. Start the cluster

Run on node1:

[root@postgres128 ~]# pcs cluster start --all
192.168.111.128: Starting Cluster...
192.168.111.129: Starting Cluster...

7. Verify

1) Verify corosync

Run on node1:

$ sudo pcs status corosync

Membership information
----------------------
    Nodeid      Votes  Name
         1          1  192.168.111.128 (local)
         2          1  192.168.111.129

2) Verify pacemaker

[root@postgres128 ~]# pcs status
Cluster name: pgcluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed Sep 6 02:20:28 2017
Last change:
Current DC: NONE
0 Nodes configured
0 Resources configured

Full list of resources:

PCSD Status:
  192.168.111.128: Online
  192.168.111.129: Online
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

III. Install and configure PostgreSQL

For installation and streaming-replication setup, see the references at the end of this article; a minimal configuration sketch follows at the end of this section.

Check the streaming-replication state:

[postgres@localhost ~]$ psql
psql (9.6.0)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 2111
usesysid         | 16384
usename          | replica
application_name | standby01
client_addr      | 192.168.111.129
client_hostname  |
client_port      | 45608
backend_start    | 2017-09-06 05:13:29.766227-04
backend_xmin     | 1756
state            | streaming
sent_location    | 0/50000D0
write_location   | 0/50000D0
flush_location   | 0/50000D0
replay_location  | 0/5000098
sync_priority    | 1
sync_state       | sync

Stop the PostgreSQL service:

$ pg_stop
waiting for server to shut down...... done
server stopped
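The referenced posts cover the replication setup in detail; as a quick orientation, a minimal PostgreSQL 9.6 configuration consistent with the output above (user replica, application_name standby01, archive directory /home/postgres/arch) would look roughly like the sketch below. The exact values are assumptions drawn from this article's environment, not from the referenced posts:

# postgresql.conf on the primary (sketch)
listen_addresses = '*'
wal_level = replica                       # 9.6 setting for streaming/hot standby
max_wal_senders = 5
wal_keep_segments = 32
synchronous_standby_names = 'standby01'   # matches application_name in pg_stat_replication
archive_mode = on
archive_command = 'cp %p /home/postgres/arch/%f'   # pairs with the restore_command used in section IV

# pg_hba.conf on the primary
host  replication  replica  192.168.111.0/24  md5

# recovery.conf on the standby (9.6). Once Pacemaker manages the cluster, the
# ocf:heartbeat:pgsql agent generates this file itself, so it is only needed
# for the initial manual replication setup.
standby_mode = 'on'
primary_conninfo = 'host=192.168.111.128 user=replica password=replica application_name=standby01'
restore_command = 'cp /home/postgres/arch/%f %p'

In addition, hot_standby = on needs to be set in the standby's postgresql.conf.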
IV. Configure automatic failover

1. Configuration

Run on node1. Write the configuration steps into a script first:

[root@postgres128 postgres]# vi cluster_setup.sh

# Save the CIB configuration to a file
pcs cluster cib pgsql_cfg
# Ignore quorum at the pacemaker level (two-node cluster)
pcs -f pgsql_cfg property set no-quorum-policy="ignore"
# Disable STONITH
pcs -f pgsql_cfg property set stonith-enabled="false"
# Set resource stickiness so resources do not migrate back after a failed node recovers
pcs -f pgsql_cfg resource defaults resource-stickiness="INFINITY"
# Number of failures before migration
pcs -f pgsql_cfg resource defaults migration-threshold="3"
# Virtual IP for the master node
pcs -f pgsql_cfg resource create vip-master IPaddr2 ip="192.168.111.228" cidr_netmask="24" op start timeout="60s" interval="0s" on-fail="restart" op monitor timeout="60s" interval="10s" on-fail="restart" op stop timeout="60s" interval="0s" on-fail="block"
# Virtual IP for the slave node
pcs -f pgsql_cfg resource create vip-slave IPaddr2 ip="192.168.111.229" cidr_netmask="24" op start timeout="60s" interval="0s" on-fail="restart" op monitor timeout="60s" interval="10s" on-fail="restart" op stop timeout="60s" interval="0s" on-fail="block"
# The pgsql cluster resource
# Adjust pgctl, psql, pgdata, config, etc. to your environment; node_list takes the node hostnames and master_ip takes the master virtual IP
pcs -f pgsql_cfg resource create pgsql pgsql pgctl="/opt/pgsql96/bin/pg_ctl" psql="/opt/pgsql96/bin/psql" pgdata="/home/postgres/data" config="/home/postgres/data/postgresql.conf" rep_mode="sync" node_list="postgres128 postgres129" master_ip="192.168.111.228" repuser="replica" primary_conninfo_opt="password=replica keepalives_idle=60 keepalives_interval=5 keepalives_count=5" restore_command="cp /home/postgres/arch/%f %p" restart_on_promote="true" op start timeout="60s" interval="0s" on-fail="restart" op monitor timeout="60s" interval="4s" on-fail="restart" op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" op promote timeout="60s" interval="0s" on-fail="restart" op demote timeout="60s" interval="0s" on-fail="stop" op stop timeout="60s" interval="0s" on-fail="block"
# Master/slave mode; clone-max=2 for the two nodes
pcs -f pgsql_cfg resource master pgsql-cluster pgsql master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
# Group for the master IP
pcs -f pgsql_cfg resource group add master-group vip-master
# Group for the slave IP
pcs -f pgsql_cfg resource group add slave-group vip-slave
# Bind the master IP group to the master node
pcs -f pgsql_cfg constraint colocation add master-group with master pgsql-cluster INFINITY
# Start master-group after promotion
pcs -f pgsql_cfg constraint order promote pgsql-cluster then start master-group symmetrical=false score=INFINITY
# Stop master-group after demotion
pcs -f pgsql_cfg constraint order demote pgsql-cluster then stop master-group symmetrical=false score=0
# Bind the slave IP group to the slave node
pcs -f pgsql_cfg constraint colocation add slave-group with slave pgsql-cluster INFINITY
# Start slave-group after promotion
pcs -f pgsql_cfg constraint order promote pgsql-cluster then start slave-group symmetrical=false score=INFINITY
# Stop slave-group after demotion
pcs -f pgsql_cfg constraint order demote pgsql-cluster then stop slave-group symmetrical=false score=0
# Push the configuration file to the CIB
pcs cluster cib-push pgsql_cfg
2. Run the script

[root@postgres128 postgres]# chmod +x cluster_setup.sh
[root@postgres128 postgres]# ./cluster_setup.sh
Adding pgsql-cluster master-group (score: INFINITY) (Options: symmetrical=false score=INFINITY first-action=promote then-action=start)
Adding pgsql-cluster master-group (score: 0) (Options: symmetrical=false score=0 first-action=demote then-action=stop)
Adding pgsql-cluster slave-group (score: INFINITY) (Options: symmetrical=false score=INFINITY first-action=promote then-action=start)
Adding pgsql-cluster slave-group (score: 0) (Options: symmetrical=false score=0 first-action=demote then-action=stop)
CIB updated

3. Start the database and check pcs status

[root@postgres129 ~]# pcs status
Cluster name: pgcluster
Last updated: Wed Sep 6 22:13:39 2017
Last change: Wed Sep 6 06:00:26 2017 via cibadmin on postgres128
Stack: corosync
Current DC: postgres128 (1) - partition with quorum
Version: 1.1.10-29.el7-368c726
2 Nodes configured
4 Resources configured

Online: [ postgres128 postgres129 ]

Full list of resources:

 Master/Slave Set: pgsql-cluster [pgsql]
     Stopped: [ postgres128 postgres129 ]
 Resource Group: master-group
     vip-master (ocf::heartbeat:IPaddr2): Started postgres128
 Resource Group: slave-group
     vip-slave (ocf::heartbeat:IPaddr2): Started postgres129

PCSD Status:
  192.168.111.128: Unable to authenticate
  192.168.111.129: Unable to authenticate
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
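The article stops at the status output above. To exercise the automatic switchover, a test along the following lines can be used. This is a sketch under this article's paths and resource names: the PGSQL.lock path is the ocf:heartbeat:pgsql agent's default tmpdir (adjust if tmpdir was set), and the pg_basebackup resync is one common way to re-join the old master, not a step from the original post:

# 1) Simulate a crash of the current master (assumed here to be postgres128)
[postgres@postgres128 ~]$ /opt/pgsql96/bin/pg_ctl -D /home/postgres/data -m immediate stop

# 2) From the other node, watch the standby get promoted; vip-master should
#    follow it. crm_mon -Afr -1 also shows the pgsql-status and
#    pgsql-data-status node attributes maintained by the pgsql agent.
[root@postgres129 ~]# crm_mon -Afr -1

# 3) Re-join the failed node as a standby: resync its data directory
[postgres@postgres128 ~]$ rm -rf /home/postgres/data
[postgres@postgres128 ~]$ /opt/pgsql96/bin/pg_basebackup -h 192.168.111.228 -U replica -D /home/postgres/data -X stream -P

# 4) Remove the lock file the pgsql agent leaves on an uncleanly stopped
#    master, then clear the failure state so Pacemaker restarts the resource
[root@postgres128 ~]# rm -f /var/lib/pgsql/tmp/PGSQL.lock
[root@postgres128 ~]# pcs resource cleanup pgsql-cluster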
References:
Setup and maintenance:
https://my.oschina.net/aven92/blog/518928
https://my.oschina.net/aven92/blog/519458