Getting Started with Postgres-XL Cluster Configuration

阿新 • Published: 2019-01-11

1 Introduction to Postgres-XL

https://www.postgres-xl.org/documentation/tutorial-createcluster.html is the official tutorial. It only covers a single-machine setup, with 1 GTM + 2 coordinators + 2 datanodes all on one machine.

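Postgres-XL is a horizontally scalable, open-source SQL database cluster based on PostgreSQL. It is flexible enough to handle a variety of database workloads:

Fully ACID, maintaining transactional consistency
OLTP workloads with frequent writes
Business intelligence / big data analytics that need MPP parallelism
Operational data store
Key-value store
GIS geospatial
Mixed-workload environments
Multi-tenant service-provider hosted environments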
Web 2.0

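Postgres-XL architecture

2 Component Overview

Global Transaction Monitor (GTM): the global transaction manager, which ensures cluster-wide transaction consistency. The GTM issues transaction IDs and snapshots as part of its multi-version concurrency control. The cluster can optionally be configured with a standby GTM to improve availability. In addition, GTM proxies can be placed alongside the coordinators to improve scalability and reduce the traffic to the GTM.
GTM Standby: the standby node for the GTM. In Postgres-XC and Postgres-XL the GTM controls all global transaction allocation, so if it fails the whole cluster becomes unavailable. The standby node is added to improve availability: when the GTM fails, the GTM Standby can be promoted to GTM so that the cluster keeps working.
GTM-Proxy: the GTM has to communicate with all coordinators. To reduce that load, a GTM-Proxy can be deployed on each coordinator machine.
Coordinator: the coordinator manages user sessions and interacts with the GTM and the data nodes. It parses and plans queries, and sends a serialized global plan to each component involved in a statement. To save machines, this service is usually deployed together with a data node.
Data Node: the data node is where the data is actually stored. How the data is distributed can be configured by the DBA. To improve availability, a hot standby can be configured for each data node in preparation for failover.

Summary: the GTM is responsible for ACID and guarantees global transaction consistency across the distributed database. Thanks to this, even though the data nodes are distributed, running insert, delete, update and query transactions on the master (coordinator) node is as simple as working with a single database. The coordinators do the scheduling and send the operations to the individual data nodes. The datanodes are the data nodes, which store the data in distributed form.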

3 Postgres-XL Environment Setup and Installation

3.1 Cluster plan

Prepare three CentOS 7 servers (or virtual machines). The cluster plan is as follows:

Host	IP	Role	Port	Node name	Data directory
gtm	192.168.0.125	GTM	6666	gtm	/nodes/gtm
gtm	192.168.0.125	GTM Slave	20001	gtmSlave	/nodes/gtmSlave
datanode1	192.168.0.127	Coordinator	5432	coord1	/nodes/coord
datanode1	192.168.0.127	Datanode	5433	node1	/nodes/dn_master
datanode1	192.168.0.127	Datanode Slave	15433	node1_slave	/nodes/dn_slave
datanode1	192.168.0.127	GTM Proxy	6666	gtm_pxy1	/nodes/gtm_pxy
datanode2	192.168.0.128	Coordinator	5432	coord2	/nodes/coord
datanode2	192.168.0.128	Datanode	5433	node2	/nodes/dn_master
datanode2	192.168.0.128	Datanode Slave	15433	node2_slave	/nodes/dn_slave
datanode2	192.168.0.128	GTM Proxy	6666	gtm_pxy2	/nodes/gtm_pxy

Add the following entries to /etc/hosts on every machine, using the actual addresses of your own servers:

192.168.0.125 gtm
192.168.0.126 datanode1
192.168.0.127 datanode2
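Every later step refers to the machines by name, so it can be worth confirming on each machine that all three names really are in the hosts file. The following is a minimal sketch of such a check, not part of the original procedure. The function name check_hosts is hypothetical, and the check only looks at the file's text; it does not test the network.

```shell
# check_hosts FILE NAME...
# Print each NAME that has no entry in the hosts-format FILE.
# Returns 0 when all names are present, 1 otherwise.
# (check_hosts is a hypothetical helper name, not a Postgres-XL tool.)
check_hosts() {
  file=$1
  shift
  rc=0
  for name in "$@"; do
    # Skip comment lines; fields 2..NF of an entry are host names.
    if ! awk -v n="$name" '
      /^[ \t]*#/ { next }
      { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit(found ? 0 : 1) }
    ' "$file"; then
      echo "missing: $name"
      rc=1
    fi
  done
  return $rc
}

# Usage on a cluster machine:
#   check_hosts /etc/hosts gtm datanode1 datanode2
```

No output and a zero exit status mean that every name was found.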

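The GTM is deployed on the gtm host. A GTM standby is not deployed in this test environment for now.
Coordinator and datanode nodes are generally deployed on the same machine. In practice, the GTM-Proxy, coordinator and datanode usually all share one machine, so make sure their port numbers do not overlap with the connection pooler ports. In this plan, datanode1 and datanode2 serve as both coordinator nodes and data nodes.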
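The port-overlap warning can be checked mechanically before anything is started. The sketch below assumes a simple listing with one "host port label" entry per line. That format and the function name find_port_conflicts are assumptions of this example, not something pgxc_ctl provides.

```shell
# find_port_conflicts: read "host port label" lines on stdin and print every
# host:port pair that is claimed more than once.
# (Hypothetical helper; the input format is an assumption of this sketch.)
find_port_conflicts() {
  awk '{ print $1 ":" $2 }' | sort | uniq -d
}

# Usage with the ports planned for datanode1 in this article:
#   printf '%s\n' \
#     'datanode1 5432 coord1' \
#     'datanode1 6667 coord1-pooler' \
#     'datanode1 5433 node1' \
#     'datanode1 6668 node1-pooler' \
#     'datanode1 6666 gtm_pxy1' | find_port_conflicts
```

No output means that no port is used twice on the same host.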

3.2 System environment settings

The following steps apply to every server node.
Turn off the firewall:

# systemctl stop firewalld.service
# systemctl disable firewalld.service

SELinux settings:

# vim /etc/selinux/config

Set SELINUX=disabled, then save and exit.

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.

Install the dependency packages:

# yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl

Reboot the server. Do not skip this step: the SELinux change only takes effect after a reboot.

3.3 Create the user

On every node, create the user postgres, create its .ssh directory, and set the appropriate permissions:

# useradd postgres
# passwd postgres
# su - postgres
$ mkdir ~/.ssh
$ chmod 700 ~/.ssh

3.4 Passwordless SSH login

Perform the following steps on the gtm node only:

# su - postgres
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys

Copy the authorization file just generated to datanode1 and datanode2, so that the gtm node can log in to either of them without a password:

$ scp ~/.ssh/authorized_keys postgres@datanode1:~/.ssh/
$ scp ~/.ssh/authorized_keys postgres@datanode2:~/.ssh/

At the ssh-keygen prompts, type nothing and just press Enter to move on. The scp commands ask for the postgres user's password on the target machine the first time; enter it when prompted.
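pgxc_ctl drives the other machines over SSH, so it helps to confirm that the passwordless login works before going further. The sketch below is one way to do that and is not part of the original procedure. The function name check_ssh_hosts and the SSH override variable are assumptions of this example; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# check_ssh_hosts HOST...
# Try a non-interactive login as postgres on every HOST and report the result.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH can be overridden (for example for a dry run); it defaults to ssh.
# (check_ssh_hosts is a hypothetical helper name.)
check_ssh_hosts() {
  rc=0
  for host in "$@"; do
    if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "postgres@$host" true; then
      echo "ok: $host"
    else
      echo "FAILED: $host"
      rc=1
    fi
  done
  return $rc
}

# Usage on the gtm node, as the postgres user:
#   check_ssh_hosts datanode1 datanode2
```

A FAILED line usually points to a missing key or to wrong permissions on ~/.ssh or authorized_keys on the target machine.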

3.5 Installing Postgres-XL

Postgres-XL must be installed and configured on each of the three nodes. Switch back to the root user and run the following steps:

# cd /opt
# git clone git://git.postgresql.org/git/postgres-xl.git
# cd postgres-xl
# ./configure --prefix=/home/postgres/pgxl/
# make
# make install
# cd contrib/
# make
# make install

# chown -R postgres:postgres /home/postgres/pgxl

contrib contains many very useful PostgreSQL tools, such as ltree, uuid and postgres_fdw, so it is generally worth installing.

3.6 Configure environment variables

Switch to the postgres user and edit its environment variables:

# su - postgres
$ vi .bashrc

Append the following variable settings to the end of the opened file:

export PGHOME=/home/postgres/pgxl
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH

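Press Esc, then type :wq! to save and exit. Run the following command so that the change takes effect, then print the variable to confirm it:

$ source .bashrc
$ echo $PGHOME

If the output is /home/postgres/pgxl, the setting is in effect.

Unless stated otherwise, the steps above must be carried out on every node, including datanode1 and datanode2.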

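4 Cluster Configuration

4.1 Generate the pgxc_ctl configuration file

$ pgxc_ctl        (note: this only needs to be done on the gtm node)

PGXC prepare config empty
 --- running this command generates a configuration file template
PGXC   --- press Ctrl+C to exit.

4.2 Configure pgxc_ctl.conf (again, on the gtm node only)

The pgxc_ctl directory contains a pgxc_ctl.conf file. Edit it as follows: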

pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data 

pgxcOwner=postgres

#---- GTM Master -----------------------------------------
gtmName=gtm
gtmMasterServer=gtm
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/nodes/gtm

gtmSlave=y                  # Specify y if you configure GTM Slave.   Otherwise, GTM slave will not be configured and
                            # all the following variables will be reset.
gtmSlaveName=gtmSlave
gtmSlaveServer=gtm      # value none means GTM slave is not available.  Give none if you don't configure GTM Slave.
gtmSlavePort=20001          # Not used if you don't configure GTM slave.
gtmSlaveDir=$pgxlDATA/nodes/gtmSlave    # Not used if you don't configure GTM slave.

#---- GTM-Proxy Master -------
gtmProxyDir=$pgxlDATA/nodes/gtm_pxy
gtmProxy=y                              
gtmProxyNames=(gtm_pxy1 gtm_pxy2)   
gtmProxyServers=(datanode1 datanode2)           
gtmProxyPorts=(6666 6666)               
gtmProxyDirs=($gtmProxyDir $gtmProxyDir)            

#---- Coordinators ---------
coordMasterDir=$pgxlDATA/nodes/coord
coordNames=(coord1 coord2)      
coordPorts=(5432 5432)          
poolerPorts=(6667 6667)         
coordPgHbaEntries=(0.0.0.0/0)

coordMasterServers=(datanode1 datanode2)    
coordMasterDirs=($coordMasterDir $coordMasterDir)
coordMaxWALsender=0    # no slave is configured, so set this to 0
coordMaxWALSenders=($coordMaxWALsender $coordMaxWALsender) # keep the same number of entries as coordMasterServers

coordSlave=n

#---- Datanodes ----------
datanodeMasterDir=$pgxlDATA/nodes/dn_master
primaryDatanode=node1               # primary data node
datanodeNames=(node1 node2)
datanodePorts=(5433 5433)   
datanodePoolerPorts=(6668 6668) 
datanodePgHbaEntries=(0.0.0.0/0)

datanodeMasterServers=(datanode1 datanode2)
datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir)
datanodeMaxWalSender=4
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)

datanodeSlave=n
# The slave of the datanode1 node is placed on the datanode2 server and vice versa (cross backup); only used when datanodeSlave=y
datanodeSlaveServers=(datanode2 datanode1)  # value none means this slave is not available
datanodeSlavePorts=(15433 15433)    # value none means this slave is not available
datanodeSlavePoolerPorts=(20012 20012)  # value none means this slave is not available
datanodeSlaveSync=y     # If datanode slave is connected in synchronized mode
datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
datanodeArchLogDirs=( $datanodeArchLogDir $datanodeArchLogDir)

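The configuration above enables no coordinator or datanode slaves (coordSlave=n, datanodeSlave=n). For a production environment, read through the configuration file and configure them yourself.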
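The per-node settings in pgxc_ctl.conf are parallel arrays, and as the comment on coordMaxWALSenders says, they must have the same number of entries. A small text check can catch a forgotten entry before init is run. This is a sketch under the assumption that each array is written on a single line as VAR=(a b c), as in the listing above. Both function names are hypothetical.

```shell
# conf_array_len FILE VAR
# Print the number of entries in a one-line "VAR=(a b c)" assignment in FILE
# (prints 0 if the variable is not found).
conf_array_len() {
  sed -n "s/^$2=(\([^)]*\)).*/\1/p" "$1" | head -n 1 | wc -w | tr -d ' '
}

# check_same_len FILE VAR...
# Compare every VAR with the first one and report arrays of a different length.
check_same_len() {
  file=$1
  shift
  want=$(conf_array_len "$file" "$1")
  rc=0
  for var in "$@"; do
    got=$(conf_array_len "$file" "$var")
    if [ "$got" != "$want" ]; then
      echo "$var has $got entries, expected $want"
      rc=1
    fi
  done
  return $rc
}

# Usage:
#   check_same_len ~/pgxc_ctl/pgxc_ctl.conf coordNames coordPorts poolerPorts coordMasterServers
#   check_same_len ~/pgxc_ctl/pgxc_ctl.conf datanodeNames datanodePorts datanodePoolerPorts datanodeMasterServers
```

Single-entry settings such as coordPgHbaEntries are deliberately left out of the comparison.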

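4.3 Initializing, starting and stopping the cluster (run on the gtm machine only)

The first time the cluster is started it has to be initialized, as follows:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf init all

Once initialization finishes, the cluster is started directly.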

/bin/bash
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /home/postgres/pgxc_ctl
Initialize GTM master
ERROR: target directory (/home/postgres/pgxl/data/nodes/gtm) exists and not empty. Skip GTM initilialization
1:1430554432:2017-07-11 23:31:14.737 PDT -FATAL:  lock file "gtm.pid" already exists
2:1430554432:2017-07-11 23:31:14.737 PDT -HINT:  Is another GTM (PID 2823) running in data directory "/home/postgres/pgxl/data/nodes/gtm"?
LOCATION:  CreateLockFile, main.c:2099
waiting for server to shut down.... done
server stopped
Done.
Start GTM master
server starting
Initialize all the gtm proxies.
Initializing gtm proxy gtm_pxy1.
Initializing gtm proxy gtm_pxy2.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.


fixing permissions on existing directory /home/postgres/pgxl/data/nodes/gtm_pxy ... ok
creating configuration files ... ok

Success.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.


fixing permissions on existing directory /home/postgres/pgxl/data/nodes/gtm_pxy ... ok
creating configuration files ... ok

Success.
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Initialize all the coordinator masters.
Initialize coordinator master coord1.
Initialize coordinator master coord2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2017-07-11 23:31:31.116 PDT [3650] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2017-07-11 23:31:31.116 PDT [3650] LOG:  listening on IPv6 address "::", port 5432
2017-07-11 23:31:31.118 PDT [3650] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2017-07-11 23:31:31.126 PDT [3650] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:31.126 PDT [3650] HINT:  Future log output will appear in directory "pg_log".
2017-07-11 23:31:31.122 PDT [3613] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2017-07-11 23:31:31.122 PDT [3613] LOG:  listening on IPv6 address "::", port 5432
2017-07-11 23:31:31.124 PDT [3613] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2017-07-11 23:31:31.132 PDT [3613] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:31.132 PDT [3613] HINT:  Future log output will appear in directory "pg_log".
Done.
Initialize all the datanode masters.
Initialize the datanode master datanode1.
Initialize the datanode master datanode2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2017-07-11 23:31:37.013 PDT [3995] LOG:  listening on IPv4 address "0.0.0.0", port 5433
2017-07-11 23:31:37.013 PDT [3995] LOG:  listening on IPv6 address "::", port 5433
2017-07-11 23:31:37.014 PDT [3995] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2017-07-11 23:31:37.021 PDT [3995] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:37.021 PDT [3995] HINT:  Future log output will appear in directory "pg_log".
2017-07-11 23:31:37.008 PDT [3958] LOG:  listening on IPv4 address "0.0.0.0", port 5433
2017-07-11 23:31:37.008 PDT [3958] LOG:  listening on IPv6 address "::", port 5433
2017-07-11 23:31:37.009 PDT [3958] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2017-07-11 23:31:37.017 PDT [3958] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:37.017 PDT [3958] HINT:  Future log output will appear in directory "pg_log".
Done.
ALTER NODE coord1 WITH (HOST='datanode1', PORT=5432);
ALTER NODE
CREATE NODE coord2 WITH (TYPE='coordinator', HOST='datanode2', PORT=5432);
CREATE NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='datanode1', PORT=5433, PRIMARY, PREFERRED);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='datanode2', PORT=5433);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)

CREATE NODE coord1 WITH (TYPE='coordinator', HOST='datanode1', PORT=5432);
CREATE NODE
ALTER NODE coord2 WITH (HOST='datanode2', PORT=5432);
ALTER NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='datanode1', PORT=5433, PRIMARY);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='datanode2', PORT=5433, PREFERRED);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)

Done.
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''datanode1'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''datanode2'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''datanode1'', PORT=5433, PRIMARY, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''datanode2'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload 
------------------
 t
(1 row)

EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''datanode1'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''datanode2'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''datanode1'', PORT=5433, PRIMARY, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''datanode2'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload 
------------------
 t
(1 row)

Done.

For later starts, simply run:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all

To stop the cluster:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf stop all

These are the main commands for now; see pgxc_ctl --help for more.

5 Testing the Postgres-XL Cluster

5.1 Insert data

On the datanode1 node, run psql -p 5432 to connect to the database:

$ psql -p 5432
psql (PGXL 10alpha1, based on PG 10beta1 (Postgres-XL 10alpha1))
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 datanode1 | D         |      5433 | datanode1 | t              | t                |   888802358
 datanode2 | D         |      5433 | datanode2 | f              | f                |  -905831925
(4 rows)
postgres=# create table test1(id int,name text);
postgres=#  insert into test1(id,name) select generate_series(1,8),'測試';

5.2 View the data distribution

View the data on the datanode1 node:

$ psql -p 5433
psql (PGXL 10alpha1, based on PG 10beta1 (Postgres-XL 10alpha1))
Type "help" for help.

postgres=# select * from test1;
 id | name 
----+------
  1 | 測試
  2 | 測試
  5 | 測試
  6 | 測試
  8 | 測試
(5 rows)

View the data on the datanode2 node:

postgres=# select * from test1;
 id | name 
----+------
  3 | 測試
  4 | 測試
  7 | 測試
(3 rows)

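Note: all the data nodes together make up the complete view of the data, so when one data node goes down the Postgres-XL cluster as a whole can no longer work normally. In production, therefore, be sure to configure hot standbys for the data nodes in preparation for failover.

6 Cluster Usage and Management

6.1 Creating tables

REPLICATION tables: every datanode holds exactly the same data for the table. An insert writes the same rows to each datanode, and a read only needs to read from any one datanode.
Table creation syntax: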

postgres=#  CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION;

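DISTRIBUTE tables: inserted rows are assigned to different datanodes according to the distribution rule, which is sharding. Each datanode stores only part of the data, and the complete view of the data can be queried through a coordinator node.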

postgres=#  CREATE TABLE disttab(col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1);
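The idea behind DISTRIBUTE BY HASH(col1) can be illustrated in a few lines: the distribution key of each row is hashed, and the hash value decides which datanode stores the row. The sketch below uses cksum purely as a stand-in hash. It is not the hash function Postgres-XL uses, so the node it prints for a given key will not match a real cluster. The function name pick_node is hypothetical.

```shell
# pick_node KEY NODE_COUNT
# Print the index (0 .. NODE_COUNT-1) of the node that would store KEY.
# cksum is only a stand-in hash for illustration; Postgres-XL uses its own
# internal hash function, so real placements will differ.
pick_node() {
  crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( $crc % $2 ))
}

# Usage: show where keys 1..8 would land on a two-node cluster.
#   for id in 1 2 3 4 5 6 7 8; do
#     echo "id=$id -> node $(pick_node "$id" 2)"
#   done
```

The same key always maps to the same node, which is why a lookup on the distribution column only needs to visit one datanode. Changing the node count changes the mapping, which is why existing rows have to be moved when a node is added.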

Insert some test data:

# Log in to any coordinator node to run these statements
$ psql -h  datanode1 -p 5432 -U postgres
postgres=# INSERT INTO disttab SELECT generate_series(1,100), generate_series(101, 200), 'foo';
INSERT 0 100
postgres=# INSERT INTO repltab SELECT generate_series(1,100), generate_series(101, 200);
INSERT 0 100

View the resulting data distribution:

# Distribution of the DISTRIBUTE table
postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 1148549230 |    42
 -927910690 |    58
(2 rows)
# Distribution of the REPLICATION table
postgres=# SELECT count(*) FROM repltab;
 count 
-------
   100
(1 row)

Check the repltab table on the other data node, datanode2:

$ psql -p 5433
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.

postgres=# SELECT count(*) FROM repltab;
 count 
-------
   100
(1 row)

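Conclusion: for the REPLICATION table, datanode1 and datanode2 each hold the full data set, and the copies are identical. For the DISTRIBUTE table, the rows are spread roughly evenly across datanode1 and datanode2.

6.2 Adding a datanode and redistributing data

6.2.1 Adding a datanode

Run the pgxc_ctl command on the gtm node, which manages the cluster:

$ pgxc_ctl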
/bin/bash
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /home/postgres/pgxc_ctl
PGXC

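At the PGXC prompt, run the command that adds a data node:

Current directory: /home/postgres/pgxc_ctl
# On the datanode1 server, add a datanode with the master role, named dn3
# The port is set to 5430 and the pooler port to 6669; specify the data directory, then end the command with three none arguments (this grows the cluster from two datanodes to three)
# The none arguments probably correspond to settings such as datanodeSpecificExtraConfig or datanodeSpecificExtraPgHba
PGXC add datanode master dn3 datanode1 5430 6669 /home/postgres/pgxl9.5/data/nodes/dn_master3 none none none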

Once the node has been added, query the status of the cluster nodes:

$ psql -h datanode1 -p 5432 -U postgres
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
 dn3       | D         |      5430 | datanode1 | f              | f                |  -700122826
(5 rows)

The output shows that the node has been added.

6.2.2 Redistributing data

Until now, the DISTRIBUTE table was spread across node1 and node2, as follows:

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 1148549230 |    42
 -927910690 |    58
(2 rows)

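After adding a node, redistribute the sharded table across the three nodes and copy the replicated table to the new node:

# Redistribute the sharded table
postgres=# ALTER TABLE disttab ADD NODE (dn3);
ALTER TABLE
# Copy the data to the new node
postgres=#  ALTER TABLE repltab ADD NODE (dn3);
ALTER TABLE

View the new data distribution: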

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 -700122826 |    36
 -927910690 |    32
 1148549230 |    32
(3 rows)

Log in to dn3 (it was added on the datanode1 server, port 5430) to check the data:

$ psql -h datanode1 -p 5430 -U postgres
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.
postgres=# select count(*) from repltab;
 count 
-------
   100
(1 row)

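Clearly, the ALTER TABLE tt ADD NODE (dn) command redistributes the data of a DISTRIBUTE table onto the new node; all transactions are interrupted while the redistribution runs. For a REPLICATION table, the same command copies the data to the new node.

6.2.3 Reclaiming data from a datanode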

postgres=# ALTER TABLE disttab DELETE NODE (dn3);
ALTER TABLE
postgres=# ALTER TABLE repltab DELETE NODE (dn3);
ALTER TABLE

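6.3 Removing a data node

Postgres-XL does not check whether the datanode being removed still holds data of replicated or distributed tables. For data safety, check the data on the node before removing it. If it holds data, reclaim the data to the other nodes first; only then can the node be removed safely. Removing a data node takes four steps:

Look up the oid of dn3, the node to be removed: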

postgres=#  SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 | coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 16385 | node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 16386 | node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
 16397 | dn3       | D         |      5430 | datanode1 | f              | f                |  -700122826
(5 rows)

Check whether any table still has data on the oid that corresponds to dn3:

testdb=# SELECT * FROM pgxc_class WHERE nodeoids::integer[] @> ARRAY[16397];
 pcrelid | pclocatortype | pcattnum | pchashalgorithm | pchashbuckets |     nodeoids      
---------+---------------+----------+-----------------+---------------+-------------------
   16388 | H             |        1 |               1 |          4096 | 16397 16385 16386
   16394 | R             |        0 |               0 |             0 | 16397 16385 16386
(2 rows)

If there is data, reclaim it first:

postgres=# ALTER TABLE disttab DELETE NODE (dn3);
ALTER TABLE
postgres=# ALTER TABLE repltab DELETE NODE (dn3);
ALTER TABLE
postgres=# SELECT * FROM pgxc_class WHERE nodeoids::integer[] @> ARRAY[16397];
 pcrelid | pclocatortype | pcattnum | pchashalgorithm | pchashbuckets | nodeoids 
---------+---------------+----------+-----------------+---------------+----------
(0 rows)

Safely remove dn3:

PGXC$  remove datanode master dn3 clean

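6.4 Managing coordinator nodes

This works much like datanode management; the commands are listed here without being tested:

# Add a coordinator
PGXC$  add coordinator master coord3 localhost 30003 30013 $dataDirRoot/coord_master.3 none none none
# Remove a coordinator; the clean option also deletes the corresponding data directory
PGXC$  remove coordinator master coord3 clean

6.5 Failover

View the current cluster: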

postgres=# SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 | coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 16385 | node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 16386 | node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
(4 rows)

Simulate a failure of the node1 node:

PGXC$  stop -m immediate datanode master node1
Stopping datanode master node1.
Done.

Test queries against the cluster:

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
ERROR:  Failed to get pooled connections
postgres=# SELECT xc_node_id, * FROM disttab WHERE col1 = 3;
 xc_node_id | col1 | col2 | col3 
------------+------+------+------
 -927910690 |    3 |  103 | foo
(1 row)

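The test shows that a query whose range involves the failed node1 node reports an error, while a query whose data is not on node1 still works.

Manually fail over to node1's slave:

PGXC$  failover datanode node1
# After the failover completes, query the cluster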
postgres=# SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 |

 
 
              
           
              
              
            