[ClickHouse Database] Building a Cluster with Docker (Multiple Shards, Single Replica)
阿新 • Published: 2020-07-20
Create the containers
- Create the network:
docker network create -d bridge iot-net
- Start three database instances. The native protocol port 9000 of each instance is published on host ports 9001, 9002 and 9003 respectively:
docker run -d --name chdb1 --ulimit nofile=262144:262144 --volume=/root/iot/chdb1:/var/lib/clickhouse --publish 9001:9000 --network iot-net yandex/clickhouse-server
docker run -d --name chdb2 --ulimit nofile=262144:262144 --volume=/root/iot/chdb2:/var/lib/clickhouse --publish 9002:9000 --network iot-net yandex/clickhouse-server
docker run -d --name chdb3 --ulimit nofile=262144:262144 --volume=/root/iot/chdb3:/var/lib/clickhouse --publish 9003:9000 --network iot-net yandex/clickhouse-server
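The three `docker run` commands differ only in the instance number, so they can be generated by a loop. The following is a minimal sketch, assuming the same container names, volume paths under /root/iot and host ports as above; the function name `start_instances` is hypothetical and introduced only for illustration.

```shell
# Start N ClickHouse containers named chdb1..chdbN on the iot-net network.
# Host port 9000+i is mapped to the native protocol port 9000 of container i.
start_instances() {
  local count="${1:-3}"
  local i=1
  while [ "$i" -le "$count" ]; do
    docker run -d --name "chdb${i}" \
      --ulimit nofile=262144:262144 \
      --volume="/root/iot/chdb${i}:/var/lib/clickhouse" \
      --publish "$((9000 + i)):9000" \
      --network iot-net \
      yandex/clickhouse-server || return 1
    i=$((i + 1))
  done
}
```

Calling `start_instances 3` issues the same three commands as above and stops at the first container that fails to start.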
Configure cluster mode
- Load the cluster configuration file. First copy the configuration file out of a container:
docker cp chdb1:/etc/clickhouse-server/config.xml ./
- In config.xml, define three shards with one replica each in the custom shard configuration. The cluster name must match the name used later by the Distributed engine (cluster_3shards_1replicas):
<remote_servers>
    <cluster_3shards_1replicas>
        <shard>
            <replica>
                <host>chdb1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <replica>
                <host>chdb2</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <replica>
                <host>chdb3</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_3shards_1replicas>
</remote_servers>
Then copy config.xml back into all three instances and restart them:
docker cp ./config.xml chdb1:/etc/clickhouse-server && docker cp ./config.xml chdb2:/etc/clickhouse-server && docker cp ./config.xml chdb3:/etc/clickhouse-server
docker restart chdb1 chdb2 chdb3
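Both steps above, writing the shard list and pushing the file to every container, are repetitive and can be scripted. The sketch below assumes one single-replica shard per host on port 9000, as in this article. The function names `make_remote_servers` and `push_config` are hypothetical helpers, and the generated fragment still has to be placed into config.xml by hand.

```shell
# Print a <remote_servers> section with one single-replica shard per host.
# Usage: make_remote_servers CLUSTER_NAME HOST...
make_remote_servers() {
  local cluster="$1"
  shift
  local host
  printf '<remote_servers>\n'
  printf '    <%s>\n' "$cluster"
  for host in "$@"; do
    printf '        <shard>\n'
    printf '            <replica>\n'
    printf '                <host>%s</host>\n' "$host"
    printf '                <port>9000</port>\n'
    printf '            </replica>\n'
    printf '        </shard>\n'
  done
  printf '    </%s>\n' "$cluster"
  printf '</remote_servers>\n'
}

# Copy ./config.xml into every named container, then restart them all.
# Returns early on a failed copy, so no restart happens with a partial rollout.
push_config() {
  local name
  for name in "$@"; do
    docker cp ./config.xml "${name}:/etc/clickhouse-server" || return 1
  done
  docker restart "$@"
}
```

For this article's setup the calls would be `make_remote_servers cluster_3shards_1replicas chdb1 chdb2 chdb3` followed, after editing config.xml, by `push_config chdb1 chdb2 chdb3`.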
Verify the cluster
- Connect to any instance:
clickhouse-client --port 9001
Run the following query to see the current cluster information:
e16ff05d1ca6 :) select * from system.clusters;

SELECT * FROM system.clusters

┌─cluster───────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ cluster_3shards_1replicas │         1 │            1 │           1 │ chdb1     │ 172.25.0.2   │ 9000 │        1 │ default │                  │            0 │                       0 │
│ cluster_3shards_1replicas │         2 │            1 │           1 │ chdb2     │ 172.25.0.3   │ 9000 │        0 │ default │                  │            0 │                       0 │
│ cluster_3shards_1replicas │         3 │            1 │           1 │ chdb3     │ 172.25.0.4   │ 9000 │        0 │ default │                  │            0 │                       0 │
└───────────────────────────┴───────────┴──────────────┴─────────────┴───────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘

3 rows in set. Elapsed: 0.006 sec.
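The same check can be made non-interactively, which is handy in a setup script. This is a minimal sketch, assuming `clickhouse-client` is installed on the host and the cluster is named `cluster_3shards_1replicas`; the function name `check_cluster` is hypothetical.

```shell
# Succeed only if system.clusters lists the expected number of rows for the
# given cluster. With one replica per shard, one row corresponds to one shard.
# Usage: check_cluster HOST_PORT CLUSTER_NAME EXPECTED_ROWS
check_cluster() {
  local port="$1"
  local cluster="$2"
  local expected="$3"
  local actual
  actual="$(clickhouse-client --port "$port" --query \
    "SELECT count() FROM system.clusters WHERE cluster = '${cluster}'")" || return 1
  [ "$actual" -eq "$expected" ]
}
```

For example, `check_cluster 9001 cluster_3shards_1replicas 3 && echo "cluster ok"` prints the message only when all three shards are registered on the first instance.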
- Create the test table on each of the three instances:
create table population
(
    `ozone` Int8,
    `particullate_matter` Int8,
    `carbon_monoxide` Int8,
    `sulfure_dioxide` Int8,
    `nitrogen_dioxide` Int8,
    `longitude` Float64,
    `latitude` Float64,
    `timestamp` DateTime
)
ENGINE = MergeTree()
ORDER BY `timestamp`
PRIMARY KEY `timestamp`
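The table can also be created directly with the client, without opening an interactive session on each instance and running the SQL there (repeat with --port 9002 and --port 9003 for the other two instances):
clickhouse-client --port 9001 \
--query 'CREATE TABLE population ( `ozone` Int8, `particullate_matter` Int8, `carbon_monoxide` Int8, `sulfure_dioxide` Int8, `nitrogen_dioxide` Int8, `longitude` Float64, `latitude` Float64, `timestamp` DateTime ) ENGINE = MergeTree() ORDER BY `timestamp` PRIMARY KEY `timestamp`'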
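Because the local table has to exist on every shard, the client call can be wrapped in a loop over the published ports. A minimal sketch, assuming host ports 9001 to 9003 as set up earlier; the helper name `run_on_all` and the variable `ddl` are hypothetical.

```shell
# DDL for the local table; column names are copied verbatim from the article.
ddl='CREATE TABLE population ( `ozone` Int8, `particullate_matter` Int8, `carbon_monoxide` Int8, `sulfure_dioxide` Int8, `nitrogen_dioxide` Int8, `longitude` Float64, `latitude` Float64, `timestamp` DateTime ) ENGINE = MergeTree() ORDER BY `timestamp` PRIMARY KEY `timestamp`'

# Run one query against every instance, identified by its host port.
# Usage: run_on_all QUERY PORT...
run_on_all() {
  local query="$1"
  shift
  local port
  for port in "$@"; do
    clickhouse-client --port "$port" --query "$query" || return 1
  done
}

# Example call (commented out so that sourcing this file has no side effects):
# run_on_all "$ddl" 9001 9002 9003
```

The loop stops at the first instance that rejects the query, which makes a half-created schema easier to notice.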
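- Create the distributed table. A distributed table can be thought of as a router: it determines how data flows to a specific instance in the cluster: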
CREATE TABLE population_all AS population ENGINE = Distributed(cluster_3shards_1replicas, default, population, rand())
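The last argument, `rand()`, is the sharding key. For each inserted row, ClickHouse evaluates the key and takes the remainder of dividing it by the total shard weight (3 here, since every shard has weight 1) to pick the target shard. The sketch below only simulates that rule locally with awk to show the resulting spread; it does not talk to ClickHouse, and the function name `simulate_routing` is hypothetical.

```shell
# Simulate routing with a random sharding key across equally weighted shards:
# each row gets a random key, and key % shard_count selects the target shard
# (remainder 0 goes to shard 1, remainder 1 to shard 2, and so on).
# Usage: simulate_routing ROW_COUNT SHARD_COUNT
simulate_routing() {
  local rows="$1"
  local shards="$2"
  awk -v rows="$rows" -v shards="$shards" 'BEGIN {
    srand()
    for (i = 0; i < rows; i++) {
      key = int(rand() * 4294967296)   # stand-in for the UInt32 value of rand()
      count[key % shards]++
    }
    for (s = 0; s < shards; s++)
      printf "shard %d: %d rows\n", s + 1, count[s]
  }'
}
```

Running `simulate_routing 17568 3` prints three counts that vary from run to run but each stay close to one third of the rows, which is the behaviour to expect from a random sharding key.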
- Import the data through the distributed table on this instance:
root@mq-227 ~/i/db_file# cat pollutionData204273.csv | wc -l
17568
clickhouse-client --port 9001 --query "INSERT INTO population_all FORMAT CSV" < ./pollutionData204273.csv
Query the tables to get the current row counts. The distributed table reports the total, and the local table on each instance reports that shard's share:
root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population_all"
17568
root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population"
5955
root@mq-227 ~/i/db_file# clickhouse-client --port 9002 --query "select count(*) from population"
5690
root@mq-227 ~/i/db_file# clickhouse-client --port 9003 --query "select count(*) from population"
5923
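The three local counts add up to the total (5955 + 5690 + 5923 = 17568), so the data has been distributed across the three shards.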
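The comparison of the distributed count with the per-shard counts can also be automated. A minimal sketch, assuming the distributed table `population_all` exists on the instance behind the first port given and the local table `population` exists on all of them; the function name `check_distribution` is hypothetical.

```shell
# Compare the row count seen through the Distributed table with the sum of
# the local table counts on every instance.
# Usage: check_distribution PORT...  (the first port must host population_all)
check_distribution() {
  local total
  local sum=0
  local n
  local port
  total="$(clickhouse-client --port "$1" --query "SELECT count() FROM population_all")" || return 1
  for port in "$@"; do
    n="$(clickhouse-client --port "$port" --query "SELECT count() FROM population")" || return 1
    sum=$((sum + n))
  done
  echo "distributed=${total} local_sum=${sum}"
  [ "$sum" -eq "$total" ]
}
```

With the counts reported in this article, `check_distribution 9001 9002 9003` would print `distributed=17568 local_sum=17568` and return success.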