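Deploying a Redis Cluster High-Availability Test Environment with Docker
Background:
An earlier post deployed a single Redis instance with Docker. This post uses Docker to deploy a 6-node (3 masters, 3 slaves) Redis Cluster high-availability test environment.
Environment and configuration:
1. Create the directories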
[root@localhost dir_redis_cluster]# tree
.
├── m7026
│   ├── data
│   │   ├── appendonly.aof
│   │   ├── dump.rdb
│   │   ├── nodes-7021.conf
│   │   ├── redis_1.log
│   │   ├── redis_1.pid
│   │   └── redis_1.sock
│   └── redis.cnf
├── m7027
│   ├── data
│   │   ├── appendonly.aof
│   │   ├── dump.rdb
│   │   ├── nodes-7021.conf
│   │   ├── redis_1.log
│   │   ├── redis_1.pid
│   │   └── redis_1.sock
│   └── redis.cnf
├── m7028
│   ├── data
│   │   ├── appendonly.aof
│   │   ├── dump.rdb
│   │   ├── nodes-7021.conf
│   │   ├── redis_1.log
│   │   ├── redis_1.pid
│   │   └── redis_1.sock
│   └── redis.cnf
├── s7026
│   ├── data
│   │   ├── appendonly.aof
│   │   ├── dump.rdb
│   │   ├── nodes-7021.conf
│   │   ├── redis_1.log
│   │   ├── redis_1.pid
│   │   └── redis_1.sock
│   └── redis.cnf
├── s7027
│   ├── data
│   │   ├── appendonly.aof
│   │   ├── dump.rdb
│   │   ├── nodes-7021.conf
│   │   ├── redis_1.log
│   │   ├── redis_1.pid
│   │   └── redis_1.sock
│   └── redis.cnf
└── s7028
    ├── data
    │   ├── appendonly.aof
    │   ├── dump.rdb
    │   ├── nodes-7021.conf
    │   ├── redis_1.log
    │   ├── redis_1.pid
    │   └── redis_1.sock
    └── redis.cnf
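The layout above can also be generated rather than created by hand. The following is a minimal Python sketch, not the author's method: it assumes the six redis.cnf files differ only in the port (the post shows only m7026/redis.cnf), and the `{port}` placeholder and the `create_layout` helper are hypothetical names introduced for illustration. The files under data/ are created by Redis itself at runtime, so only the directories and redis.cnf are written.

```python
from pathlib import Path
from typing import List

# Directory names taken from the tree listing: m = master, s = slave.
NODES = ["m7026", "m7027", "m7028", "s7026", "s7027", "s7028"]


def create_layout(base: Path, template: str) -> List[Path]:
    """Create <base>/<node>/data and write <base>/<node>/redis.cnf for each node.

    `template` is the text of a redis.cnf in which the port appears as the
    hypothetical placeholder "{port}" (an assumption of this sketch).
    """
    written = []
    for node in NODES:
        port = node[1:]  # "m7026" -> "7026"
        (base / node / "data").mkdir(parents=True, exist_ok=True)
        conf = base / node / "redis.cnf"
        conf.write_text(template.replace("{port}", port))
        written.append(conf)
    return written
```

Calling `create_layout(Path("/data/data/dir_redis_cluster"), template_text)` would produce the same six directories that the compose file later mounts into the containers.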
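2. Edit the Redis configuration file redis.cnf: keep the earlier single-instance settings and add the cluster settings
################################ REDIS CLUSTER ###############################
# Cluster switch. Cluster mode is disabled by default.
cluster-enabled yes
# Name of the cluster configuration file. Every node has its own cluster configuration file that persists the cluster state. It is not edited by hand: Redis creates and updates it. Each cluster node needs a separate file, so make sure the name does not clash with another instance running on the same system.
cluster-config-file nodes-7021.conf
# Node timeout in milliseconds: how long a node may be unreachable before it is considered failing.
cluster-node-timeout 30000
# During a failover every slave asks to be promoted to master, but a slave that has been disconnected from its master for a while may hold stale data and should not be promoted. This parameter decides whether a slave has been disconnected for too long.
# The slave's disconnection time is compared with (node-timeout * slave-validity-factor) + repl-ping-slave-period.
# With a node timeout of 30 seconds, slave-validity-factor 10 and the default repl-ping-slave-period of 10 seconds, a slave disconnected for more than 310 seconds will not attempt a failover.
# A master can therefore fail with no slave able to take over, leaving the cluster unable to work until the original master rejoins.
# If set to 0, a slave always attempts to take over, however long it has been disconnected from its master.
cluster-slave-validity-factor 10
# A slave migrates to an orphaned master (one left without slaves) only if its own master keeps at least this many working slaves. For example, with a value of 2, a slave tries to migrate only when its master would still be left with 2 working slaves.
# In other words, this is the minimum number of slaves a master must retain before one of its slaves may migrate.
# cluster-migration-barrier 1
# By default the cluster state is ok, and the cluster serves requests, only when every slot is assigned to a node. Setting this to no lets the cluster serve requests while some slots are uncovered. Setting it to no is not recommended: during a partition the masters in the minority partition can keep accepting writes, leaving data inconsistent for a long time.
# When the nodes holding some keys are unavailable: with "yes" (the default) the whole cluster stops accepting operations; with "no" the cluster keeps serving requests for keys on reachable nodes.
cluster-require-full-coverage yes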
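The staleness rule in the comments for cluster-slave-validity-factor is simple arithmetic, and a short Python sketch makes the numbers easy to check. The function name is hypothetical; the formula is the one quoted in the comment above. Note that the comment's 310-second example assumes the default repl-ping-slave-period of 10 seconds, whereas the full configuration file in this post sets it to 5.

```python
def max_disconnect_seconds(node_timeout_ms: int,
                           validity_factor: int,
                           repl_ping_period_s: int) -> float:
    """Longest master-link outage after which a slave still attempts failover.

    Implements (node-timeout * slave-validity-factor) + repl-ping-slave-period,
    with node-timeout converted from milliseconds to seconds.
    """
    if validity_factor == 0:
        # A factor of 0 disables the check: the slave always tries to fail over.
        return float("inf")
    return node_timeout_ms / 1000 * validity_factor + repl_ping_period_s


# Example from the comment: 30 s timeout, factor 10, default ping period 10 s.
print(max_disconnect_seconds(30000, 10, 10))
# Same settings with repl-ping-slave-period 5, as in this post's redis.cnf.
print(max_disconnect_seconds(30000, 10, 5))
```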
The complete configuration file:
[root@localhost dir_redis_cluster]# cat m7026/redis.cnf
daemonize no
protected-mode yes
pidfile "/data/data/redis_1.pid"
port 7026
tcp-backlog 511
bind 0.0.0.0
unixsocket "/data/data/redis_1.sock"
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/data/data/redis_1.log"
databases 16
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum no
dbfilename "dump.rdb"
dir "/data/data"
masterauth "redis"
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-ping-slave-period 5
repl-timeout 60
repl-disable-tcp-nodelay no
repl-backlog-size 32mb
repl-backlog-ttl 3600
slave-priority 100
requirepass "redis"
rename-command FLUSHDB REDIS_FLUSHDB
rename-command FLUSHALL REDIS_FLUSHALL
rename-command KEYS REDIS_KEYS
maxmemory 128mb
maxmemory-policy allkeys-lru
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 1000
latency-monitor-threshold 0
notify-keyspace-events "e"
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
# Generated by CONFIG REWRITE
################################ REDIS CLUSTER ###############################
# Cluster settings; each directive is explained in step 2.
cluster-enabled yes
cluster-config-file nodes-7021.conf
cluster-node-timeout 30000
cluster-slave-validity-factor 10
# cluster-migration-barrier 1
cluster-require-full-coverage yes
3. Create the docker-compose file. The network redisnet is declared external, so it must already exist with a subnet that covers the static addresses below (162.29.0.0/24 here), for example created with docker network create --subnet 162.29.0.0/24 redisnet.
[root@localhost data]# cat compose_redis_cluster.yaml
version: '2'
networks:
  redisnet:
    external: true
    # ipam:
    #   driver: default
    #   config:
    #     - subnet: 162.29.0.0/24
services:
  redis_m7026:
    image: redis:4.0.8
    container_name: redis_m7026
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 17026:7026
    volumes:
      - /data/data/dir_redis_cluster/m7026:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.26
  redis_m7027:
    image: redis:4.0.8
    container_name: redis_m7027
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 17027:7027
    volumes:
      - /data/data/dir_redis_cluster/m7027:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.27
  redis_m7028:
    image: redis:4.0.8
    container_name: redis_m7028
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 17028:7028
    volumes:
      - /data/data/dir_redis_cluster/m7028:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.28
  redis_s7026:
    image: redis:4.0.8
    container_name: redis_s7026
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 27026:7026
    volumes:
      - /data/data/dir_redis_cluster/s7026:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.126
  redis_s7027:
    image: redis:4.0.8
    container_name: redis_s7027
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 27027:7027
    volumes:
      - /data/data/dir_redis_cluster/s7027:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.127
  redis_s7028:
    image: redis:4.0.8
    container_name: redis_s7028
    command: redis-server /data/redis.cnf
    #restart: always
    ports:
      - 27028:7028
    volumes:
      - /data/data/dir_redis_cluster/s7028:/data
    networks:
      redisnet:
        ipv4_address: 162.29.0.128
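The six service entries differ only in name, published port, Redis port and static IP, so they are easy to generate and sanity-check. The Python sketch below is an illustration under assumptions rather than part of the original setup: `render_service` and `render_services` are hypothetical helpers, the subnet is the one from the commented-out ipam block, and the output covers only the services section.

```python
import ipaddress

# Subnet taken from the commented-out ipam block of the compose file.
SUBNET = ipaddress.ip_network("162.29.0.0/24")

# (directory/service suffix, host port, Redis port, static IP), as in the compose file.
NODES = [
    ("m7026", 17026, 7026, "162.29.0.26"),
    ("m7027", 17027, 7027, "162.29.0.27"),
    ("m7028", 17028, 7028, "162.29.0.28"),
    ("s7026", 27026, 7026, "162.29.0.126"),
    ("s7027", 27027, 7027, "162.29.0.127"),
    ("s7028", 27028, 7028, "162.29.0.128"),
]


def render_service(name: str, host_port: int, redis_port: int, ip: str) -> str:
    """Render one service entry; reject addresses outside the network's subnet."""
    if ipaddress.ip_address(ip) not in SUBNET:
        raise ValueError(f"{ip} is outside {SUBNET}")
    return "\n".join([
        f"  redis_{name}:",
        "    image: redis:4.0.8",
        f"    container_name: redis_{name}",
        "    command: redis-server /data/redis.cnf",
        "    ports:",
        f"      - {host_port}:{redis_port}",
        "    volumes:",
        f"      - /data/data/dir_redis_cluster/{name}:/data",
        "    networks:",
        "      redisnet:",
        f"        ipv4_address: {ip}",
    ])


def render_services() -> str:
    """Render the whole services section for the six nodes."""
    return "services:\n" + "\n".join(render_service(*node) for node in NODES) + "\n"
```

Printing `render_services()` gives a block that can be pasted under the networks section; the subnet check catches a mistyped address before docker-compose does.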
Installation
1. Start the containers
docker-compose -f compose_redis_cluster.yaml up -d
2. Check that they started
# To remove the containers later: docker-compose -f compose_redis_cluster.yaml rm
[root@localhost data]# docker-compose -f compose_redis_cluster.yaml ps
   Name                  Command               State                 Ports
----------------------------------------------------------------------------------------
redis_m7026   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17026->7026/tcp
redis_m7027   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17027->7027/tcp
redis_m7028   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17028->7028/tcp
redis_s7026   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27026->7026/tcp
redis_s7027   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27027->7027/tcp
redis_s7028   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27028->7028/tcp
3. Check the cluster state before the cluster is created
[root@localhost data]# docker exec -it 7c0f844b2998 /bin/bash
root@7c0f844b2998:/data# redis-cli -h 127.0.0.1 -p 7028
127.0.0.1:7028> cluster info
cluster_state:fail    ### the state is fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
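When this check is scripted, the key:value output of cluster info is easy to turn into a dictionary. The following Python sketch is only an illustration: `parse_cluster_info` and `is_ready` are hypothetical helper names, and the readiness rule (state ok and all 16384 slots assigned) is an assumption based on the fields shown above.

```python
def parse_cluster_info(text: str) -> dict:
    """Parse the key:value lines printed by CLUSTER INFO into a dict."""
    info = {}
    for line in text.splitlines():
        # Drop trailing "### ..." annotations such as the ones used in this post.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = int(value) if value.isdigit() else value
    return info


def is_ready(info: dict) -> bool:
    """Treat the cluster as usable when the state is ok and every slot is assigned."""
    return info.get("cluster_state") == "ok" and info.get("cluster_slots_assigned") == 16384


sample = """cluster_state:fail    ### the state is fail
cluster_slots_assigned:0
cluster_known_nodes:1
cluster_size:0"""
print(is_ready(parse_cluster_info(sample)))
```

For the freshly started node above this reports that the cluster is not ready, which is expected before the create step.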
5. Run create to build the cluster
# The old creation method fails with the message below
./redis-trib.rb create --replicas 1 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026 162.29.0.127:7027 162.29.0.128:7028
WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.
All commands and features belonging to redis-trib.rb have been moved to redis-cli.
In order to use them you should call redis-cli with the --cluster option followed by the subcommand name, arguments and options.
Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]
Example:
redis-cli --cluster create 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026 162.29.0.26:7026 162.29.0.127:7027 162.29.0.128:7028 --cluster-replicas 1
To get help about all subcommands, type:
redis-cli --cluster help
[root@localhost data]# redis-cli --cluster help
# This example uses the new method (redis-cli 5.0 or later on the host)
[root@localhost data]# redis-cli --cluster create 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026 162.29.0.127:7027 162.29.0.128:7028 --cluster-replicas 1 -a redis
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 162.29.0.127:7027 to 162.29.0.26:7026
Adding replica 162.29.0.128:7028 to 162.29.0.27:7027
Adding replica 162.29.0.126:7026 to 162.29.0.28:7028
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
......
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
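The three ranges in the allocation output (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 slots evenly over the masters and rounding the boundaries. The Python sketch below reproduces those ranges; it is modeled on the numbers printed above and is an assumption about the rounding used, not a copy of redis-cli's code. The name `split_slots` is hypothetical.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(masters: int):
    """Split slots 0..16383 into one contiguous range per master."""
    per_master = TOTAL_SLOTS / masters  # e.g. 5461.33 for 3 masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round the running boundary to the nearest slot (half rounds up).
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The last master always takes everything that remains.
        if last > TOTAL_SLOTS - 1 or i == masters - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for index, (first, last) in enumerate(split_slots(3)):
    print(f"Master[{index}] -> Slots {first} - {last}")
```

With three masters the middle one ends up with 5462 slots and the other two with 5461, which matches the slot counts shown in the cluster check.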
6. Check the cluster state after creation
[root@localhost data]# redis-cli --cluster check 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@localhost data]# redis-cli --cluster info 162.29.0.26:7026 -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
7. Log in to one Redis node and view the cluster information
[root@localhost data]# redis-cli -h 162.29.0.26 -p 7026 -a redis
162.29.0.26:7026> info cluster
# Cluster
cluster_enabled:1
162.29.0.26:7026>
162.29.0.26:7026> cluster info
cluster_state:ok    ###### the cluster state is ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6    ####### 6 nodes in total
cluster_size:3    ####### 3 masters
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:454
cluster_stats_messages_pong_sent:471
cluster_stats_messages_sent:925
cluster_stats_messages_ping_received:466
cluster_stats_messages_pong_received:454
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:925
162.29.0.26:7026> cluster nodes
ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026@17026 myself,master - 0 1595960257000 1 connected 0-5460
c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027@17027 master - 0 1595960259352 2 connected 5461-10922
868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026@17026 slave b4776a521a9c88c5eef4c76822a33fb0e23a4fac 0 1595960258000 4 connected
e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027@17027 slave ecca804919cd72765814b87c1e91bb5189f9814f 0 1595960259000 5 connected
b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028@17028 master - 0 1595960257334 3 connected 10923-16383
2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028@17028 slave c96261692b546c5d9b5a45b8596c65461b6acd1a 0 1595960257000 6 connected
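Each line of cluster nodes carries, in order, the node ID, address (ip:port@cluster-bus-port), flags, master ID, ping and pong timestamps, config epoch, link state and any slot ranges. A small Python sketch for pulling out the master/slave pairing follows; the helper names are hypothetical, and the sample is two lines copied from the output above.

```python
def parse_cluster_nodes(text: str):
    """Turn CLUSTER NODES output into a list of dicts, one per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        flags = fields[2].split(",")
        nodes.append({
            "id": fields[0],
            "addr": fields[1].split("@")[0],  # drop the cluster bus port
            "role": "master" if "master" in flags else "slave",
            "master_id": None if fields[3] == "-" else fields[3],
            "slots": fields[8:],  # empty for slaves
        })
    return nodes


def slaves_by_master(nodes):
    """Map each master's address to the addresses of its slaves."""
    addr_by_id = {n["id"]: n["addr"] for n in nodes}
    pairing = {n["addr"]: [] for n in nodes if n["role"] == "master"}
    for n in nodes:
        if n["role"] == "slave" and n["master_id"] in addr_by_id:
            pairing[addr_by_id[n["master_id"]]].append(n["addr"])
    return pairing


sample = """
ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026@17026 myself,master - 0 1595960257000 1 connected 0-5460
e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027@17027 slave ecca804919cd72765814b87c1e91bb5189f9814f 0 1595960259000 5 connected
"""
print(slaves_by_master(parse_cluster_nodes(sample)))
```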
Adding nodes
To add nodes, copy the configuration file into two new directories and start two more containers:
docker run -itd --name addm7026 -v /data/data/dir_redis_cluster/add_m7026:/data --net redisnet -p 17029:7026 --ip 162.29.0.29 redis:4.0.8 redis-server /data/redis.cnf
docker run -itd --name adds7026 -v /data/data/dir_redis_cluster/add_s7026:/data --net redisnet -p 27029:7026 --ip 162.29.0.129 redis:4.0.8 redis-server /data/redis.cnf
Add the master node 162.29.0.29:7026
####### Old method
#./redis-trib.rb add-node 192.168.100.134:17022 192.168.100.134:17021
[root@localhost add_s7026]# redis-cli --cluster add-node 162.29.0.29:7026 162.29.0.26:7026 -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 162.29.0.29:7026 to cluster 162.29.0.26:7026
>>> Send CLUSTER MEET to node 162.29.0.29:7026 to make it join the cluster.
[OK] New node added correctly.
# Check the newly added node
[root@localhost add_s7026]# redis-cli --cluster info 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 0 slots | 0 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
Add 162.29.0.129:7026 as a slave of the master just added. The value for --cluster-master-id can be looked up with redis-cli --cluster check.
####### Old method
#./redis-trib.rb add-node --slave --master-id 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.135:17022 192.168.100.134:17021
[root@localhost add_s7026]# redis-cli --cluster add-node 162.29.0.129:7026 162.29.0.26:7026 -a redis --cluster-slave --cluster-master-id addfe99eb1187be11d474b0034e59081a418db0f
>>> Adding node 162.29.0.129:7026 to cluster 162.29.0.26:7026
>>> Send CLUSTER MEET to node 162.29.0.129:7026 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 162.29.0.29:7026.
Migrate some slots to the new node
[root@localhost add_s7026]# redis-cli --cluster reshard 162.29.0.26:7026 -a redis
How many slots do you want to move (from 1 to 16384)? 10    ##### number of slots to migrate
What is the receiving node ID? addfe99eb1187be11d474b0034e59081a418db0f    ###### ID of the new node
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: all    ###### all: take the 10 slots from the other masters together, a few from each
Ready to move 10 slots.
  Source nodes:
    M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
       slots:[0-5460] (5461 slots) master
       1 additional replica(s)
    M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
       slots:[5461-10922] (5462 slots) master
       1 additional replica(s)
    M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
       slots:[10923-16383] (5461 slots) master
       1 additional replica(s)
  Destination node:
    M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
       slots: (0 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 5461 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5462 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5463 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5464 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 0 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 1 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 2 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 10923 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
    Moving slot 10924 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
    Moving slot 10925 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 5461 from 162.29.0.27:7027 to 162.29.0.29:7026:
Moving slot 5462 from 162.29.0.27:7027 to 162.29.0.29:7026:
Moving slot 5463 from 162.29.0.27:7027 to 162.29.0.29:7026:
Moving slot 5464 from 162.29.0.27:7027 to 162.29.0.29:7026:
Moving slot 0 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 1 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 2 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 10923 from 162.29.0.28:7028 to 162.29.0.29:7026:
Moving slot 10924 from 162.29.0.28:7028 to 162.29.0.29:7026:
Moving slot 10925 from 162.29.0.28:7028 to 162.29.0.29:7026:
[root@localhost add_s7026]# redis-cli --cluster reshard 162.29.0.26:7026 -a redis
How many slots do you want to move (from 1 to 16384)? 10
What is the receiving node ID? addfe99eb1187be11d474b0034e59081a418db0f
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: ecca804919cd72765814b87c1e91bb5189f9814f    ###### enter a source node ID, then done: all 10 slots come from that node
Source node #2: done
Ready to move 10 slots.
  Source nodes:
    M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
       slots:[3-5460] (5458 slots) master
       1 additional replica(s)
  Destination node:
    M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
       slots:[0-2],[5461-5464],[10923-10925] (10 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 3 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 4 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 5 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 6 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 7 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 8 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 9 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 10 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 11 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 12 from ecca804919cd72765814b87c1e91bb5189f9814f
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 3 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 4 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 5 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 6 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 7 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 8 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 9 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 10 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 11 from 162.29.0.26:7026 to 162.29.0.29:7026:
Moving slot 12 from 162.29.0.26:7026 to 162.29.0.29:7026:
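With all as the source, the first reshard took 4 slots from 162.29.0.27 and 3 each from the other two masters. One way to arrive at that split is to share the requested slots in proportion to what each source owns, rounding up for the largest source and down for the rest. The Python sketch below reproduces the plan shown above under that assumption; it is inferred from the output and is not taken from redis-cli's source, and `plan_reshard` is a hypothetical name.

```python
import math


def plan_reshard(sources, numslots):
    """Decide how many slots to take from each source master.

    `sources` is a list of (address, owned_slots). Sources are visited from
    the largest to the smallest; ties keep their input order.
    """
    ordered = sorted(sources, key=lambda s: s[1], reverse=True)
    total = sum(slots for _, slots in ordered)
    plan = []
    for i, (addr, slots) in enumerate(ordered):
        share = numslots / total * slots  # proportional share, e.g. 3.33
        count = math.ceil(share) if i == 0 else math.floor(share)
        plan.append((addr, count))
    return plan


sources = [
    ("162.29.0.26:7026", 5461),
    ("162.29.0.27:7027", 5462),
    ("162.29.0.28:7028", 5461),
]
for addr, count in plan_reshard(sources, 10):
    print(f"{count} slots from {addr}")
```

For 10 slots this gives 4, 3 and 3, in the same order as the resharding plan in the first run.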
The slots are now unevenly distributed, so run a rebalance
[root@localhost add_s7026]# redis-cli --cluster rebalance 162.29.0.26:7026 -a redis
>>> Rebalancing across 4 nodes. Total weight = 4.00
Moving 1362 slots from 162.29.0.28:7028 to 162.29.0.29:7026
Moving 1362 slots from 162.29.0.27:7027 to 162.29.0.29:7026
Moving 1352 slots from 162.29.0.26:7026 to 162.29.0.29:7026
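The three move counts follow from simple arithmetic: with four masters of equal weight each should hold 16384 / 4 = 4096 slots, and every master above that target gives up its surplus. The Python sketch below works this through, assuming equal weights; the per-node counts before the rebalance are derived from the two reshard runs (5461 - 3 - 10, 5462 - 4, 5461 - 3, and 10 + 10) rather than printed anywhere in the post, and `rebalance_surplus` is a hypothetical name.

```python
def rebalance_surplus(slots_by_node: dict) -> dict:
    """Return how many slots each over-target master must give away.

    Assumes equal weights and a slot total divisible by the node count.
    """
    target = sum(slots_by_node.values()) // len(slots_by_node)
    return {node: owned - target
            for node, owned in slots_by_node.items()
            if owned > target}


before = {
    "162.29.0.26:7026": 5448,  # 5461 - 3 - 10
    "162.29.0.27:7027": 5458,  # 5462 - 4
    "162.29.0.28:7028": 5458,  # 5461 - 3
    "162.29.0.29:7026": 20,    # 10 + 10
}
print(rebalance_surplus(before))
```

The surpluses are 1352, 1362 and 1362 slots, which add up to the 4076 slots the new node needs to reach 4096.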
Cluster failover
1. Automatic failover after a simulated machine failure
Kill one of the instances:
### Simulate a failure of master3
[root@localhost add_s7026]# docker kill 92595c9ca5e0
### Check the logs
## master1 log
[root@localhost dir_redis_cluster]# tail -f m7026/data/redis_1.log
1:M 29 Jul 10:38:30.696 * Marking node b4776a521a9c88c5eef4c76822a33fb0e23a4fac as failing (quorum reached).
1:M 29 Jul 10:38:30.696 # Cluster state changed: fail
1:M 29 Jul 10:38:31.510 # Failover auth granted to 868cfd0950e603077616c11fa54b457bffe170ba for epoch 8
1:M 29 Jul 10:38:31.514 # Cluster state changed: ok
## master2 log
[root@localhost dir_redis_cluster]# tail -f m7027/data/redis_1.log
1:M 29 Jul 10:38:30.698 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:M 29 Jul 10:38:30.698 # Cluster state changed: fail
1:M 29 Jul 10:38:31.510 # Failover auth granted to 868cfd0950e603077616c11fa54b457bffe170ba for epoch 8
1:M 29 Jul 10:38:31.515 # Cluster state changed: ok
## slave1 log
[root@localhost dir_redis_cluster]# tail -f s7026/data/redis_1.log
1:S 29 Jul 10:38:30.696 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.696 # Cluster state changed: fail
1:S 29 Jul 10:38:30.794 # Start of election delayed for 646 milliseconds (rank #0, offset 13832).
1:S 29 Jul 10:38:31.508 # Starting a failover election for epoch 8.
1:S 29 Jul 10:38:31.511 # Failover election won: I'm the new master.
1:S 29 Jul 10:38:31.511 # configEpoch set to 8 after successful failover
1:M 29 Jul 10:38:31.511 # Setting secondary replication ID to adc9070f7647046e340d29e4808dec78a3fa68ba, valid up to offset: 13833. New replication ID is fc257229c6d869841043ee7aefc9af00e656eb57
1:M 29 Jul 10:38:31.511 * Discarding previously cached master state.
1:M 29 Jul 10:38:31.511 # Cluster state changed: ok
## slave2 log
[root@localhost dir_redis_cluster]# tail -f s7027/data/redis_1.log
1:S 29 Jul 10:38:30.698 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.698 # Cluster state changed: fail
1:S 29 Jul 10:38:31.513 # Cluster state changed: ok
## slave3 log
[root@localhost dir_redis_cluster]# tail -f s7028/data/redis_1.log
1:S 29 Jul 10:38:30.697 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.697 # Cluster state changed: fail
1:S 29 Jul 10:38:31.513 # Cluster state changed: ok
Check the cluster state:
[root@localhost add_s7026]# redis-cli --cluster check 162.29.0.26:7026 -a redis
Could not connect to Redis at 162.29.0.28:7028: No route to host
162.29.0.26:7026 (ecca8049...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.126:7026 (868cfd09...) -> 0 keys | 4096 slots | 0 slaves.    ### 0.126 has become a master
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: c7d5176346824f59cd01e481e4fd69fb5e015eed 162.29.0.129:7026
   slots: (0 slots) slave
   replicates addfe99eb1187be11d474b0034e59081a418db0f
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
M: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots:[12288-16383] (4096 slots) master
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
When the original master instance is restarted, it rejoins the cluster automatically, now as a slave.
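2. Manual failover
A manual failover is started on a chosen slave: the master and slave swap roles, the slave becomes the new master and serves clients, and the old master becomes its slave.
cluster failover: manually promotes the slave to master.
cluster failover force: for cases where the master is down and automatic failover cannot complete. On receiving cluster failover force the slave starts an election directly, without confirming the replication offset with the master (any data the slave had not yet replicated is lost). Once it wins the election it takes over as the new master and broadcasts the new cluster configuration.
cluster failover takeover: for cases where more than half of the masters in the cluster are down, so a slave cannot collect votes from a majority of masters and the election cannot complete. cluster failover takeover forces the switch. Because a takeover does not go through a leader election, config epochs may conflict.
For a manual failover, choose the mildest option that meets the need, in this order of preference: cluster failover > cluster failover force > cluster failover takeover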
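The difference between force and takeover comes down to whether an election can still be won, that is, whether a majority of masters is reachable to vote. A minimal Python sketch of that check follows; the function names are hypothetical, and it counts masters only, on the assumption that slaves do not vote.

```python
def votes_needed(total_masters: int) -> int:
    """A slave needs votes from a strict majority of all masters."""
    return total_masters // 2 + 1


def election_possible(total_masters: int, reachable_masters: int) -> bool:
    """True when enough masters are reachable for an election to succeed."""
    return reachable_masters >= votes_needed(total_masters)


# Four masters with one down, as in the automatic failover test: 3 of 4 can vote.
print(election_possible(4, 3))
# Four masters with only two reachable: no majority, so only takeover would work.
print(election_possible(4, 2))
```

In the first case cluster failover or cluster failover force can complete; in the second neither can, which is the situation cluster failover takeover is meant for.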
## Log in to the slave 0.28:7028 and switch over manually
162.29.0.28:7028> cluster failover
## Check the cluster state
[root@localhost add_s7026]# redis-cli --cluster check 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 4096 slots | 1 slaves.    ### 0.28:7028 is a master again
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: c7d5176346824f59cd01e481e4fd69fb5e015eed 162.29.0.129:7026
   slots: (0 slots) slave
   replicates addfe99eb1187be11d474b0034e59081a418db0f
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Taking a node out of the cluster
### Remove a node with del-node ip:port <node_id>. Only a node with no slots assigned can be removed. The original note says the instance is shut down once it has been removed; in the output below redis-cli sends CLUSTER RESET SOFT to the removed node instead.
### Remove a slave
[root@localhost add_s7026]# redis-cli --cluster del-node 162.29.0.26:7026 e00e5d5403fa099491883059a0e4ebc9cd9a801e -a redis
>>> Removing node e00e5d5403fa099491883059a0e4ebc9cd9a801e from cluster 162.29.0.26:7026
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.
### Run check to confirm the node is gone
[root@localhost add_s7026]# redis-cli --cluster check 162.29.0.26:7026 -a redis
### Add the node back; added this way it joins as a master
redis-cli --cluster add-node 162.29.0.127:7027 162.29.0.26:7026 -a redis
## Use cluster replicate to turn it into a slave of a given node by hand, or add it as a slave in the first place
redis-cli -h 162.29.0.127 -p 7027 -a redis
>cluster replicate ecca804919cd72765814b87c1e91bb5189f9814f
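The cluster commands in detail:
CLUSTER info: prints information about the cluster.
CLUSTER nodes: lists all nodes currently known to the cluster, with their details.
CLUSTER meet <ip> <port>: adds the node at ip and port to the cluster.
CLUSTER addslots <slot> [slot ...]: assigns one or more slots to the current node.
CLUSTER delslots <slot> [slot ...]: removes the assignment of one or more slots from the current node.
CLUSTER slots: lists the slot ranges and the nodes serving them.
CLUSTER slaves <node_id>: lists the slaves of the given node.
CLUSTER replicate <node_id>: makes the current node a slave of the given node.
CLUSTER saveconfig: saves the cluster configuration file on demand; by default the cluster also saves it automatically whenever the configuration changes.
CLUSTER keyslot <key>: returns the slot that the key maps to.
CLUSTER flushslots: removes every slot assigned to the current node, leaving it with no slots.
CLUSTER countkeysinslot <slot>: returns the number of keys currently in the slot.
CLUSTER getkeysinslot <slot> <count>: returns up to count keys from the slot.
CLUSTER setslot <slot> node <node_id>: assigns the slot to the given node; if the slot is already assigned to another node, that node drops it first.
CLUSTER setslot <slot> migrating <node_id>: marks the slot on this node as migrating to the given node.
CLUSTER setslot <slot> importing <node_id>: marks the slot as being imported into this node from the node node_id.
CLUSTER setslot <slot> stable: cancels the import or migration of the slot.
CLUSTER failover: performs a manual failover.
CLUSTER forget <node_id>: removes the given node from the cluster. The node is then barred from handshaking for 60 s; after that the two nodes can complete a handshake again.
CLUSTER reset [HARD|SOFT]: resets the node's cluster information. SOFT clears what the node knows about other nodes but keeps its own ID; HARD also changes its ID. Without an argument, SOFT is used.
CLUSTER count-failure-reports <node_id>: returns the number of failure reports for the node.
CLUSTER SET-CONFIG-EPOCH: sets the node's config epoch; allowed only before the node joins a cluster.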
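CLUSTER keyslot can be reproduced offline: Redis Cluster maps a key to a slot by taking CRC16 (the XMODEM variant) of the key modulo 16384, and when the key contains a non-empty {...} hash tag only the tag is hashed. The Python sketch below is a plain bitwise implementation for illustration; the function names are hypothetical and it is not the code Redis itself uses.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honouring hash tags."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_slot("foo"))
# Keys sharing a hash tag land in the same slot, so multi-key commands work on them.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Comparing the result with the slot ranges printed by redis-cli --cluster check shows which master serves a given key.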