
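Deploying a Redis Cluster High-Availability Test Environment with Docker

Background:

An earlier post deployed a single Redis instance with Docker. This post uses Docker to deploy a six-node (3 masters, 3 slaves) Redis Cluster as a high-availability test environment.

Environment and configuration:

  1. Create the directories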

[root@localhost dir_redis_cluster]# tree 
.
├── m7026
│ ├── data
│ │ ├── appendonly.aof
│ │ ├── dump.rdb
│ │ ├── nodes-7021.conf
│ │ ├── redis_1.log
│ │ ├── redis_1.pid
│ │ └── redis_1.sock
│ └── redis.cnf
├── m7027
│ ├── data
│ │ ├── appendonly.aof
│ │ ├── dump.rdb
│ │ ├── nodes-7021.conf
│ │ ├── redis_1.log
│ │ ├── redis_1.pid
│ │ └── redis_1.sock
│ └── redis.cnf
├── m7028
│ ├── data
│ │ ├── appendonly.aof
│ │ ├── dump.rdb
│ │ ├── nodes-7021.conf
│ │ ├── redis_1.log
│ │ ├── redis_1.pid
│ │ └── redis_1.sock
│ └── redis.cnf
├── s7026
│ ├── data
│ │ ├── appendonly.aof
│ │ ├── dump.rdb
│ │ ├── nodes-7021.conf
│ │ ├── redis_1.log
│ │ ├── redis_1.pid
│ │ └── redis_1.sock
│ └── redis.cnf
├── s7027
│ ├── data
│ │ ├── appendonly.aof
│ │ ├── dump.rdb
│ │ ├── nodes-7021.conf
│ │ ├── redis_1.log
│ │ ├── redis_1.pid
│ │ └── redis_1.sock
│ └── redis.cnf
└── s7028
  ├── data
  │ ├── appendonly.aof
  │ ├── dump.rdb
  │ ├── nodes-7021.conf
  │ ├── redis_1.log
  │ ├── redis_1.pid
  │ └── redis_1.sock
  └── redis.cnf

2. Edit the Redis configuration file redis.cnf: keep the earlier single-instance settings and add the cluster settings below

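################################ REDIS CLUSTER ###############################
# Cluster switch. Cluster mode is disabled by default.
cluster-enabled yes

# Name of the cluster configuration file. Every node has its own cluster configuration file, which persists the cluster state. It does not need to be edited by hand; Redis creates and updates it. Each Redis Cluster node needs a separate file, so make sure its name does not clash with that of any other instance running on the same system.
cluster-config-file nodes-7021.conf

# Node timeout threshold, in milliseconds, for links between cluster nodes.
cluster-node-timeout 30000

# During a failover every slave asks to be promoted to master, but a slave that has been disconnected from its master for a while holds stale data and should not be promoted. This parameter decides whether a slave has been disconnected from its master for too long:
# the slave's disconnection time is compared with (node-timeout * slave-validity-factor) + repl-ping-slave-period.
# For example, with a node timeout of 30 seconds, a slave-validity-factor of 10 and the default repl-ping-slave-period of 10 seconds, a slave that has been disconnected for more than 310 seconds will not attempt a failover.
# If the factor is too small, a master can fail with no slave eligible to take over, so the cluster cannot work normally; it only recovers once the original master rejoins the cluster.
# If set to 0, a slave always tries to become master, no matter how long it has been disconnected from its master.
cluster-slave-validity-factor 10

# A slave can migrate to an orphaned master only if its own master has more slaves than this value. For example, with a value of 2, a slave tries to migrate only if its master is left with 2 working slaves.
# This is the minimum number of slaves a master must retain; a slave migrates to a master that has lost its slaves only if that minimum is still met.
# cluster-migration-barrier 1

# By default the cluster state is ok, and the cluster serves requests, only when every slot is assigned to a node. Setting this to no lets the cluster keep serving while some slots are uncovered. Setting it to no is not recommended: during a partition the masters in the minority partition can keep accepting writes, leaving the data inconsistent for a long time.
# When the nodes holding some keys are unavailable: with "yes" (the default) the whole cluster stops accepting operations; with "no" the cluster keeps serving requests for the keys on reachable nodes.
cluster-require-full-coverage yes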
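
The staleness check described in the cluster-slave-validity-factor comment is plain arithmetic, so it can be checked by hand. The following is a minimal sketch in Python; the function name is hypothetical and is not part of Redis. It evaluates the formula (node-timeout * slave-validity-factor) + repl-ping-slave-period for the worked example in the comment and for the values used in this article's redis.cnf, where repl-ping-slave-period is 5.

```python
def max_replica_disconnect_seconds(node_timeout_ms: int,
                                   validity_factor: int,
                                   repl_ping_period_s: int) -> float:
    """Longest master-link outage after which a slave still tries to fail over.

    Hypothetical helper: it only evaluates the formula quoted in the
    configuration comment. A factor of 0 disables the check entirely.
    """
    if validity_factor == 0:
        return float("inf")
    return (node_timeout_ms / 1000) * validity_factor + repl_ping_period_s


# Worked example from the comment: 30 s timeout, factor 10, 10 s ping period.
print(max_replica_disconnect_seconds(30000, 10, 10))  # 310.0
# Values set in this article's redis.cnf (repl-ping-slave-period 5).
print(max_replica_disconnect_seconds(30000, 10, 5))   # 305.0
```

With the settings in this article, a slave that has been out of contact with its master for more than about 305 seconds would therefore not stand for election.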

The complete configuration file is as follows:

[root@localhost dir_redis_cluster]# cat m7026/redis.cnf 
daemonize no
protected-mode yes
pidfile "/data/data/redis_1.pid"
port 7026
tcp-backlog 511
bind 0.0.0.0
unixsocket "/data/data/redis_1.sock"
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/data/data/redis_1.log"
databases 16
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum no
dbfilename "dump.rdb"
dir "/data/data"
masterauth "redis"
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-ping-slave-period 5
repl-timeout 60
repl-disable-tcp-nodelay no
repl-backlog-size 32mb
repl-backlog-ttl 3600
slave-priority 100
requirepass "redis"
rename-command FLUSHDB REDIS_FLUSHDB
rename-command FLUSHALL REDIS_FLUSHALL
rename-command KEYS REDIS_KEYS
maxmemory 128mb
maxmemory-policy allkeys-lru
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 1000
latency-monitor-threshold 0
notify-keyspace-events "e"
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
# Generated by CONFIG REWRITE

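################################ REDIS CLUSTER ###############################
# Cluster switch. Cluster mode is disabled by default.
cluster-enabled yes

# Name of the cluster configuration file. Every node has its own cluster configuration file, which persists the cluster state. It does not need to be edited by hand; Redis creates and updates it. Each Redis Cluster node needs a separate file, so make sure its name does not clash with that of any other instance running on the same system.
cluster-config-file nodes-7021.conf

# Node timeout threshold, in milliseconds, for links between cluster nodes.
cluster-node-timeout 30000

# During a failover every slave asks to be promoted to master, but a slave that has been disconnected from its master for a while holds stale data and should not be promoted. This parameter decides whether a slave has been disconnected from its master for too long:
# the slave's disconnection time is compared with (node-timeout * slave-validity-factor) + repl-ping-slave-period.
# For example, with a node timeout of 30 seconds, a slave-validity-factor of 10 and the default repl-ping-slave-period of 10 seconds, a slave that has been disconnected for more than 310 seconds will not attempt a failover.
# If the factor is too small, a master can fail with no slave eligible to take over, so the cluster cannot work normally; it only recovers once the original master rejoins the cluster.
# If set to 0, a slave always tries to become master, no matter how long it has been disconnected from its master.
cluster-slave-validity-factor 10

# A slave can migrate to an orphaned master only if its own master has more slaves than this value. For example, with a value of 2, a slave tries to migrate only if its master is left with 2 working slaves.
# This is the minimum number of slaves a master must retain; a slave migrates to a master that has lost its slaves only if that minimum is still met.
# cluster-migration-barrier 1

# By default the cluster state is ok, and the cluster serves requests, only when every slot is assigned to a node. Setting this to no lets the cluster keep serving while some slots are uncovered. Setting it to no is not recommended: during a partition the masters in the minority partition can keep accepting writes, leaving the data inconsistent for a long time.
# When the nodes holding some keys are unavailable: with "yes" (the default) the whole cluster stops accepting operations; with "no" the cluster keeps serving requests for the keys on reachable nodes.
cluster-require-full-coverage yes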

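3. Create the docker-compose configuration file. The redisnet network is declared external, so it must already exist, with a subnet that covers the 162.29.0.x addresses assigned below: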

[root@localhost data]# cat compose_redis_cluster.yaml 
version: '2'
networks:
  redisnet:
    external: true
#    ipam:
#      driver: default
#      config:
#        - subnet: 162.29.0.0/24
services:
  redis_m7026:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_m7026
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 17026:7026
    volumes:
      - /data/data/dir_redis_cluster/m7026:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.26


  redis_m7027:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_m7027
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 17027:7027
    volumes:
      - /data/data/dir_redis_cluster/m7027:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.27



  redis_m7028:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_m7028
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 17028:7028
    volumes:
      - /data/data/dir_redis_cluster/m7028:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.28

  redis_s7026:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_s7026
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 27026:7026
    volumes:
      - /data/data/dir_redis_cluster/s7026:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.126
 
  redis_s7027:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_s7027
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 27027:7027
    volumes:
      - /data/data/dir_redis_cluster/s7027:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.127
        
  redis_s7028:
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    image:  redis:4.0.8
    container_name: redis_s7028
    command: redis-server /data/redis.cnf 
    #restart: always
    #environment:
    #  MYSQL_ROOT_PASSWORD: 12345678

    ports:
      - 27028:7028
    volumes:
      - /data/data/dir_redis_cluster/s7028:/data
    networks: 
      redisnet:
        ipv4_address: 162.29.0.128
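
The six service entries above differ only in the node name, the published host port, the Redis port, and the static IP. As an illustration of that pattern, the following Python sketch regenerates the service section from those four values. The helper names are hypothetical, the commented-out keys are omitted, and the output is meant as a starting point to paste under the existing version and networks headers, not as a replacement for the hand-written file.

```python
def service_block(name: str, host_port: int, redis_port: int, ip: str) -> str:
    """Render one docker-compose service entry in the layout used above."""
    return "\n".join([
        f"  redis_{name}:",
        "    image: redis:4.0.8",
        f"    container_name: redis_{name}",
        "    command: redis-server /data/redis.cnf",
        "    ports:",
        f"      - {host_port}:{redis_port}",
        "    volumes:",
        f"      - /data/data/dir_redis_cluster/{name}:/data",
        "    networks:",
        "      redisnet:",
        f"        ipv4_address: {ip}",
    ])


def render_services() -> str:
    """Build the services section for 3 masters (m70xx) and 3 slaves (s70xx)."""
    blocks = []
    for port in (7026, 7027, 7028):
        # Masters: host port 17026..17028, IP 162.29.0.26..28
        blocks.append(service_block(f"m{port}", 10000 + port, port,
                                    f"162.29.0.{port - 7000}"))
    for port in (7026, 7027, 7028):
        # Slaves: host port 27026..27028, IP 162.29.0.126..128
        blocks.append(service_block(f"s{port}", 20000 + port, port,
                                    f"162.29.0.{port - 6900}"))
    return "services:\n" + "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_services())
```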

Installation

1. Start the containers

docker-compose -f compose_redis_cluster.yaml up -d

2. Check that startup succeeded

# To remove the containers: docker-compose -f compose_redis_cluster.yaml rm

[root@localhost data]# docker-compose -f compose_redis_cluster.yaml ps 

   Name                  Command               State                 Ports              
----------------------------------------------------------------------------------------
redis_m7026   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17026->7026/tcp
redis_m7027   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17027->7027/tcp
redis_m7028   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:17028->7028/tcp
redis_s7026   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27026->7026/tcp
redis_s7027   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27027->7027/tcp
redis_s7028   docker-entrypoint.sh redis ...   Up      6379/tcp, 0.0.0.0:27028->7028/tcp

3. Check the cluster state before the cluster is created

Check the cluster state:
[root@localhost data]# docker exec -it 7c0f844b2998 /bin/bash
root@7c0f844b2998:/data# redis-cli -h 127.0.0.1 -p 7028

127.0.0.1:7028> cluster info 
cluster_state:fail                      ### the state is fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
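
The `cluster info` reply is a list of `key:value` lines, which makes it easy to check from a script. The following Python sketch parses such a reply into a dictionary and tests whether the cluster is ready. The function names are hypothetical, and the sample text is copied from the output above without the inline annotation.

```python
def parse_cluster_info(text: str) -> dict:
    """Turn the key:value lines of a CLUSTER INFO reply into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Counters become ints; cluster_state stays a string.
        info[key] = int(value) if value.isdigit() else value
    return info


def cluster_ready(info: dict, total_slots: int = 16384) -> bool:
    """True when the state is ok and every hash slot is assigned."""
    return (info.get("cluster_state") == "ok"
            and info.get("cluster_slots_assigned") == total_slots)


SAMPLE = """\
cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_known_nodes:1
cluster_size:0
"""

info = parse_cluster_info(SAMPLE)
print(info["cluster_state"], info["cluster_known_nodes"])  # fail 1
print(cluster_ready(info))                                 # False
```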

5. Run create to build the cluster

# The old way of creating a cluster; running it now fails with the message below
./redis-trib.rb create --replicas 1 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026   162.29.0.127:7027  162.29.0.128:7028 

WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.

All commands and features belonging to redis-trib.rb have been moved
to redis-cli.
In order to use them you should call redis-cli with the --cluster
option followed by the subcommand name, arguments and options.

Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]

Example:
redis-cli --cluster create 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026 162.29.0.26:7026 162.29.0.127:7027 162.29.0.128:7028 --cluster-replicas 1

To get help about all subcommands, type:
redis-cli --cluster help

[root@localhost data]# redis-cli --cluster help

# In this example the cluster is created the new way
[root@localhost data]# redis-cli --cluster create 162.29.0.26:7026 162.29.0.27:7027 162.29.0.28:7028 162.29.0.126:7026  162.29.0.127:7027 162.29.0.128:7028 --cluster-replicas 1 -a redis 
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 162.29.0.127:7027 to 162.29.0.26:7026
Adding replica 162.29.0.128:7028 to 162.29.0.27:7027
Adding replica 162.29.0.126:7026 to 162.29.0.28:7028
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
......
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
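
The three ranges in the output above (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 hash slots as evenly as possible among the masters. The Python sketch below reproduces those ranges. It is an approximation of the allocation step written for illustration, with a hypothetical function name, and is not the tool's actual code.

```python
import math
from typing import List, Tuple

TOTAL_SLOTS = 16384


def split_slots(masters: int) -> List[Tuple[int, int]]:
    """Split the hash slot space into contiguous, near-equal ranges."""
    slots_per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round to the nearest slot; the last master takes whatever is left.
        last = math.floor(cursor + slots_per_node - 1 + 0.5)
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        last = max(last, first)
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


for index, (lo, hi) in enumerate(split_slots(3)):
    print(f"Master[{index}] -> Slots {lo} - {hi}")
# Master[0] -> Slots 0 - 5460
# Master[1] -> Slots 5461 - 10922
# Master[2] -> Slots 10923 - 16383
```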

6. Check the cluster state after creation

[root@localhost data]# redis-cli --cluster check 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.


[root@localhost data]# redis-cli --cluster info  162.29.0.26:7026 -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
        

7. Log in to one of the Redis nodes and view the cluster information

[root@localhost data]# redis-cli -h 162.29.0.26 -p 7026 -a redis
162.29.0.26:7026> info cluster
# Cluster
cluster_enabled:1
162.29.0.26:7026> 
162.29.0.26:7026> cluster info 
cluster_state:ok                  ###### the cluster state is healthy
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6            ####### 6 nodes in total
cluster_size:3                   ####### 3 master instances
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:454
cluster_stats_messages_pong_sent:471
cluster_stats_messages_sent:925
cluster_stats_messages_ping_received:466
cluster_stats_messages_pong_received:454
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:925
        
162.29.0.26:7026> cluster nodes
ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026@17026 myself,master - 0 1595960257000 1 connected 0-5460
c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027@17027 master - 0 1595960259352 2 connected 5461-10922
868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026@17026 slave b4776a521a9c88c5eef4c76822a33fb0e23a4fac 0 1595960258000 4 connected
e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027@17027 slave ecca804919cd72765814b87c1e91bb5189f9814f 0 1595960259000 5 connected
b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028@17028 master - 0 1595960257334 3 connected 10923-16383
2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028@17028 slave c96261692b546c5d9b5a45b8596c65461b6acd1a 0 1595960257000 6 connected
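
Each line of the `cluster nodes` reply is space-separated: node ID, address, flags, master ID (or `-`), ping and pong timestamps, config epoch, link state, and then any slot ranges. The Python sketch below turns that text into a list of dictionaries so that masters can be matched with their slaves. The function name and dictionary keys are hypothetical; the sample lines are taken from the output above.

```python
def parse_cluster_nodes(text: str) -> list:
    """Parse CLUSTER NODES output into one dict per node."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip blank or truncated lines
        flags = parts[2].split(",")
        nodes.append({
            "id": parts[0],
            "addr": parts[1].split("@")[0],   # drop the cluster bus port
            "role": "master" if "master" in flags else "slave",
            "master_id": None if parts[3] == "-" else parts[3],
            "link_state": parts[7],
            "slots": parts[8:],               # empty for slaves
        })
    return nodes


SAMPLE = """\
ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026@17026 myself,master - 0 1595960257000 1 connected 0-5460
868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026@17026 slave b4776a521a9c88c5eef4c76822a33fb0e23a4fac 0 1595960258000 4 connected
"""

for node in parse_cluster_nodes(SAMPLE):
    print(node["addr"], node["role"], node["slots"])
# 162.29.0.26:7026 master ['0-5460']
# 162.29.0.126:7026 slave []
```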

Testing node addition

To add nodes, copy the configuration file and start two new nodes:

docker run -itd  --name addm7026 -v /data/data/dir_redis_cluster/add_m7026:/data  --net redisnet -p 17029:7026 --ip 162.29.0.29 redis:4.0.8 redis-server /data/redis.cnf

docker run -itd  --name adds7026 -v /data/data/dir_redis_cluster/add_s7026:/data  --net redisnet -p 27029:7026 --ip 162.29.0.129 redis:4.0.8 redis-server /data/redis.cnf

Add the master node 162.29.0.29:7026

####### The old way of adding a node
#./redis-trib.rb add-node 192.168.100.134:17022 192.168.100.134:17021

[root@localhost add_s7026]# redis-cli --cluster add-node 162.29.0.29:7026 162.29.0.26:7026 -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 162.29.0.29:7026 to cluster 162.29.0.26:7026
>>> Send CLUSTER MEET to node 162.29.0.29:7026 to make it join the cluster.
[OK] New node added correctly.

# View the newly added node
[root@localhost add_s7026]# redis-cli --cluster info 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 5461 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 5462 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 0 slots | 0 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.

Add the slave node 162.29.0.129:7026 as a replica of the master just added. The value for --cluster-master-id can be looked up with redis-cli --cluster check

####### The old way of adding a slave
#./redis-trib.rb add-node --slave --master-id 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.135:17022 192.168.100.134:17021

[root@localhost add_s7026]# redis-cli --cluster add-node 162.29.0.129:7026 162.29.0.26:7026 -a redis --cluster-slave --cluster-master-id  addfe99eb1187be11d474b0034e59081a418db0f
>>> Adding node 162.29.0.129:7026 to cluster 162.29.0.26:7026

>>> Send CLUSTER MEET to node 162.29.0.129:7026 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 162.29.0.29:7026.

Migrate some slots to the new node

[root@localhost add_s7026]# redis-cli --cluster reshard  162.29.0.26:7026 -a redis

How many slots do you want to move (from 1 to 16384)? 10                    ##### number of slots to migrate
What is the receiving node ID? addfe99eb1187be11d474b0034e59081a418db0f     ###### ID of the new node
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: all                                                         ###### "all" takes the 10 slots from all the other masters, a few from each

Ready to move 10 slots.
  Source nodes:
    M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
       slots:[0-5460] (5461 slots) master
       1 additional replica(s)
    M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
       slots:[5461-10922] (5462 slots) master
       1 additional replica(s)
    M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
       slots:[10923-16383] (5461 slots) master
       1 additional replica(s)
  Destination node:
    M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
       slots: (0 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 5461 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5462 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5463 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 5464 from c96261692b546c5d9b5a45b8596c65461b6acd1a
    Moving slot 0 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 1 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 2 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 10923 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
    Moving slot 10924 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
    Moving slot 10925 from b4776a521a9c88c5eef4c76822a33fb0e23a4fac
Do you want to proceed with the proposed reshard plan (yes/no)? yes
  Moving slot 5461 from 162.29.0.27:7027 to 162.29.0.29:7026: 
  Moving slot 5462 from 162.29.0.27:7027 to 162.29.0.29:7026: 
  Moving slot 5463 from 162.29.0.27:7027 to 162.29.0.29:7026: 
  Moving slot 5464 from 162.29.0.27:7027 to 162.29.0.29:7026: 
  Moving slot 0 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 1 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 2 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 10923 from 162.29.0.28:7028 to 162.29.0.29:7026: 
  Moving slot 10924 from 162.29.0.28:7028 to 162.29.0.29:7026: 
  Moving slot 10925 from 162.29.0.28:7028 to 162.29.0.29:7026: 
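
With `all` as the source, the plan above takes 4 slots from 162.29.0.27 and 3 each from the other two masters. That split is consistent with a proportional rule: each source contributes in proportion to the slots it owns, the largest owner is rounded up, and the rest are rounded down. The Python sketch below works through that arithmetic. The rule as coded here is an assumption inferred from this transcript, the function name is hypothetical, and edge cases (for example, a total that falls short of the request) are not handled.

```python
import math


def reshard_plan(sources: dict, numslots: int) -> dict:
    """Share `numslots` among source masters in proportion to slots owned."""
    total = sum(sources.values())
    # Largest owner first; Python's sort is stable, so ties keep input order.
    ordered = sorted(sources.items(), key=lambda kv: kv[1], reverse=True)
    plan = {}
    for i, (node, owned) in enumerate(ordered):
        share = numslots / total * owned
        # Round the first (largest) source up and all others down.
        plan[node] = math.ceil(share) if i == 0 else math.floor(share)
    return plan


owners = {
    "162.29.0.26:7026": 5461,
    "162.29.0.27:7027": 5462,
    "162.29.0.28:7028": 5461,
}
print(reshard_plan(owners, 10))
# {'162.29.0.27:7027': 4, '162.29.0.26:7026': 3, '162.29.0.28:7028': 3}
```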

[root@localhost add_s7026]# redis-cli --cluster reshard  162.29.0.26:7026 -a redis

How many slots do you want to move (from 1 to 16384)? 10
What is the receiving node ID? addfe99eb1187be11d474b0034e59081a418db0f
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: ecca804919cd72765814b87c1e91bb5189f9814f                  ###### enter a source node ID, then done: all 10 slots are taken from that node
Source node #2: done

Ready to move 10 slots.
  Source nodes:
    M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
       slots:[3-5460] (5458 slots) master
       1 additional replica(s)
  Destination node:
    M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
       slots:[0-2],[5461-5464],[10923-10925] (10 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 3 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 4 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 5 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 6 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 7 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 8 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 9 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 10 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 11 from ecca804919cd72765814b87c1e91bb5189f9814f
    Moving slot 12 from ecca804919cd72765814b87c1e91bb5189f9814f
Do you want to proceed with the proposed reshard plan (yes/no)? yes
  Moving slot 3 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 4 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 5 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 6 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 7 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 8 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 9 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 10 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 11 from 162.29.0.26:7026 to 162.29.0.29:7026: 
  Moving slot 12 from 162.29.0.26:7026 to 162.29.0.29:7026: 

The slots are now unevenly distributed, so run a rebalance

[root@localhost add_s7026]# redis-cli --cluster rebalance  162.29.0.26:7026 -a redis

>>> Rebalancing across 4 nodes. Total weight = 4.00
Moving 1362 slots from 162.29.0.28:7028 to 162.29.0.29:7026
Moving 1362 slots from 162.29.0.27:7027 to 162.29.0.29:7026
Moving 1352 slots from 162.29.0.26:7026 to 162.29.0.29:7026
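
The numbers in the rebalance output can be checked by hand. After the two reshard runs the new node holds 20 slots, and with four equally weighted masters each should end up with 16384 / 4 = 4096. The Python sketch below computes each node's surplus or deficit. The function name is hypothetical, and it assumes equal weights and ignores any imbalance threshold the tool may apply.

```python
def rebalance_offsets(slot_counts: dict, total_slots: int = 16384) -> dict:
    """Slots each node must give away (positive) or receive (negative)."""
    target = total_slots // len(slot_counts)  # equal weights assumed
    return {node: count - target for node, count in slot_counts.items()}


# Slot counts after the two reshard runs shown earlier.
before = {
    "162.29.0.26:7026": 5461 - 3 - 10,  # gave 3, then 10 more
    "162.29.0.27:7027": 5462 - 4,
    "162.29.0.28:7028": 5461 - 3,
    "162.29.0.29:7026": 20,
}
print(rebalance_offsets(before))
# {'162.29.0.26:7026': 1352, '162.29.0.27:7027': 1362,
#  '162.29.0.28:7028': 1362, '162.29.0.29:7026': -4076}
```

The three positive values match the `Moving ... slots` lines, and they add up to the 4076 slots the new node was missing.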

Cluster failover

1. Automatic failover after a simulated node crash

Kill one of the instances:

### Simulate a failure of master3
[root@localhost add_s7026]# docker kill 92595c9ca5e0

### Check the logs

## master1 log
[root@localhost dir_redis_cluster]# tail -f m7026/data/redis_1.log 

1:M 29 Jul 10:38:30.696 * Marking node b4776a521a9c88c5eef4c76822a33fb0e23a4fac as failing (quorum reached).
1:M 29 Jul 10:38:30.696 # Cluster state changed: fail
1:M 29 Jul 10:38:31.510 # Failover auth granted to 868cfd0950e603077616c11fa54b457bffe170ba for epoch 8
1:M 29 Jul 10:38:31.514 # Cluster state changed: ok

## master2 log
[root@localhost dir_redis_cluster]# tail -f m7027/data/redis_1.log 

1:M 29 Jul 10:38:30.698 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:M 29 Jul 10:38:30.698 # Cluster state changed: fail
1:M 29 Jul 10:38:31.510 # Failover auth granted to 868cfd0950e603077616c11fa54b457bffe170ba for epoch 8
1:M 29 Jul 10:38:31.515 # Cluster state changed: ok

## slave1 log
[root@localhost dir_redis_cluster]# tail -f s7026/data/redis_1.log 

1:S 29 Jul 10:38:30.696 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.696 # Cluster state changed: fail
1:S 29 Jul 10:38:30.794 # Start of election delayed for 646 milliseconds (rank #0, offset 13832).
1:S 29 Jul 10:38:31.508 # Starting a failover election for epoch 8.
1:S 29 Jul 10:38:31.511 # Failover election won: I'm the new master.
1:S 29 Jul 10:38:31.511 # configEpoch set to 8 after successful failover
1:M 29 Jul 10:38:31.511 # Setting secondary replication ID to adc9070f7647046e340d29e4808dec78a3fa68ba, valid up to offset: 13833. New replication ID is fc257229c6d869841043ee7aefc9af00e656eb57
1:M 29 Jul 10:38:31.511 * Discarding previously cached master state.
1:M 29 Jul 10:38:31.511 # Cluster state changed: ok

## slave2 log
[root@localhost dir_redis_cluster]# tail -f s7027/data/redis_1.log 

1:S 29 Jul 10:38:30.698 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.698 # Cluster state changed: fail
1:S 29 Jul 10:38:31.513 # Cluster state changed: ok

## slave3 log
[root@localhost dir_redis_cluster]# tail -f s7028/data/redis_1.log 

1:S 29 Jul 10:38:30.697 * FAIL message received from ecca804919cd72765814b87c1e91bb5189f9814f about b4776a521a9c88c5eef4c76822a33fb0e23a4fac
1:S 29 Jul 10:38:30.697 # Cluster state changed: fail
1:S 29 Jul 10:38:31.513 # Cluster state changed: ok
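
The slave1 log above reports "Start of election delayed for 646 milliseconds (rank #0, ...)". The Redis Cluster specification describes this delay as a fixed 500 ms, plus a random 0 to 500 ms, plus 1000 ms for each rank position, where rank 0 is the slave with the most up-to-date replication offset. The Python sketch below computes the resulting window. The function name is hypothetical and is not a Redis API.

```python
from typing import Tuple


def election_delay_bounds_ms(rank: int) -> Tuple[int, int]:
    """Earliest and latest election delay for a slave of the given rank."""
    fixed_ms = 500        # lets the FAIL state propagate across the cluster
    max_jitter_ms = 500   # random part, so slaves do not start together
    per_rank_ms = 1000    # slaves with older data wait longer
    low = fixed_ms + rank * per_rank_ms
    return low, low + max_jitter_ms


low, high = election_delay_bounds_ms(0)
print(low, high)            # 500 1000
print(low <= 646 <= high)   # True: the logged delay falls in the window
```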

Check the cluster state:

[root@localhost add_s7026]# redis-cli --cluster check   162.29.0.26:7026 -a redis
Could not connect to Redis at 162.29.0.28:7028: No route to host
162.29.0.26:7026 (ecca8049...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.126:7026 (868cfd09...) -> 0 keys | 4096 slots | 0 slaves.   ### 0.126 has been promoted to master
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: c7d5176346824f59cd01e481e4fd69fb5e015eed 162.29.0.129:7026
   slots: (0 slots) slave
   replicates addfe99eb1187be11d474b0034e59081a418db0f
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
M: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots:[12288-16383] (4096 slots) master
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

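Restart the original master instance. It rejoins the cluster automatically and becomes a slave node.

2. Manual failover

A manual failover is started from a chosen slave node. The master and the slave swap roles: the slave becomes the new master and serves clients, and the old master becomes its slave.

cluster failover: manually promotes the slave to master.

cluster failover force: for cases where the master is down and automatic failover cannot complete. On receiving cluster failover force, the slave starts an election directly without first confirming the replication offset with the master, so any data the slave had not yet replicated is lost. Once the slave wins the election, it replaces the master and broadcasts the new cluster configuration.

cluster failover takeover: for cases where more than half of the masters in the cluster have failed. The slave cannot collect votes from a majority of masters, so the election cannot complete. Running cluster failover takeover forces the switch; because this failover is not started through a leader election, configuration epochs may conflict.

For a manual failover, prefer the least forceful option that meets the need: cluster failover > cluster failover force > cluster failover takeover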
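
The priority order above can be written as a small decision rule. The Python sketch below is only an illustration of that rule. The function and its two boolean inputs are hypothetical, and deciding whether the master or a majority of masters is reachable is left to the operator.

```python
def pick_failover_command(master_reachable: bool, master_majority_up: bool) -> str:
    """Choose the least forceful CLUSTER FAILOVER variant that can succeed."""
    if master_reachable and master_majority_up:
        # Normal case: the master cooperates, so no replicated data is lost.
        return "CLUSTER FAILOVER"
    if master_majority_up:
        # Master is down, but a majority of masters can still vote.
        return "CLUSTER FAILOVER FORCE"
    # No majority to vote: skip the election, at the risk of epoch conflicts.
    return "CLUSTER FAILOVER TAKEOVER"


print(pick_failover_command(True, True))    # CLUSTER FAILOVER
print(pick_failover_command(False, True))   # CLUSTER FAILOVER FORCE
print(pick_failover_command(False, False))  # CLUSTER FAILOVER TAKEOVER
```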

## Log in to the slave node 0.28:7028 and run a manual failover
162.29.0.28:7028> cluster failover

## Check the cluster state

[root@localhost add_s7026]# redis-cli --cluster check 162.29.0.26:7026 -a redis
162.29.0.26:7026 (ecca8049...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.27:7027 (c9626169...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.29:7026 (addfe99e...) -> 0 keys | 4096 slots | 1 slaves.
162.29.0.28:7028 (b4776a52...) -> 0 keys | 4096 slots | 1 slaves.     ### 0.28:7028 is now a master
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 162.29.0.26:7026)
M: ecca804919cd72765814b87c1e91bb5189f9814f 162.29.0.26:7026
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: c7d5176346824f59cd01e481e4fd69fb5e015eed 162.29.0.129:7026
   slots: (0 slots) slave
   replicates addfe99eb1187be11d474b0034e59081a418db0f
S: e00e5d5403fa099491883059a0e4ebc9cd9a801e 162.29.0.127:7027
   slots: (0 slots) slave
   replicates ecca804919cd72765814b87c1e91bb5189f9814f
M: c96261692b546c5d9b5a45b8596c65461b6acd1a 162.29.0.27:7027
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
M: addfe99eb1187be11d474b0034e59081a418db0f 162.29.0.29:7026
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
S: 868cfd0950e603077616c11fa54b457bffe170ba 162.29.0.126:7026
   slots: (0 slots) slave
   replicates b4776a521a9c88c5eef4c76822a33fb0e23a4fac
S: 2632e6bee15a0aac561c797d882e3f1215f1ff33 162.29.0.128:7028
   slots: (0 slots) slave
   replicates c96261692b546c5d9b5a45b8596c65461b6acd1a
M: b4776a521a9c88c5eef4c76822a33fb0e23a4fac 162.29.0.28:7028
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Removing nodes from the cluster

### Remove a cluster node with del-node ip:port <node_id>. Only a node with no slots assigned can be removed; once it has been removed from the cluster, simply shut the instance down

### Remove a slave node
[root@localhost add_s7026]# redis-cli --cluster del-node   162.29.0.26:7026 e00e5d5403fa099491883059a0e4ebc9cd9a801e  -a redis

>>> Removing node e00e5d5403fa099491883059a0e4ebc9cd9a801e from cluster 162.29.0.26:7026
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.

### Run check to verify that the node is gone
[root@localhost add_s7026]# redis-cli --cluster check 162.29.0.26:7026 -a redis

### Add the node back; at this point it joins as a master
redis-cli --cluster add-node 162.29.0.127:7027 162.29.0.26:7026 -a redis

## Use cluster replicate to turn it into a slave of a given node by hand, or add it as a slave in the first place

redis-cli -h 162.29.0.127 -p 7027 -a redis
>cluster replicate ecca804919cd72765814b87c1e91bb5189f9814f

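Details of the CLUSTER commands:

CLUSTER info: prints information about the cluster.
CLUSTER nodes: lists information about all nodes currently known to the cluster.
CLUSTER meet <ip> <port>: adds the node at the given ip and port to the cluster.
CLUSTER addslots <slot> [slot ...]: assigns one or more slots to the current node.
CLUSTER delslots <slot> [slot ...]: removes the assignment of one or more slots from the current node.
CLUSTER slots: lists the slot ranges and the nodes that serve them.
CLUSTER slaves <node_id>: lists the slaves of the given node.
CLUSTER replicate <node_id>: makes the current node a slave of the given node.
CLUSTER saveconfig: saves the cluster configuration file by hand. By default the cluster saves it automatically whenever the configuration changes.
CLUSTER keyslot <key>: shows which slot a key maps to.
CLUSTER flushslots: removes every slot assigned to the current node, leaving it with no slots.
CLUSTER countkeysinslot <slot>: returns the number of keys currently in the slot.
CLUSTER getkeysinslot <slot> <count>: returns up to count keys from the slot.

CLUSTER setslot <slot> node <node_id>: assigns the slot to the given node. If the slot is already assigned to another node, that node drops the slot first and the slot is then assigned.
CLUSTER setslot <slot> migrating <node_id>: migrates the slot from this node to the given node.
CLUSTER setslot <slot> importing <node_id>: imports the slot into this node from the node given by node_id.
CLUSTER setslot <slot> stable: cancels the import or migration of the slot.

CLUSTER failover: performs a manual failover.
CLUSTER forget <node_id>: removes the given node from the cluster so that handshakes with it cannot complete. The ban expires after 60 s, after which the two nodes can handshake again.
CLUSTER reset [HARD|SOFT]: resets the node's cluster information. SOFT clears what the node knows about other nodes but keeps its own ID; HARD also changes its ID. SOFT is used if no argument is given.

CLUSTER count-failure-reports <node_id>: returns the number of failure reports for the given node.
CLUSTER SET-CONFIG-EPOCH: sets the node's epoch. This is only allowed before the node joins the cluster.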
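
As a companion to CLUSTER keyslot in the list above: Redis Cluster maps a key to a slot by taking the CRC16 (XMODEM variant) of the key modulo 16384. If the key contains a non-empty `{...}` hash tag, only the text inside the first such tag is hashed. The following Python sketch implements that rule so slots can be computed without a server. The function names are my own and the keys are example values.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot for a key, honoring a non-empty {hash tag} if present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]  # hash only the tag contents
    return crc16_xmodem(raw) % 16384


print(key_slot("foo"))  # 12182
# Keys that share a hash tag land in the same slot.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
```

Hash tags matter for multi-key commands, which in cluster mode only work when all of the keys involved map to the same slot.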