redis-3.0.1 sentinel master-slave high availability: detailed configuration

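Our project recently went into production and required redis to be highly available. Because redis cluster was not yet very mature, we chose redis sentinel for high availability. Redis itself provides replication for master-slave backup; combined with sentinel, it can switch master and slave automatically.
In production, three redis nodes are generally required. To keep the experiment simple, this article uses only two nodes: one master and one slave.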

Deployment plan

172.16.203.10 master node
172.16.203.4 slave node
redis version: 3.0.1

Master node

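Redis is installed by compiling from source, which is very simple: unpack the archive, enter the extracted directory, and run make. The details are not covered here.
The following settings in redis.conf need to be changed: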

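# Run redis in the background
daemonize yes
# Location of the redis pid file
pidfile /apps/run/redis/redis.pid
# Port redis listens on
port 6379
# Location of the log file. If empty, redis logs to standard output, which goes to /dev/null when daemonized
logfile "/apps/logs/redis/redis.log"
# Password for redis. Leave unchanged if password authentication is not needed
requirepass 123456
# If a password is set above, this must be set too and must match it. When this node acts as a slave, it uses this password to authenticate against the master.
masterauth 123456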

Start redis:
src/redis-server redis.conf
Check the current replication status:
src/redis-cli -h 172.16.203.10 -a 123456 info Replication

# Replication
role:master
connected_slaves:0
master_repl_offset:544693
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:544692

You can see that 172.16.203.10 is the master and that it currently has no slaves.
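When this check is scripted rather than read by eye, the INFO output can be turned into a dictionary. The following is only a minimal sketch: the function name parse_info is made up for illustration, and the sample text is the output shown above.

```python
def parse_info(text):
    """Parse INFO-style output ('key:value' lines, '#' section headers) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication"
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


sample = """# Replication
role:master
connected_slaves:0
master_repl_offset:544693
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:544692
"""

info = parse_info(sample)
print(info["role"], info["connected_slaves"])
```

All values are kept as strings; convert them with int() where a number is needed.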
Next, configure sentinel.conf:

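# Port sentinel listens on
port 26379
# Run sentinel in the background. This line is newly added
daemonize yes
# Location of the log file. This line is newly added
logfile "/apps/logs/redis/sentinel.log"
# Master to monitor. The trailing number is how many sentinels must consider the master down before it enters the ODOWN state, that is, before it is treated as really down
sentinel monitor mymaster 172.16.203.10 6379 1
# How long (in milliseconds) a node must be unreachable before this sentinel marks it SDOWN (subjectively down)
sentinel down-after-milliseconds mymaster 5000
# Used in several ways: 1. a failover is retried only after twice this value; 2. a failover that has produced no configuration change is cancelled after it; 3. it is the maximum time a failover waits for all slaves to switch to the new configuration
sentinel failover-timeout mymaster 15000
# Password used for authentication. This must be set if redis has a password
sentinel auth-pass mymaster 123456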

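Those are the only items to change. Pay particular attention to sentinel auth-pass and do not forget to set it. Once the file is edited, make a backup copy first, because sentinel rewrites this configuration file while it runs. If something goes wrong later, the backup can restore the original, correct state.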
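Since a mismatched password is easy to miss, a small check before starting the services can help. The sketch below is an illustration only: the helper names read_directives and check_passwords are hypothetical, and it assumes each directive sits on its own line with no quoting or spaces inside the password.

```python
def read_directives(text):
    """Return one list of whitespace-separated tokens per non-comment line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rows.append(line.split())
    return rows


def check_passwords(redis_conf, sentinel_conf, master_name):
    """Return a list of problems; an empty list means the passwords agree."""
    requirepass = masterauth = auth_pass = None
    for row in read_directives(redis_conf):
        if len(row) == 2 and row[0] == "requirepass":
            requirepass = row[1]
        elif len(row) == 2 and row[0] == "masterauth":
            masterauth = row[1]
    for row in read_directives(sentinel_conf):
        if len(row) == 4 and row[:3] == ["sentinel", "auth-pass", master_name]:
            auth_pass = row[3]

    problems = []
    if requirepass is not None:
        if masterauth != requirepass:
            problems.append("masterauth does not match requirepass")
        if auth_pass != requirepass:
            problems.append("sentinel auth-pass does not match requirepass")
    return problems


redis_conf = """
requirepass 123456
masterauth 123456
"""
sentinel_conf = """
sentinel monitor mymaster 172.16.203.10 6379 1
sentinel auth-pass mymaster 123456
"""
print(check_passwords(redis_conf, sentinel_conf, "mymaster"))
```

With the values used in this article the list is empty. Removing the auth-pass line from the sample makes the second problem appear.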
Start sentinel:
src/redis-sentinel sentinel.conf
Check the sentinel log:

                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.0.1 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 19957
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

19957:X 12 Dec 13:13:36.746 # Sentinel runid is 6ab6f8abdc3dba4097da202954ecece7bc6d3215
19957:X 12 Dec 13:13:36.746 # +monitor master mymaster 172.16.203.10 6379 quorum 1

The first line shows the run ID of the current sentinel; the second shows that the monitored master is 172.16.203.10 6379.
Check the sentinel status:
src/redis-cli -h 172.16.203.10 -a 123456 -p 26379 info Sentinel

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=0,sentinels=1

Slave node

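The redis.conf here differs from the master node's in only one respect. Add the following line:
slaveof 172.16.203.10 6379
Start redis:
src/redis-server redis.conf
Check the replication status from the slave node:
src/redis-cli -h 172.16.203.4 -a 123456 info Replication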

# Replication
role:slave
master_host:172.16.203.10
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:617956
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

You can see that the current node is a slave.
The sentinel configuration is the same as on the master node. Start sentinel:
src/redis-sentinel sentinel.conf
Check the sentinel log:

12190:X 12 Dec 13:21:38.658 # Sentinel runid is 270f322d0f3f8605b92902417e499cedc8866163
12190:X 12 Dec 13:21:38.658 # +monitor master mymaster 172.16.203.10 6379 quorum 1
12190:X 12 Dec 13:21:38.659 * +slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
12190:X 12 Dec 13:21:39.609 * +sentinel sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379

Check the sentinel status:
src/redis-cli -h 172.16.203.4 -a 123456 -p 26379 info Sentinel

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2

You can see that there are now two sentinels and one slave.
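The master0 line packs several fields into one value. As a rough sketch, it can be split as follows; parse_master_line is a hypothetical helper name, and the input is the line from the output above.

```python
def parse_master_line(line):
    """Split 'master0:name=...,status=...' into (label, dict of fields)."""
    # Only the first ':' separates the label; the address field contains another one
    label, _, rest = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in rest.split(","))
    return label, fields


line = "master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2"
label, master = parse_master_line(line)
print(master["address"], master["slaves"], master["sentinels"])
```

A monitoring script could compare slaves and sentinels against the expected counts for this deployment (1 and 2).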
That completes the redis master-slave high-availability configuration. Verification follows.

Verification

1. The slave node goes down, taking both redis and sentinel with it. Watch the sentinel log on the master node:
+sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
+sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
2. Restart redis and sentinel on the slave node:
-sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
-sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
-dup-sentinel master mymaster 172.16.203.10 6379 #duplicate of 172.16.203.4:26379 or 0b0bf0cddcf7aa5b518a8a62c65188f9c4a1ecaf
+sentinel sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
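You can see that the sentinel's run ID has changed, and the corresponding entry in the sentinel configuration file was updated automatically.
Check the replication status: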
src/redis-cli -h 172.16.203.10 -a 123456 -p 6379 info Replication

# Replication
role:master
connected_slaves:1
slave0:ip=172.16.203.4,port=6379,state=online,offset=862487,lag=1
master_repl_offset:862642
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:862641

3. The master node goes down
First stop redis, then look at the sentinel log on the master node:

19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.207 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
19957:X 12 Dec 15:10:19.207 # +new-epoch 1
19957:X 12 Dec 15:10:19.207 # +try-failover master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.208 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.275 # +failover-state-select-slave master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.375 # +selected-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.375 * +failover-state-send-slaveof-noone slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.447 * +failover-state-wait-promotion slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.216 # +failover-state-reconf-slaves master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
19957:X 12 Dec 15:10:20.298 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
19957:X 12 Dec 15:10:25.350 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379

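After the redis master goes down, the sentinels first elect a leader (note the difference: leader refers to a sentinel, master refers to redis). The elected leader is the sentinel on 172.16.203.10, which then starts selecting a slave to promote to master:
failover-state-select-slave
A suitable slave is found, 172.16.203.4 6379:
selected-slave 172.16.203.4 6379
The leader then sends SLAVEOF NO ONE to the selected node to turn it into a master:
failover-state-send-slaveof-noone
It waits for the node to report that it has been promoted:
failover-state-wait-promotion
Promotion confirmed:
promoted-slave
The slaves are reconfigured to follow the new master:
failover-state-reconf-slaves
The failover ends:
failover-end
Sentinel now monitors the new master:
switch-master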
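
To follow this sequence in a script, the event name and its arguments can be pulled out of each log line. This is a sketch only: it assumes the line layout seen in the logs above (pid:role, day, month, time, a # or * mark, then the event), and parse_event and new_master are hypothetical names.

```python
def parse_event(line):
    """Return {'time', 'event', 'args'} for a sentinel event line, else None."""
    parts = line.split()
    # Expected: ['19957:X', '12', 'Dec', '15:10:20.297', '#', '+switch-master', ...]
    if len(parts) < 6 or parts[5][0] not in "+-":
        return None
    return {"time": parts[3], "event": parts[5], "args": parts[6:]}


def new_master(lines):
    """Return (ip, port) from the last +switch-master event, or None."""
    result = None
    for line in lines:
        ev = parse_event(line)
        if ev and ev["event"] == "+switch-master" and len(ev["args"]) == 5:
            _name, _old_ip, _old_port, new_ip, new_port = ev["args"]
            result = (new_ip, int(new_port))
    return result


log = """\
19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
""".splitlines()

events = [ev["event"] for ev in map(parse_event, log) if ev]
print(events)
print(new_master(log))
```

Lines that are not events, such as the "voted for" line, are skipped because their sixth field does not start with + or -.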

Now look at the sentinel log on the slave node:

24199:X 12 Dec 15:10:19.210 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
24199:X 12 Dec 15:10:19.249 # +sdown master mymaster 172.16.203.10 6379
24199:X 12 Dec 15:10:19.249 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
24199:X 12 Dec 15:10:19.249 # Next failover delay: I will not start a failover before Sat Dec 12 15:10:50 2015
24199:X 12 Dec 15:10:20.299 # +config-update-from sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
24199:X 12 Dec 15:10:20.299 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
24199:X 12 Dec 15:10:20.299 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
24199:X 12 Dec 15:10:25.315 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
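
The "Next failover delay" line can be checked against the rule that a failover is retried only after twice the failover-timeout. The arithmetic below is a sketch that assumes the delay is counted from the moment this sentinel voted (15:10:19.210). The name earliest_retry is made up, and the year is taken from the log line itself.

```python
from datetime import datetime, timedelta

# From `sentinel failover-timeout mymaster 15000` in sentinel.conf
FAILOVER_TIMEOUT_MS = 15000


def earliest_retry(start_time, failover_timeout_ms):
    """Earliest time a new failover may start: start time + 2 x failover-timeout."""
    return start_time + timedelta(milliseconds=2 * failover_timeout_ms)


# Timestamp of the +vote-for-leader line in the slave node's sentinel log
voted = datetime(2015, 12, 12, 15, 10, 19, 210000)
retry = earliest_retry(voted, FAILOVER_TIMEOUT_MS)
print(retry)
```

The result, 15:10:49.21, is within about a second of the 15:10:50 that sentinel logged. The sketch ignores any rounding or small offset sentinel itself applies.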

Next, stop the sentinel on the old master node:
+sdown sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.4 6379

Problems

1. Stop one sentinel, then stop the master. The remaining sentinel stays in this state indefinitely:

18430:X 12 Dec 11:36:37.949 # +new-epoch 68
18430:X 12 Dec 11:36:37.949 # +try-failover master mymaster 127.0.0.1 6380
18430:X 12 Dec 11:36:39.179 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 68
18430:X 12 Dec 11:36:48.077 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
18430:X 12 Dec 11:36:48.177 # Next failover delay: I will not start a failover before Sat Dec 12 11:42:38 2015
18430:X 12 Dec 11:42:38.057 # +new-epoch 69
18430:X 12 Dec 11:42:38.057 # +try-failover master mymaster 127.0.0.1 6380
18430:X 12 Dec 11:42:38.106 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 69
18430:X 12 Dec 11:42:48.443 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
18430:X 12 Dec 11:42:48.544 # Next failover delay: I will not start a failover before Sat Dec 12 11:48:38 2015

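This is where the sentinel leader election process matters. Every sentinel that finds the master objectively down includes its own run ID when it sends the is-master-down-by-addr query, asking the other sentinels to set it as their local leader. Local leadership is first come, first served: only the first sentinel whose is-master-down-by-addr query arrives is set as the local leader, and later requests are rejected. If one sentinel is set as local leader by more than half of the sentinels, it becomes the leader.
Note the phrase "more than half". Although we stopped one sentinel, it is still recorded in the configuration file, so the number of sentinels is still 2. More than half of 2 is 2, but only one sentinel is actually running, so a leader can never be elected and no failover takes place.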
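
The arithmetic behind this can be written down in a few lines. This is a sketch under one assumption not stated above: that the winner also needs at least as many votes as the configured quorum, which is how the Sentinel documentation describes it. The function names are hypothetical.

```python
def votes_needed(known_sentinels, quorum):
    """Votes required to become leader: a majority of known sentinels, and at least quorum."""
    majority = known_sentinels // 2 + 1
    return max(majority, quorum)


def can_elect_leader(known_sentinels, running_sentinels, quorum):
    """True if the sentinels still running could supply enough votes."""
    return running_sentinels >= votes_needed(known_sentinels, quorum)


# This article's setup after stopping one sentinel: 2 known, 1 running, quorum 1
print(votes_needed(2, 1), can_elect_leader(2, 1, 1))
# Three sentinels with one stopped: 3 known, 2 running, quorum 2
print(votes_needed(3, 2), can_elect_leader(3, 2, 2))
```

With two known sentinels the required vote count is 2, so a single surviving sentinel cannot win. With three sentinels, losing one still leaves a majority, which is one reason three nodes are recommended in production.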