Redis Sentinel: Installation and Configuration
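Like MySQL, Redis ships with built-in master-slave replication. In production, if the master crashes, we want a slave to be promoted to master automatically and take over the service right away. How can this be done? A Redis Sentinel cluster provides exactly this capability.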
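Redis Sentinel is the official, native high-availability solution for Redis. A Redis Sentinel deployment has two parts: the Redis Sentinel cluster and the Redis master-slave cluster. The Sentinel cluster is a distributed cluster made up of several Sentinel nodes, and it provides failure detection, failover, a configuration center, and client notification.
The number of Sentinel nodes should be odd, that is 2n+1 with n>=1 (the official recommendation is at least 3).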
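Redis Sentinel features
(1) Failover between the master and its slaves is driven by Sentinel monitoring. For example, with 5 Sentinels and a quorum of 2 in the configuration, a failover is triggered as soon as 2 Sentinels consider the master crashed, but the Sentinel that performs the failover must first be authorized by at least 3 Sentinels (a majority) before it can proceed;
(2) A Sentinel cluster never has several Sentinels running a failover at the same time. If the first Sentinel's failover attempt fails, another Sentinel retries after a certain delay, and so on;
(3) After a failover, the Sentinel obtains a new configuration version number for the master and broadcasts it to the other Sentinels, so a Sentinel cluster whose members can talk to each other eventually converges on the same configuration, the one with the highest version number;
(4) Redis Sentinel version 1 shipped with Redis 2.6 and Redis Sentinel version 2 with Redis 2.8; Sentinel 2 is recommended;
(5) Redis Sentinel is the high-availability (HA) solution officially recommended for Redis. It runs as a separate process and can monitor multiple master-slave clusters, switching over automatically when it detects that a master is down. One Sentinel deployment can monitor any number of masters, together with the slaves under each of them, and automatically performs a failover when a monitored master goes offline.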
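The arithmetic behind point (1) and the 2n+1 sizing rule can be sketched in a few lines. This is only an illustration: the function names are made up for this example, and the rule that the vote count must reach the majority and never be lower than the quorum is stated here as an assumption about how Sentinel authorizes a failover.

```python
def majority(total_sentinels: int) -> int:
    """Smallest strict majority among all configured Sentinels."""
    return total_sentinels // 2 + 1


def tolerated_failures(total_sentinels: int) -> int:
    """How many Sentinels may be lost while a majority can still be formed."""
    return total_sentinels - majority(total_sentinels)


def failover_authorized(votes: int, total_sentinels: int, quorum: int) -> bool:
    """True when a Sentinel holds enough votes to carry out the failover.

    Assumption: the vote count must reach the majority and must not be
    lower than the configured quorum.
    """
    return votes >= max(majority(total_sentinels), quorum)


if __name__ == "__main__":
    total, quorum = 5, 2                          # the example from point (1)
    print(majority(total))                        # 3
    print(failover_authorized(2, total, quorum))  # False: quorum met, majority not
    print(failover_authorized(3, total, quorum))  # True
    for n in (3, 4, 5):
        print(n, tolerated_failures(n))           # 3 -> 1, 4 -> 1, 5 -> 2
```

Going from 3 to 4 Sentinels does not raise the number of tolerated failures, which is one way to see why an odd node count is recommended.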
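SDOWN and ODOWN
SDOWN (subjectively down) is the state in which a single Sentinel, from its own point of view, detects that the master is down;
ODOWN (objectively down) requires that enough Sentinels, at least the configured quorum, consider the master down;
Moving from SDOWN to ODOWN does not need any consensus algorithm, only a gossip protocol: if a Sentinel receives messages from enough other Sentinels telling it that a master is down, the SDOWN state is promoted to ODOWN. If the master becomes available again later, the state is cleared accordingly.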
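The SDOWN to ODOWN transition can be modelled as a small state object. The sketch below is a simplified, hypothetical model (the class and method names are invented for illustration, and real Sentinels also expire old reports over time); it only shows that ODOWN is a count of agreeing reports compared against the quorum, not the result of a vote.

```python
class MasterView:
    """One Sentinel's view of a monitored master (hypothetical model)."""

    def __init__(self, quorum: int) -> None:
        self.quorum = quorum
        self.local_sdown = False   # this Sentinel's own judgement
        self.peer_sdown = set()    # ids of peers that reported the master down

    def report(self, peer_id: str, is_down: bool) -> None:
        """Record what a peer Sentinel says about the master (gossip)."""
        if is_down:
            self.peer_sdown.add(peer_id)
        else:
            self.peer_sdown.discard(peer_id)

    @property
    def odown(self) -> bool:
        """ODOWN: local SDOWN plus enough agreeing peers to reach the quorum."""
        return self.local_sdown and (1 + len(self.peer_sdown)) >= self.quorum

    def master_recovered(self) -> None:
        """A valid reply from the master clears the down state (simplified)."""
        self.local_sdown = False
        self.peer_sdown.clear()


if __name__ == "__main__":
    view = MasterView(quorum=2)
    view.local_sdown = True
    print(view.odown)                 # False: only this Sentinel sees it down
    view.report("ab45fe6c", True)
    print(view.odown)                 # True: quorum of 2 reached
    view.master_recovered()
    print(view.odown)                 # False
```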
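sentinel.conf parameters
port 26379
# Port the Sentinel listens on
sentinel monitor mymaster 127.0.0.1 6379 2
# Name and address of the monitored master (mymaster in the default configuration). The trailing 2 is the quorum: once two Sentinels consider the master down, the master is treated as unavailable;
Note: by configuring different master names, one Sentinel cluster can monitor several Redis master-slave clusters;
sentinel down-after-milliseconds mymaster 30000
# Default 30 seconds. Sentinel pings the master to check whether it is alive: if the master returns a valid reply such as PONG within 30 seconds it is considered healthy, otherwise the Sentinel considers it unavailable;
sentinel parallel-syncs mymaster 1
# Once the Sentinels agree that the master has failed, the leader Sentinel performs the failover and selects a new master, and the former slaves start replicating from it. This setting limits the number of slaves that resync with the new master at the same time to 1;
sentinel failover-timeout mymaster 180000
# Failover timeout of 3 minutes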
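Since every monitored master repeats the same four directives under its own name, the file is easy to generate. The following is a minimal sketch, assuming a hypothetical helper named render_sentinel_conf; the second master, "othermaster", and its address are invented purely to show one Sentinel configuration covering two master-slave clusters.

```python
from dataclasses import dataclass


@dataclass
class MonitoredMaster:
    name: str
    host: str
    port: int
    quorum: int
    down_after_ms: int = 30000         # sentinel down-after-milliseconds
    parallel_syncs: int = 1            # sentinel parallel-syncs
    failover_timeout_ms: int = 180000  # sentinel failover-timeout


def render_sentinel_conf(masters, sentinel_port: int = 26379) -> str:
    """Render the directives described above for one or more masters."""
    lines = [f"port {sentinel_port}"]
    for m in masters:
        lines += [
            f"sentinel monitor {m.name} {m.host} {m.port} {m.quorum}",
            f"sentinel down-after-milliseconds {m.name} {m.down_after_ms}",
            f"sentinel parallel-syncs {m.name} {m.parallel_syncs}",
            f"sentinel failover-timeout {m.name} {m.failover_timeout_ms}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    conf = render_sentinel_conf([
        MonitoredMaster("mymaster", "127.0.0.1", 6379, 2),
        # "othermaster" and its address are hypothetical, for illustration only
        MonitoredMaster("othermaster", "127.0.0.1", 6380, 2),
    ])
    print(conf, end="")
```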
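Authentication with Redis Sentinel
When a master is configured to require a password, both clients and slaves must supply it when connecting;
The master sets its own password with requirepass; without the password, no connection to this master is possible;
A slave sets the password it uses to reach the master with masterauth;
With Sentinel, however, a master may become a slave and a slave may become a master, so both options must be set on every Redis node.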
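As a rough illustration of that rule, the sketch below collects the directives each kind of node would carry. The helper name and the placeholder password are hypothetical. The `sentinel auth-pass` line is not mentioned in the text above; it is the sentinel.conf directive Sentinel uses to authenticate against password-protected instances, and is included here on the assumption that the Sentinels must also be able to log in.

```python
def auth_directives(master_name: str, password: str) -> dict:
    """Return the auth-related lines per config file (illustrative helper)."""
    return {
        # Every Redis node gets both, because roles can swap after a failover.
        "redis.conf": [
            f"requirepass {password}",
            f"masterauth {password}",
        ],
        # Lets the Sentinels authenticate against the monitored instances.
        "sentinel.conf": [
            f"sentinel auth-pass {master_name} {password}",
        ],
    }


if __name__ == "__main__":
    # "change-me" is a placeholder, not a value from the original setup
    for filename, lines in auth_directives("mymaster", "change-me").items():
        print(f"# {filename}")
        for line in lines:
            print(line)
```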
Installing Redis Sentinel
(1) Redis Sentinel architecture and node environment
Role | Host | IP | Port |
Sentinel1 | sht-sgmhadoopcm-01 | 172.16.101.54 | 26379 |
Sentinel2 | sht-sgmhadoopnn-01 | 172.16.101.55 | 26379 |
Sentinel3 | sht-sgmhadoopnn-02 | 172.16.101.56 | 26379 |
Master | sht-sgmhadoopdn-01 | 172.16.101.58 | 6379 |
Slave1 | sht-sgmhadoopdn-02 | 172.16.101.59 | 6379 |
Slave2 | sht-sgmhadoopdn-03 | 172.16.101.60 | 6379 |
(2) Configure Redis master-slave replication
[root@sht-sgmhadoopdn-01 redis]# vim redis.conf
bind 172.16.101.58
[root@sht-sgmhadoopdn-01 redis]# src/redis-server redis.conf
[root@sht-sgmhadoopdn-02 redis]# vim redis.conf
bind 172.16.101.59
slaveof 172.16.101.58 6379
[root@sht-sgmhadoopdn-02 redis]# src/redis-server redis.conf
[root@sht-sgmhadoopdn-03 redis]# vim redis.conf
bind 172.16.101.60
slaveof 172.16.101.58 6379
[root@sht-sgmhadoopdn-03 redis]# src/redis-server redis.conf
Check the replication setup
[root@sht-sgmhadoopdn-01 redis]# src/redis-cli -h 172.16.101.58
172.16.101.58:6379> client list
id=3 addr=172.16.101.59:35718 fd=7 name= age=26 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf
id=4 addr=172.16.101.60:33986 fd=8 name= age=22 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf
id=5 addr=172.16.101.58:38875 fd=9 name= age=4 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client
172.16.101.58:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=172.16.101.59,port=6379,state=online,offset=57,lag=0
slave1:ip=172.16.101.60,port=6379,state=online,offset=57,lag=0
master_repl_offset:57
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:56
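The same check can be scripted by parsing the `info replication` text instead of reading it by eye. This is a minimal sketch that works on the captured output only (it does not connect to Redis); the function names are invented for this example and the sample text is copied from the session above.

```python
INFO_REPLICATION = """\
# Replication
role:master
connected_slaves:2
slave0:ip=172.16.101.59,port=6379,state=online,offset=57,lag=0
slave1:ip=172.16.101.60,port=6379,state=online,offset=57,lag=0
master_repl_offset:57
"""


def parse_info(text: str) -> dict:
    """Turn `key:value` lines of INFO output into a dict, skipping headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def online_slaves(info: dict) -> list:
    """List `ip:port` of every slaveN entry whose state is online."""
    result = []
    for key, value in info.items():
        if key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            if fields.get("state") == "online":
                result.append(f"{fields['ip']}:{fields['port']}")
    return result


if __name__ == "__main__":
    info = parse_info(INFO_REPLICATION)
    print(info["role"], info["connected_slaves"])
    print(online_slaves(info))
```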
(3) Configure the Sentinel cluster
The sentinel.conf file is identical on all three Sentinel nodes; if several Sentinels run on the same host, each needs a different port
[root@sht-sgmhadoopcm-01 redis]# vim sentinel.conf
port 26379
daemonize yes
protected-mode no
logfile "sentinel.log"
dir /usr/local/redis
sentinel monitor mymaster 172.16.101.58 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
A Sentinel node can be started in two ways:
src/redis-sentinel sentinel.conf
src/redis-server sentinel.conf --sentinel
[root@sht-sgmhadoopcm-01 redis]# src/redis-sentinel sentinel.conf
[root@sht-sgmhadoopnn-01 redis]# src/redis-sentinel sentinel.conf
[root@sht-sgmhadoopnn-02 redis]# src/redis-sentinel sentinel.conf
[root@sht-sgmhadoopcm-01 redis]# ps -ef|grep redis|grep -v grep
root 7541 1 0 22:33 ? 00:00:00 src/redis-sentinel *:26379 [sentinel]
(4) Check the state of the whole cluster
[root@sht-sgmhadoopcm-01 redis]# src/redis-cli -h 172.16.101.54 -p 26379
172.16.101.54:26379> client list
id=3 addr=172.16.101.55:43182 fd=13 name=sentinel-ab45fe6c-cmd age=138 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=4 addr=172.16.101.56:60016 fd=15 name=sentinel-e32f20c0-cmd age=136 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=5 addr=172.16.101.54:35342 fd=17 name= age=26 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client
172.16.101.54:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.16.101.58:6379,slaves=2,sentinels=3
[root@sht-sgmhadoopdn-01 redis]# src/redis-cli -h 172.16.101.58 -p 6379
172.16.101.58:6379> client list
id=16 addr=172.16.101.54:56510 fd=10 name=sentinel-30393e76-pubsub age=326 idle=0 flags=N db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=17 addr=172.16.101.54:56508 fd=11 name=sentinel-30393e76-cmd age=326 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=18 addr=172.16.101.55:57444 fd=12 name=sentinel-ab45fe6c-cmd age=177 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=publish
id=19 addr=172.16.101.55:57446 fd=13 name=sentinel-ab45fe6c-pubsub age=177 idle=0 flags=N db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=3 addr=172.16.101.59:35718 fd=7 name= age=3936 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf
id=4 addr=172.16.101.60:33986 fd=8 name= age=3932 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=replconf
id=20 addr=172.16.101.56:55648 fd=14 name=sentinel-e32f20c0-cmd age=173 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=21 addr=172.16.101.56:55650 fd=15 name=sentinel-e32f20c0-pubsub age=173 idle=0 flags=N db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=5 addr=172.16.101.58:38875 fd=9 name= age=3914 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client
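The one line of `info sentinel` that matters most here is the `master0:` summary. A possible way to turn it into an automated health check is sketched below; the function names and the list-of-problems return style are choices made for this example, and the expected counts (2 slaves, 3 Sentinels) come from the node table of this setup.

```python
MASTER_LINE = (
    "master0:name=mymaster,status=ok,"
    "address=172.16.101.58:6379,slaves=2,sentinels=3"
)


def parse_master_line(line: str) -> dict:
    """Split `masterN:k=v,k=v,...` into a dict of its fields."""
    _, _, payload = line.strip().partition(":")
    return dict(item.split("=", 1) for item in payload.split(","))


def health_problems(line: str, expected_slaves: int, expected_sentinels: int) -> list:
    """Return human-readable problems; an empty list means the view looks healthy."""
    fields = parse_master_line(line)
    problems = []
    if fields.get("status") != "ok":
        problems.append(f"status is {fields.get('status')}")
    if int(fields.get("slaves", 0)) != expected_slaves:
        problems.append(f"slaves={fields.get('slaves')}, expected {expected_slaves}")
    if int(fields.get("sentinels", 0)) != expected_sentinels:
        problems.append(
            f"sentinels={fields.get('sentinels')}, expected {expected_sentinels}"
        )
    return problems


if __name__ == "__main__":
    print(parse_master_line(MASTER_LINE)["address"])
    print(health_problems(MASTER_LINE, expected_slaves=2, expected_sentinels=3))
```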
After the master, the slaves, and the Sentinel nodes have been started, Sentinel automatically adds or rewrites parameters in sentinel.conf
[root@sht-sgmhadoopcm-01 redis]# cat sentinel.conf
sentinel myid 30393e76e002cb64db92fb8bcb88d79f2d85a82b
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-slave mymaster 172.16.101.60 6379
sentinel known-slave mymaster 172.16.101.59 6379
sentinel known-sentinel mymaster 172.16.101.55 26379 ab45fe6c0f010473ce3b7b4d2120e1a83776b736
sentinel known-sentinel mymaster 172.16.101.56 26379 e32f20c0f315e712c9921371f15729246f3816a0
sentinel current-epoch 0
[root@sht-sgmhadoopnn-01 redis]# cat sentinel.conf
sentinel myid ab45fe6c0f010473ce3b7b4d2120e1a83776b736
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-slave mymaster 172.16.101.60 6379
sentinel known-slave mymaster 172.16.101.59 6379
sentinel known-sentinel mymaster 172.16.101.56 26379 e32f20c0f315e712c9921371f15729246f3816a0
sentinel known-sentinel mymaster 172.16.101.54 26379 30393e76e002cb64db92fb8bcb88d79f2d85a82b
sentinel current-epoch 0
[root@sht-sgmhadoopnn-02 redis]# cat sentinel.conf
sentinel myid e32f20c0f315e712c9921371f15729246f3816a0
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-slave mymaster 172.16.101.60 6379
sentinel known-slave mymaster 172.16.101.59 6379
sentinel known-sentinel mymaster 172.16.101.54 26379 30393e76e002cb64db92fb8bcb88d79f2d85a82b
sentinel known-sentinel mymaster 172.16.101.55 26379 ab45fe6c0f010473ce3b7b4d2120e1a83776b736
sentinel current-epoch 0
Testing automatic failover
[root@sht-sgmhadoopdn-01 redis]# ps -ef|grep redis
root 15128 1 0 21:17 ? 00:00:05 src/redis-server 172.16.101.58:6379
[root@sht-sgmhadoopdn-01 redis]# kill -9 15128
[root@sht-sgmhadoopcm-01 redis]# tail -f sentinel.log
7541:X 05 Aug 22:55:48.052 # +sdown master mymaster 172.16.101.58 6379
# This Sentinel subjectively considers the master crashed; this is logged only after the master has gone unanswered for down-after-milliseconds (30s here);
7541:X 05 Aug 22:55:48.143 # +odown master mymaster 172.16.101.58 6379 #quorum 2/2
# Once two Sentinel nodes consider the master crashed, it is objectively considered crashed
7541:X 05 Aug 22:55:48.143 # +new-epoch 1
7541:X 05 Aug 22:55:48.143 # +try-failover master mymaster 172.16.101.58 6379
7541:X 05 Aug 22:55:48.165 # +vote-for-leader 30393e76e002cb64db92fb8bcb88d79f2d85a82b 1
7541:X 05 Aug 22:55:48.166 # ab45fe6c0f010473ce3b7b4d2120e1a83776b736 voted for ab45fe6c0f010473ce3b7b4d2120e1a83776b736 1
7541:X 05 Aug 22:55:48.173 # e32f20c0f315e712c9921371f15729246f3816a0 voted for ab45fe6c0f010473ce3b7b4d2120e1a83776b736 1
7541:X 05 Aug 22:55:48.544 # +config-update-from sentinel ab45fe6c0f010473ce3b7b4d2120e1a83776b736 172.16.101.55 26379 @ mymaster 172.16.101.58 6379
7541:X 05 Aug 22:55:48.544 # +switch-master mymaster 172.16.101.58 6379 172.16.101.60 6379
7541:X 05 Aug 22:55:48.545 * +slave slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.60 6379
7541:X 05 Aug 22:55:48.545 * +slave slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
# The failover is already complete at +switch-master above. The 30s gap before the next line comes from sentinel down-after-milliseconds mymaster: the old master, now registered as a slave of the new master, is still unreachable and is flagged +sdown after 30s without a valid reply;
7541:X 05 Aug 22:56:18.562 # +sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
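For longer logs it can help to extract the events programmatically. The sketch below assumes the log line layout seen in this sentinel.log (pid, role letter, timestamp, a `#` or `*` marker, then an event name starting with `+` or `-`); the regular expression and function names are written for this example and may need adjusting for other Redis versions. Lines without an event name, such as the "voted for" lines, are simply skipped.

```python
import re

SAMPLE_LOG = """\
7541:X 05 Aug 22:55:48.052 # +sdown master mymaster 172.16.101.58 6379
7541:X 05 Aug 22:55:48.143 # +odown master mymaster 172.16.101.58 6379 #quorum 2/2
7541:X 05 Aug 22:55:48.166 # ab45fe6c0f010473ce3b7b4d2120e1a83776b736 voted for ab45fe6c0f010473ce3b7b4d2120e1a83776b736 1
7541:X 05 Aug 22:55:48.544 # +switch-master mymaster 172.16.101.58 6379 172.16.101.60 6379
7541:X 05 Aug 22:56:18.562 # +sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
"""

EVENT_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<time>\d{2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"[#*] (?P<event>[+-][a-z-]+) (?P<details>.*)$"
)


def parse_events(log_text: str) -> list:
    """Return (time, event, details) tuples for every Sentinel event line."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events.append((match["time"], match["event"], match["details"]))
    return events


def new_master(events: list):
    """Return (master name, 'ip:port' of the new master) from +switch-master."""
    for _, event, details in events:
        if event == "+switch-master":
            name, _old_ip, _old_port, new_ip, new_port = details.split()
            return name, f"{new_ip}:{new_port}"
    return None


if __name__ == "__main__":
    events = parse_events(SAMPLE_LOG)
    for time, event, _ in events:
        print(time, event)
    print(new_master(events))
```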
The master and slave roles have changed: 172.16.101.60 is the new master and 172.16.101.58 is now a slave
[root@sht-sgmhadoopcm-01 redis]# src/redis-cli -h 172.16.101.54 -p 26379
172.16.101.54:26379> sentinel masters
1) 1) "name"
   2) "mymaster"
   3) "ip"
   4) "172.16.101.60"
......
172.16.101.54:26379> sentinel slaves mymaster
1) 1) "name"
   2) "172.16.101.58:6379"
   3) "ip"
   4) "172.16.101.58"
   9) "flags"
   10) "s_down,slave,disconnected"
2) 1) "name"
   2) "172.16.101.59:6379"
   3) "ip"
   4) "172.16.101.59"
   9) "flags"
   10) "slave"
After the repaired old master is restarted, it automatically becomes a slave of the new master
[root@sht-sgmhadoopdn-01 redis]# src/redis-server redis.conf
[root@sht-sgmhadoopcm-01 redis]# tail -f sentinel.log
7541:X 05 Aug 23:11:10.556 # -sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
7541:X 05 Aug 23:11:20.518 * +convert-to-slave slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
Summary:
Failover process analysis:
Each Sentinel detects the master is down with an +sdown event.
This event is later escalated to +odown, which means that multiple Sentinels agree about the fact the master is not reachable.
Sentinels vote a Sentinel that will start the first failover attempt.
The failover happens.
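Each Sentinel node periodically pings the Redis master to check whether it is alive. When the master crashes,
a Sentinel first marks it subjectively down on its own, once the master has gone 30s (down-after-milliseconds) without a valid reply. The three Sentinels then exchange their views, and as soon as two of them consider the master crashed it is objectively considered crashed.
Next the three Sentinels vote, and the Sentinel that obtains two votes (a majority of three) carries out the failover.
Once the failed old master has been repaired and restarted, it rejoins as a slave of the new master;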
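The voting step can be replayed with the votes visible in the sentinel.log excerpt from the failover test. The sketch below uses the 8-character run ID prefixes from that log as voter and candidate names; the elect_leader function is a simplified illustration written for this example, and the winning threshold (a majority, and never fewer than the quorum) is stated as an assumption rather than a description of the full election protocol.

```python
from collections import Counter

# voter -> candidate, epoch 1, as read from the sentinel.log excerpt
VOTES_EPOCH_1 = {
    "30393e76": "30393e76",  # +vote-for-leader: this Sentinel voted for itself
    "ab45fe6c": "ab45fe6c",
    "e32f20c0": "ab45fe6c",
}


def elect_leader(votes: dict, total_sentinels: int, quorum: int):
    """Return the winning candidate, or None if nobody has enough votes."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    needed = max(total_sentinels // 2 + 1, quorum)
    return candidate if count >= needed else None


if __name__ == "__main__":
    leader = elect_leader(VOTES_EPOCH_1, total_sentinels=3, quorum=2)
    print(leader)
```

The result matches the log, where the configuration update (+config-update-from) is received from the Sentinel at 172.16.101.55, whose run ID starts with ab45fe6c. If every Sentinel voted for a different candidate, the function would return None and, as described in feature (2), another attempt would follow later.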
FAQ
Error1:
[root@sht-sgmhadoopcm-01 redis]# src/redis-cli -h 172.16.101.54 -p 26379
172.16.101.54:26379> ping
(error) DENIED Redis is running in protected mode because protected mode is enabled, no bind address was specified, no authentication password is requested to clients. In this mode connections are only accepted from the loopback interface. If you want to connect from external computers to Redis you may adopt one of the following solutions: 1) Just disable protected mode sending the command 'CONFIG SET protected-mode no' from the loopback interface by connecting to Redis from the same host the server is running, however MAKE SURE Redis is not publicly accessible from internet if you do so. Use CONFIG REWRITE to make this change permanent. 2) Alternatively you can just disable the protected mode by editing the Redis configuration file, and setting the protected mode option to 'no', and then restarting the server. 3) If you started the server manually just for testing, restart it with the '--protected-mode no' option. 4) Setup a bind address or an authentication password. NOTE: You only need to do one of the above things in order for the server to start accepting connections from the outside.
Solution:
[root@sht-sgmhadoopcm-01 redis]# vim sentinel.conf
protected-mode no
Reference: Redis Sentinel Documentation https://redis.io/topics/sentinel