Redis Cluster Failure Analysis
阿新 · Published 2018-12-25
Failure symptoms:
The application layer reported that queries to Redis were failing.
Cluster topology:
3 masters and 3 slaves; each node holds about 8 GB of data.
Machine distribution:
All machines are in the same rack:
xx.x.xxx.199
xx.x.xxx.200
xx.x.xxx.201
redis-server process status:
Using the command ps -eo pid,lstart | grep $pid (where $pid is the redis-server process ID),
we found the process had been running continuously for 3 months.
Cluster node state before the failure:
xx.x.xxx.200:8371(bedab2c537fe94f8c0363ac4ae97d56832316e65) master
xx.x.xxx.199:8373(792020fe66c00ae56e27cd7a048ba6bb2b67adb6) slave
xx.x.xxx.201:8375(5ab4f85306da6d633e4834b4d3327f45af02171b) master
xx.x.xxx.201:8372(826607654f5ec81c3756a4a21f357e644efe605a) slave
xx.x.xxx.199:8370(462cadcb41e635d460425430d318f2fe464665c5) master
xx.x.xxx.200:8374(1238085b578390f3c8efa30824fd9a4baba10ddf) slave
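The roles and node IDs above can be listed from any member of the cluster with the CLUSTER NODES command; a minimal example, reusing one of the masked addresses from this incident:
redis-cli -h xx.x.xxx.199 -p 8370 cluster nodes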
--------------------------------- Log analysis ---------------------------------
Step 1:
Master 8371 loses its connection to slave 8373:
46590:M 09 Sep 18:57:51.379 # Connection with slave xx.x.xxx.199:8373 lost.
Step 2:
Masters 8370/8375 mark 8371 as failing (quorum reached):
42645:M 09 Sep 18:57:50.117 * Marking node bedab2c537fe94f8c0363ac4ae97d56832316e65 as failing (quorum reached).
Step 3:
Slaves 8372/8373/8374 receive a FAIL message from master 8375 about 8371:
46986:S 09 Sep 18:57:50.120 * FAIL message received from 5ab4f85306da6d633e4834b4d3327f45af02171b about bedab2c537fe94f8c0363ac4ae97d56832316e65
Step 4:
Masters 8370/8375 grant failover authorization, allowing slave 8373 to be promoted to master:
42645:M 09 Sep 18:57:51.055 # Failover auth granted to 792020fe66c00ae56e27cd7a048ba6bb2b67adb6 for epoch 16
Step 5:
The former master 8371 detects the configuration change and reconfigures itself as a slave of 8373:
46590:M 09 Sep 18:57:51.488 # Configuration change detected. Reconfiguring myself as a replica of 792020fe66c00ae56e27cd7a048ba6bb2b67adb6
Step 6:
Masters 8370/8375/8373 clear the FAIL state for 8371, since it is now a master without slots that is reachable again:
42645:M 09 Sep 18:57:51.522 * Clear FAIL state for node bedab2c537fe94f8c0363ac4ae97d56832316e65: master without slots is reachable again.
Step 7:
The new slave 8371 starts its first full resync from the new master 8373:
Log on 8373:
4255:M 09 Sep 18:57:51.906 * Full resync requested by slave xx.x.xxx.200:8371
4255:M 09 Sep 18:57:51.906 * Starting BGSAVE for SYNC with target: disk
4255:M 09 Sep 18:57:51.941 * Background saving started by pid 5230
Log on 8371:
46590:S 09 Sep 18:57:51.948 * Full resync from master: d7751c4ebf1e63d3baebea1ed409e0e7243a4423:440721826993
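The progress of a full resync like this one can be followed from the slave side with INFO replication; a hedged sketch (the field values mentioned are illustrative, not taken from the incident):
redis-cli -h xx.x.xxx.200 -p 8371 info replication
# while the transfer is running, fields such as master_sync_in_progress:1 and
# master_sync_left_bytes show that a full sync is underway and how much data remains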
Step 8:
Masters 8370/8375 mark 8373 (the new master) as failing (quorum reached):
42645:M 09 Sep 18:58:00.320 * Marking node 792020fe66c00ae56e27cd7a048ba6bb2b67adb6 as failing (quorum reached).
Step 9:
Masters 8370/8375 determine that 8373 (the new master) is reachable again and clear its FAIL state:
60295:M 09 Sep 18:58:18.181 * Clear FAIL state for node 792020fe66c00ae56e27cd7a048ba6bb2b67adb6: is reachable again and nobody is serving its slots after some time.
Step 10:
Master 8373 completes the BGSAVE required for the full resync:
5230:C 09 Sep 18:59:01.474 * DB saved on disk
5230:C 09 Sep 18:59:01.491 * RDB: 7112 MB of memory used by copy-on-write
4255:M 09 Sep 18:59:01.877 * Background saving terminated with success
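While a save like this is running, its state can be checked on the master with INFO persistence; an illustrative sketch:
redis-cli -h xx.x.xxx.199 -p 8373 info persistence
# rdb_bgsave_in_progress:1 means the BGSAVE triggered by the resync is still running;
# rdb_last_bgsave_time_sec shows how long the previous background save took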
Step 11:
Slave 8371 starts receiving data from master 8373:
46590:S 09 Sep 18:59:02.263 * MASTER <-> SLAVE sync: receiving 2657606930 bytes from master
Step 12:
Master 8373 finds that slave 8371 has exceeded the configured output buffer limit and schedules its connection to be closed:
4255:M 09 Sep 19:00:19.014 # Client id=14259015 addr=xx.x.xxx.200:21772 fd=844 name= age=148 idle=148 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16349 oll=4103 omem=95944066 events=rw cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.
4255:M 09 Sep 19:00:19.015 # Connection with slave xx.x.xxx.200:8371 lost.
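Before the limit is hit, the output-buffer usage of the replication client can be watched on the master with CLIENT LIST; a hedged sketch (the grep filter just selects clients whose flags contain S, i.e. slaves):
redis-cli -h xx.x.xxx.199 -p 8373 client list | grep 'flags=S'
# the obl / oll / omem fields report how much pending replication output is queued for the slave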
Step 13:
Slave 8371 fails to sync from master 8373: the connection is lost and the first full resync fails:
46590:S 09 Sep 19:00:19.018 # I/O error trying to sync with MASTER: connection lost
46590:S 09 Sep 19:00:20.102 * Connecting to MASTER xx.x.xxx.199:8373
46590:S 09 Sep 19:00:20.102 * MASTER <-> SLAVE sync started
Step 14:
Slave 8371 tries to resync but cannot connect: master 8373 has reached its maximum number of client connections:
46590:S 09 Sep 19:00:21.103 * Connecting to MASTER xx.x.xxx.199:8373
46590:S 09 Sep 19:00:21.103 * MASTER <-> SLAVE sync started
46590:S 09 Sep 19:00:21.104 * Non blocking connect for SYNC fired the event.
46590:S 09 Sep 19:00:21.104 # Error reply to PING from master: '-ERR max number of clients reached'
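Whether an instance is running into its connection ceiling can be checked with CONFIG GET maxclients and INFO clients; a minimal sketch against the new master:
redis-cli -h xx.x.xxx.199 -p 8373 config get maxclients
redis-cli -h xx.x.xxx.199 -p 8373 info clients
# connected_clients in the output shows how close the instance is to the maxclients limit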
Step 15:
Slave 8371 reconnects to master 8373 and the second full resync begins:
Log on 8371:
46590:S 09 Sep 19:00:49.175 * Connecting to MASTER xx.x.xxx.199:8373
46590:S 09 Sep 19:00:49.175 * MASTER <-> SLAVE sync started
46590:S 09 Sep 19:00:49.175 * Non blocking connect for SYNC fired the event.
46590:S 09 Sep 19:00:49.176 * Master replied to PING, replication can continue...
46590:S 09 Sep 19:00:49.179 * Partial resynchronization not possible (no cached master)
46590:S 09 Sep 19:00:49.501 * Full resync from master: d7751c4ebf1e63d3baebea1ed409e0e7243a4423:440780763454
Log on 8373:
4255:M 09 Sep 19:00:49.176 * Slave xx.x.xxx.200:8371 asks for synchronization
4255:M 09 Sep 19:00:49.176 * Full resync requested by slave xx.x.xxx.200:8371
4255:M 09 Sep 19:00:49.176 * Starting BGSAVE for SYNC with target: disk
4255:M 09 Sep 19:00:49.498 * Background saving started by pid 18413
18413:C 09 Sep 19:01:52.466 * DB saved on disk
18413:C 09 Sep 19:01:52.620 * RDB: 2124 MB of memory used by copy-on-write
4255:M 09 Sep 19:01:53.186 * Background saving terminated with success
Step 16:
Slave 8371 finishes receiving the data, flushes its old data, and starts loading the new dataset into memory:
46590:S 09 Sep 19:01:53.190 * MASTER <-> SLAVE sync: receiving 2637183250 bytes from master
46590:S 09 Sep 19:04:51.485 * MASTER <-> SLAVE sync: Flushing old data
46590:S 09 Sep 19:05:58.695 * MASTER <-> SLAVE sync: Loading DB in memory
Step 17:
The cluster returns to normal:
42645:M 09 Sep 19:05:58.786 * Clear FAIL state for node bedab2c537fe94f8c0363ac4ae97d56832316e65: slave is reachable again.
Step 18:
Slave 8371 completes the sync successfully; the whole process took about 7 minutes:
46590:S 09 Sep 19:08:19.303 * MASTER <-> SLAVE sync: Finished with success
Analysis of why 8371 was marked as failing:
Since all the machines are in the same rack, a network interruption is unlikely.
Checking the slow query log with SLOWLOG GET revealed that a KEYS command had been executed, taking 8.3 seconds.
The cluster node timeout turned out to be only 5 seconds (cluster-node-timeout 5000).
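These two checks can be reproduced with redis-cli on the node that was marked as failing; a minimal sketch (host/port reuse the masked values above):
redis-cli -h xx.x.xxx.200 -p 8371 slowlog get 10
redis-cli -h xx.x.xxx.200 -p 8371 config get cluster-node-timeout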
Why the node was marked as failing:
A client executed a single command that took 8.3 seconds:
2016/9/9 18:57:43 the KEYS command starts executing
2016/9/9 18:57:50 8371 is marked as failing (per the Redis logs)
2016/9/9 18:57:51 the KEYS command finishes
To summarize, there were several problems:
1. Because cluster-node-timeout was set rather low, the slow KEYS query caused the cluster to judge node 8371 as failing (a SCAN-based alternative to KEYS is sketched after this list).
2. Because 8371 was judged as failing, 8373 was promoted to master and a full master-slave sync began.
3. Because of the client-output-buffer-limit setting, the first full resync failed.
4. On top of that, a bug in the PHP client's connection pool made it reconnect to the server aggressively, producing an effect similar to a SYN flood.
5. After the first full resync failed, it took the slave 30 seconds to reconnect to the master (the maximum of 10,000 connections had been exceeded).
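Since the blocking KEYS call in problem 1 is what started the whole chain, one option is to iterate the keyspace with SCAN instead; a hedged sketch using redis-cli (the pattern is purely illustrative):
redis-cli -h xx.x.xxx.200 -p 8371 --scan --pattern 'user:*'
# --scan walks the keyspace in small SCAN batches instead of one long blocking KEYS call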
About the client-output-buffer-limit parameter:
# The syntax of every client-output-buffer-limit directive is the following:
#
# client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>
#
# A client is immediately disconnected once the hard limit is reached, or if
# the soft limit is reached and remains reached for the specified number of
# seconds (continuously).
# So for instance if the hard limit is 32 megabytes and the soft limit is
# 16 megabytes / 10 seconds, the client will get disconnected immediately
# if the size of the output buffers reach 32 megabytes, but will also get
# disconnected if the client reaches 16 megabytes and continuously overcomes
# the limit for 10 seconds.
#
# By default normal clients are not limited because they don't receive data
# without asking (in a push way), but just after a request, so only
# asynchronous clients may create a scenario where data is requested faster
# than it can read.
#
# Instead there is a default limit for pubsub and slave clients, since
# subscribers and slaves receive data in a push fashion.
#
# Both the hard or the soft limit can be disabled by setting them to zero.
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
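When a resync of several GB is pending, the slave class limit can also be changed at runtime via CONFIG SET instead of editing redis.conf and restarting; a hedged sketch (temporarily disabling the limit and restoring it afterwards is a common workaround, not something stated in this incident):
redis-cli -h xx.x.xxx.199 -p 8373 config get client-output-buffer-limit
redis-cli -h xx.x.xxx.199 -p 8373 config set client-output-buffer-limit "slave 0 0 0"
# after the resync completes, restore the previous limit:
redis-cli -h xx.x.xxx.199 -p 8373 config set client-output-buffer-limit "slave 256mb 64mb 60"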
Remediation measures:
1. Split single instances down to under 4 GB of data each; otherwise any master-slave switchover takes a very long time.
2. Adjust the client-output-buffer-limit parameter so that a resync cannot fail halfway through (see the config sketch below).
3. Increase cluster-node-timeout; it should not be less than 15 s.
4. Forbid any slow query that takes longer than cluster-node-timeout, since it will trigger a master-slave switchover.
5. Fix the client's SYN-flood-like aggressive reconnection behavior.
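A redis.conf sketch reflecting measures 2 and 3; the concrete numbers are illustrative choices for this workload, not values prescribed by the original post:
# give the slave link enough headroom to survive a multi-GB full resync
client-output-buffer-limit slave 1gb 512mb 120
# never declare a node as failed in less than 15 seconds
cluster-node-timeout 15000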
End;