Redis Key Points (rehash)

阿新 • Published: 2019-01-01

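A hash table is an efficient data structure that is widely used for key-value storage. Redis's dict is a typical hash table implementation.

Rehash is the operation of enlarging a hash table once its size no longer meets demand and hash collisions become too frequent. The usual approach is to create an additional hash table and re-insert the data from the original table into the new one, producing a new hash table.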

Redis spreads this work out, using two rehash modes:
lazy rehashing: each operation on the dict rehashes one slot.
active rehashing: 1 ms out of every 100 ms is spent on rehashing.

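The dict implementation mainly uses the following structs; it is a typical chained hash table.

A dict has two hash tables, each managed by a dictht struct and numbered 0 and 1.

Table 0 is used first. When it runs out of space, dictExpand is called to expand the hash table, and table 1 is prepared for the incremental rehash. Once the rehash completes, table 0 is freed and table 1 takes its place as table 0.

rehashidx is the index in ht[0] of the next bucket to rehash. It is set to -1 when no rehash is in progress.

iterators records the number of iterators currently running on the dict. Its main purpose is to avoid rehashing while an iterator is active, because rehashing during iteration can cause entries to be missed or returned twice.

In dictht, table is a hash table built from an array of pointers to entry chains. size is the size of the hash array (the number of buckets) and used is the number of elements in the table; both values are closely tied to the rehash and resize process. sizemask equals size-1, which makes it easy to map a hash value to an array index.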

typedef struct dictEntry {
    void *key;
    void *val;
    struct dictEntry *next;
} dictEntry;

typedef struct dictht {
    dictEntry **table;
    unsigned long size;     /* number of hash buckets */
    unsigned long sizemask; /* mask used to reduce a hash to a bucket index */
    unsigned long used;     /* number of elements */
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int iterators; /* number of iterators currently running */
} dict;

typedef struct dictIterator {
    dict *d;
    int table;
    int index;
    dictEntry *entry, *nextEntry;
} dictIterator;
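
To see why sizemask is stored as size-1, here is a minimal sketch (not Redis code; bucket_index is a hypothetical helper). It assumes the bucket count is a power of two, in which case masking gives the same result as a modulo:

```c
#include <assert.h>

/* Map a hash value to a bucket index with a bit mask.
 * Equivalent to hash % size only when size is a power of two,
 * which this sketch assumes and checks. */
static unsigned long bucket_index(unsigned long hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0); /* power-of-two check */
    unsigned long sizemask = size - 1;
    return hash & sizemask;
}
```

For example, bucket_index(13, 8) yields 5, the same as 13 % 8.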

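When does a dict expand?

When data is inserted, dictKeyIndex is called, which in turn calls _dictExpandIfNeeded to decide whether the dict needs to grow. When the number of elements in the dict is greater than or equal to the number of buckets (and dict_can_resize is set), dictExpand is called to expand the hash table.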

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* If the hash table is empty expand it to the initial size,
     * if the table is "full" double its size. */
    if (dictIsRehashing(d)) return DICT_OK;
    if (d->ht[0].size == 0)
        return dictExpand(d, DICT_HT_INITIAL_SIZE);
    if (d->ht[0].used >= d->ht[0].size && dict_can_resize)
        return dictExpand(d, ((d->ht[0].size > d->ht[0].used) ?
            d->ht[0].size : d->ht[0].used)*2);
    return DICT_OK;
}

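dictExpand mainly initializes a new hash table. By default the capacity doubles, though not simply to twice the bucket count: the requested size is twice the larger of size and used, and dictExpand rounds it up to a power of two. The new table is assigned to ht[1] and rehashidx is set to 0, which puts the dict into the rehashing state.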
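
The rounding step can be sketched as follows. This is an illustration rather than the Redis source: next_power and SKETCH_INITIAL_SIZE are hypothetical names, and the initial size of 4 is an assumption of the sketch.

```c
#include <limits.h>

#define SKETCH_INITIAL_SIZE 4UL /* assumed smallest table size */

/* Round a requested size up to the next power of two,
 * never going below SKETCH_INITIAL_SIZE. */
static unsigned long next_power(unsigned long size) {
    const unsigned long max_power = (ULONG_MAX >> 1) + 1; /* largest power of two */
    unsigned long i = SKETCH_INITIAL_SIZE;

    if (size >= max_power) return max_power; /* avoid overflow while doubling */
    while (i < size) i *= 2;
    return i;
}
```

With this helper, a table with size 4 and used 5 would request 5*2 = 10 buckets and end up with 16.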

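How the expansion proceeds

The rehash itself is done mainly in dictRehash. First, let's look at when a rehash step runs.

active rehashing: in serverCron, when no background child process is running, incrementallyRehash is called, which eventually calls dictRehashMilliseconds. incrementallyRehash runs for a comparatively long time and rehashes a comparatively large number of entries. Each invocation performs about 1 millisecond of rehash work; if the rehash has not finished, it continues in the next loop.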

/* Rehash for an amount of time between ms milliseconds and ms+1 milliseconds */
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    while(dictRehash(d,100)) {
        rehashes += 100;
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}
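
The same time-budget pattern can be tried outside Redis. The sketch below is an assumption-laden illustration: run_for_ms, step_fn, now_ms and countdown_step are hypothetical names, and a simple counter stands in for the real rehash work. It relies on the POSIX gettimeofday call.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in milliseconds. */
static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return ((long long)tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

/* A step does up to n units of work and returns nonzero while work remains. */
typedef int (*step_fn)(void *ctx, int n);

/* Run batches of 100 units until the work is done or the budget is spent.
 * As in the listing above, the final batch (the one that reports "done")
 * is not added to the returned count. */
static int run_for_ms(step_fn step, void *ctx, int ms) {
    long long start = now_ms();
    int done = 0;

    while (step(ctx, 100)) {
        done += 100;
        if (now_ms() - start > ms) break;
    }
    return done;
}

/* Hypothetical workload: count a number down to zero. */
static int countdown_step(void *ctx, int n) {
    int *remaining = ctx;
    *remaining = (*remaining > n) ? *remaining - n : 0;
    return *remaining > 0;
}
```

Checking the clock only once per batch of 100 keeps the timing overhead small, at the cost of overshooting the budget by up to one batch.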

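lazy rehashing: _dictRehashStep also calls dictRehash, but each call to _dictRehashStep moves only one bucket (slot) from ht[0] to ht[1]. Because _dictRehashStep is called from dictGetRandomKey, dictFind, dictGenericDelete and dictAdd, it runs on every add, delete, lookup, or update of the dict, which speeds up the rehash.

Now let's look at the function that does the rehash. dictRehash incrementally rehashes n buckets per call. Because the size of ht[1] was already set when the table was resized, the main work is to walk ht[0], take each key, and re-hash it against the bucket count of ht[1]. When everything has been moved, the old table of ht[0] is freed, ht[0] takes over ht[1], and ht[1] is reset to empty. rehashidx is essential here: it records the position in ht[0] where the previous rehash step stopped, that is, the next bucket to migrate.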
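
The procedure described above can be sketched with a stripped-down dict. This is a simplified illustration, not the Redis implementation: entry, table, mini_dict and mini_rehash are hypothetical names, keys are plain integers used directly as hash values, and both tables are assumed to have power-of-two sizes.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry { unsigned long key; struct entry *next; } entry;
typedef struct table { entry **buckets; unsigned long size, used; } table;
typedef struct mini_dict { table ht[2]; long rehashidx; } mini_dict;

/* Migrate up to n non-empty buckets from ht[0] to ht[1].
 * Returns 1 while more work remains, 0 once rehashing has finished. */
static int mini_rehash(mini_dict *d, int n) {
    if (d->rehashidx == -1) return 0; /* no rehash in progress */
    assert(d->ht[1].buckets != NULL);

    while (n-- > 0) {
        if (d->ht[0].used == 0) {
            /* Everything moved: free the old table and promote ht[1]. */
            free(d->ht[0].buckets);
            d->ht[0] = d->ht[1];
            d->ht[1].buckets = NULL;
            d->ht[1].size = 0;
            d->ht[1].used = 0;
            d->rehashidx = -1;
            return 0;
        }
        /* Skip empty buckets; used > 0 guarantees a non-empty one exists. */
        while (d->ht[0].buckets[d->rehashidx] == NULL) d->rehashidx++;

        entry *e = d->ht[0].buckets[d->rehashidx];
        while (e != NULL) {
            entry *next = e->next;
            /* Recompute the index against the size of the new table. */
            unsigned long idx = e->key & (d->ht[1].size - 1);
            e->next = d->ht[1].buckets[idx];
            d->ht[1].buckets[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].buckets[d->rehashidx] = NULL;
        d->rehashidx++;
    }
    return 1;
}
```

Calling mini_rehash(d, 1) from every operation corresponds to the lazy mode, while calling it in larger batches inside a time budget corresponds to the active mode.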

As you can see, Redis rehashes a dict in batches, so requests are not blocked; the design is quite elegant.

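However, a dictFind call may have to search both hash tables. The only shortcut is that when the key is not in ht[0] and no rehash is in progress, it can return NULL right away. During a rehash, if the key is not found in ht[0], ht[1] must be searched as well.

In dictAdd, if the dict is rehashing the entry is inserted into ht[1]; otherwise it goes into ht[0].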
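
The lookup and insert rules during a rehash can be sketched as below. Again this is only an illustration under stated assumptions: node, bucket_table, two_table_dict, mini_find and mini_add are hypothetical names, keys are integers used directly as hash values, and table sizes are powers of two.

```c
#include <stddef.h>

typedef struct node { unsigned long key; struct node *next; } node;
typedef struct bucket_table { node **buckets; unsigned long size, used; } bucket_table;
typedef struct two_table_dict { bucket_table ht[2]; long rehashidx; } two_table_dict;

/* Look in ht[0] first; consult ht[1] only while a rehash is in progress. */
static node *mini_find(const two_table_dict *d, unsigned long key) {
    if (d->ht[0].size == 0) return NULL; /* empty dict */
    for (int t = 0; t <= 1; t++) {
        unsigned long idx = key & (d->ht[t].size - 1);
        for (node *e = d->ht[t].buckets[idx]; e != NULL; e = e->next) {
            if (e->key == key) return e;
        }
        if (d->rehashidx == -1) return NULL; /* not rehashing: skip ht[1] */
    }
    return NULL;
}

/* New entries go to ht[1] during a rehash so that ht[0] only shrinks. */
static void mini_add(two_table_dict *d, node *n) {
    bucket_table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    unsigned long idx = n->key & (t->size - 1);
    n->next = t->buckets[idx];
    t->buckets[idx] = n;
    t->used++;
}
```

Because inserts never land in ht[0] while rehashing, the number of entries left to migrate can only go down, so the incremental rehash is guaranteed to finish.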
