Redis 源码分析(四)dict

dict

dict的数据定义如下:

 1/* This is our hash table structure. Every dictionary has two of this as we
 2 * implement incremental rehashing, for the old to the new table. */
 3typedef struct dictht {
 4    dictEntry **table;
 5    unsigned long size;
 6    unsigned long sizemask;
 7    unsigned long used;
 8} dictht;
 9
10typedef struct dict {
11    dictType *type;
12    void *privdata;
13    dictht ht[2];
14    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
15    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
16} dict;

dict 定义了一个hash table对象,包括如下的字段:

字段 类型 说明
type dictType* 这里定义了一组对Key-Value的操作函数
privdata void* Key-Value操作函数需要用到的自定义数据
ht dictht [2] 哈希数组,这里定义了两个,用于dict扩容时的rehash
rehashidx long 值为-1时表示不在rehashing,否则记录下一次rehash要操作的hash table下标
pauserehash int16_t > 0 表示rehashing暂停,< 0 表示rehashing出错

数据保存在ht字段中,dictht类型解释如下:

字段 类型 说明
table dictEntry** 哈希表数组,通过dictType中的hashFunction哈希的结果与sizemask进行与操作,获得的既是该数组的下标
size unsigned long 数组的大小,值一定是2的n次方
sizemask unsigned long 快速计算哈希值对应的数组下标的掩码,值为size-1
used unsigned long 节点计数,记录这个hash table里面一共有多少个节点。

先上图,来窥一下dict的全貌 dict

先来看dict的创建:

 1/* Create a new hash table */
 2dict *dictCreate(dictType *type,
 3        void *privDataPtr)
 4{
 5    dict *d = zmalloc(sizeof(*d));
 6
 7    _dictInit(d,type,privDataPtr);
 8    return d;
 9}
10
11/* Initialize the hash table */
12int _dictInit(dict *d, dictType *type,
13        void *privDataPtr)
14{
15    _dictReset(&d->ht[0]);
16    _dictReset(&d->ht[1]);
17    d->type = type;
18    d->privdata = privDataPtr;
19    d->rehashidx = -1;
20    d->pauserehash = 0;
21    return DICT_OK;
22}

dictCreate为dict的数据结构分配空间并为各个变量赋初值。其中两个哈希表ht[0]和ht[1]起始都没有分配空间,table指针都赋为NULL。这意味着要等第一个数据插入时才会真正分配空间。

在看dict是如何操作的,我们先看看dict的查找:

 1dictEntry *dictFind(dict *d, const void *key)
 2{
 3    dictEntry *he;
 4    uint64_t h, idx, table;
 5
 6    if (dictSize(d) == 0) return NULL; /* dict is empty */
 7    if (dictIsRehashing(d)) _dictRehashStep(d);
 8    h = dictHashKey(d, key);
 9    for (table = 0; table <= 1; table++) {
10        idx = h & d->ht[table].sizemask;
11        he = d->ht[table].table[idx];
12        while(he) {
13            if (key==he->key || dictCompareKeys(d, key, he->key))
14                return he;
15            he = he->next;
16        }
17        if (!dictIsRehashing(d)) return NULL;
18    }
19    return NULL;
20}

上述dictFind的源码,根据dict当前是否正在重哈希,依次做了这么几件事:

  • 如果当前正在进行重哈希,那么将重哈希过程向前推进一步(即调用_dictRehashStep)。实际上,除了查找,插入和删除也都会触发这一动作。这就将重哈希过程分散到各个查找、插入和删除操作中去了,而不是集中在某一个操作中一次性做完。
  • 计算key的哈希值(调用dictHashKey,里面的实现会调用前面提到的hashFunction)。
  • 先在第一个哈希表ht[0]上进行查找。在table数组上定位到哈希值对应的位置(如前所述,通过哈希值与sizemask进行按位与),然后在对应的dictEntry链表上进行查找。查找的时候需要对key进行比较,这时候调用dictCompareKeys,它里面的实现会调用到前面提到的keyCompare。如果找到就返回该项。否则,进行下一步。
  • 判断当前是否在重哈希,如果没有,那么在ht[0]上的查找结果就是最终结果(没找到,返回NULL)。否则,在ht[1]上进行查找(过程与上一步相同)。

rehash是个什么过程呢?还是看代码

 1/* This function performs just a step of rehashing, and only if hashing has
 2 * not been paused for our hash table. When we have iterators in the
 3 * middle of a rehashing we can't mess with the two hash tables otherwise
 4 * some element can be missed or duplicated.
 5 *
 6 * This function is called by common lookup or update operations in the
 7 * dictionary so that the hash table automatically migrates from H1 to H2
 8 * while it is actively used. */
 9static void _dictRehashStep(dict *d) {
10    if (d->pauserehash == 0) dictRehash(d,1);
11}
12
13/* Performs N steps of incremental rehashing. Returns 1 if there are still
14 * keys to move from the old to the new hash table, otherwise 0 is returned.
15 *
16 * Note that a rehashing step consists in moving a bucket (that may have more
17 * than one key as we use chaining) from the old to the new hash table, however
18 * since part of the hash table may be composed of empty spaces, it is not
19 * guaranteed that this function will rehash even a single bucket, since it
20 * will visit at max N*10 empty buckets in total, otherwise the amount of
21 * work it does would be unbound and the function may block for a long time. */
22int dictRehash(dict *d, int n) {
23    int empty_visits = n*10; /* Max number of empty buckets to visit. */
24    if (!dictIsRehashing(d)) return 0;
25
26    while(n-- && d->ht[0].used != 0) {
27        dictEntry *de, *nextde;
28
29        /* Note that rehashidx can't overflow as we are sure there are more
30         * elements because ht[0].used != 0 */
31        assert(d->ht[0].size > (unsigned long)d->rehashidx);
32        while(d->ht[0].table[d->rehashidx] == NULL) {
33            d->rehashidx++;
34            if (--empty_visits == 0) return 1;
35        }
36        de = d->ht[0].table[d->rehashidx];
37        /* Move all the keys in this bucket from the old to the new hash HT */
38        while(de) {
39            uint64_t h;
40
41            nextde = de->next;
42            /* Get the index in the new hash table */
43            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
44            de->next = d->ht[1].table[h];
45            d->ht[1].table[h] = de;
46            d->ht[0].used--;
47            d->ht[1].used++;
48            de = nextde;
49        }
50        d->ht[0].table[d->rehashidx] = NULL;
51        d->rehashidx++;
52    }
53
54    /* Check if we already rehashed the whole table... */
55    if (d->ht[0].used == 0) {
56        zfree(d->ht[0].table);
57        d->ht[0] = d->ht[1];
58        _dictReset(&d->ht[1]);
59        d->rehashidx = -1;
60        return 0;
61    }
62
63    /* More to rehash... */
64    return 1;
65}
  • dict先去判断步数已经走完,之后再判断是否表中所有的项都已经rehash过了。不满足则进行一次rehash。
  • rehash的过程很简单,主要分为两步:
    1. 根据rehashindex找到要准备rehash的节点
    2. 遍历节点中的所有项,并将其插入到ht[1]中,之后再从ht[0]中删除掉。
  • 如果rehash做完了(used==0)则ht[0] = ht[1] 并 重置 ht[1]

那么dict如何新增一条数据呢?我们可以看下dictAdd和dictReplace的代码

 1/* Add an element to the target hash table */
 2int dictAdd(dict *d, void *key, void *val)
 3{
 4    dictEntry *entry = dictAddRaw(d,key,NULL);
 5
 6    if (!entry) return DICT_ERR;
 7    dictSetVal(d, entry, val);
 8    return DICT_OK;
 9}
10
11/* Low level add or find:
12 * This function adds the entry but instead of setting a value returns the
13 * dictEntry structure to the user, that will make sure to fill the value
14 * field as they wish.
15 *
16 * This function is also directly exposed to the user API to be called
17 * mainly in order to store non-pointers inside the hash value, example:
18 *
19 * entry = dictAddRaw(dict,mykey,NULL);
20 * if (entry != NULL) dictSetSignedIntegerVal(entry,1000);
21 *
22 * Return values:
23 *
24 * If key already exists NULL is returned, and "*existing" is populated
25 * with the existing entry if existing is not NULL.
26 *
27 * If key was added, the hash entry is returned to be manipulated by the caller.
28 */
29dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
30{
31    long index;
32    dictEntry *entry;
33    dictht *ht;
34
35    if (dictIsRehashing(d)) _dictRehashStep(d);
36
37    /* Get the index of the new element, or -1 if
38     * the element already exists. */
39    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
40        return NULL;
41
42    /* Allocate the memory and store the new entry.
43     * Insert the element in top, with the assumption that in a database
44     * system it is more likely that recently added entries are accessed
45     * more frequently. */
46    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
47    entry = zmalloc(sizeof(*entry));
48    entry->next = ht->table[index];
49    ht->table[index] = entry;
50    ht->used++;
51
52    /* Set the hash entry fields. */
53    dictSetKey(d, entry, key);
54    return entry;
55}
56
57/* Add or Overwrite:
58 * Add an element, discarding the old value if the key already exists.
59 * Return 1 if the key was added from scratch, 0 if there was already an
60 * element with such key and dictReplace() just performed a value update
61 * operation. */
62int dictReplace(dict *d, void *key, void *val)
63{
64    dictEntry *entry, *existing, auxentry;
65
66    /* Try to add the element. If the key
67     * does not exists dictAdd will succeed. */
68    entry = dictAddRaw(d,key,&existing);
69    if (entry) {
70        dictSetVal(d, entry, val);
71        return 1;
72    }
73
74    /* Set the new value and free the old one. Note that it is important
75     * to do that in this order, as the value may just be exactly the same
76     * as the previous one. In this context, think to reference counting,
77     * you want to increment (set), and then decrement (free), and not the
78     * reverse. */
79    auxentry = *existing;
80    dictSetVal(d, existing, val);
81    dictFreeVal(d, &auxentry);
82    return 0;
83}

其实过程也很简单:

  • 首先使用dictAddRaw添加一个新节点,若节点已存在则返回existing。

    • 先进行一步rehash
    • 通过调用_dictKeyIndex来获取插入key的index,在没有rehash的情况下会触发表扩容,而扩容总是成倍增加hashtable的长度。
    • 由这行代码可知,如果正在rehashing则会把所有的数据都添加到ht[1]中,否则就插入到ht[0]
      1ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];  
      
  • 使用dictSetVal赋值,这是个宏,扩展后如下:

    1#define dictSetVal(d, entry, _val_) do { \
    2    if ((d)->type->valDup) \
    3        (entry)->v.val = (d)->type->valDup((d)->privdata, _val_); \
    4    else \
    5        (entry)->v.val = (_val_); \
    6} while(0)
    
  • 新增的dictEntry总是被插入到链表的头部,我想这应该是单向链表最快的插入方式。有文章说是因为新插入数据更容易被访问,我并不认同。

  • dictReplace还会使用dictFreeVal对原值进行释放。

    1auxentry = *existing;
    2dictSetVal(d, existing, val);
    3dictFreeVal(d, &auxentry);
    

另外我们还需要关注一下dict扩充hash table的条件。来看下dict的扩充函数

 1/* Expand the hash table if needed */
 2static int _dictExpandIfNeeded(dict *d)
 3{
 4    /* Incremental rehashing already in progress. Return. */
 5    if (dictIsRehashing(d)) return DICT_OK;
 6
 7    /* If the hash table is empty expand it to the initial size. */
 8    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
 9
10    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
11     * table (global setting) or we should avoid it but the ratio between
12     * elements/buckets is over the "safe" threshold, we resize doubling
13     * the number of buckets. */
14    if (d->ht[0].used >= d->ht[0].size &&
15        (dict_can_resize ||
16         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
17        dictTypeExpandAllowed(d))
18    {
19        return dictExpand(d, d->ht[0].used + 1);
20    }
21    return DICT_OK;
22}

可以看到的是,只要used > size则会触发扩容,也就是说redis不会让一个ht的索引下挂载过多的节点。因为列表遍历还是比hash来的要慢一点。

dict的数据结构和算法相对来说都是比较简单的,但是其在redis中却占据了重要的位置,很多地方都可以看到对dict的使用。

评论