Large Redis Sets: how much data a Redis Set can hold

Reposted

mob64ca140f67e3 2023-12-25 13:26:58


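Contents

1. Storage structure
2. Source code analysis

2.1 How data is stored
2.2 Storage structure: intset

2.2.1 intset structure definition
2.2.2 Key intset functions

2.3 Storage structure: dict

2.3.1 dict structure definition
2.3.2 Key dict functions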

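1. Storage structure

From the introduction to the Redis Set object we know that Redis stores a Set in one of two forms, described below:

OBJ_ENCODING_INTSET: used when every element in the set is an integer. Once the number of elements exceeds 512 (configurable through server.set_max_intset_entries), the set is converted to OBJ_ENCODING_HT.
OBJ_ENCODING_HT: backed by a dict. Each element is stored as a dict key whose value is always NULL, much like HashSet in Java.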

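2. Source code analysis

2.1 How data is stored

Redis handles the Set commands in t_set.c; one entry point is t_set.c#saddCommand(). Its source shows that it does the following:

It first calls lookupKeyWrite() to check whether a Redis object is stored under the target key. If not, it calls setTypeCreate() to create a new set object and dbAdd() to store the new set in the database.
If the key already holds an object, that object must be of type OBJ_SET; otherwise a wrong-type error is returned. In both cases setTypeAdd() is then called once per member to add it to the set.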

void saddCommand(client *c) {
 robj *set;
 int j, added = 0;

 set = lookupKeyWrite(c->db,c->argv[1]);
 if (set == NULL) {
     set = setTypeCreate(c->argv[2]->ptr);
     dbAdd(c->db,c->argv[1],set);
 } else {
     if (set->type != OBJ_SET) {
         addReply(c,shared.wrongtypeerr);
         return;
     }
 }

 for (j = 2; j < c->argc; j++) {
     if (setTypeAdd(set,c->argv[j]->ptr)) added++;
 }
 if (added) {
     signalModifiedKey(c,c->db,c->argv[1]);
     notifyKeyspaceEvent(NOTIFY_SET,"sadd",c->argv[1],c->db->id);
 }
 server.dirty += added;
 addReplyLongLong(c,added);
}

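t_set.c#setTypeCreate() is simple: it picks the encoding of the new set object from the value about to be added. If the value can be parsed as an integer, it calls createIntsetObject() to create a set backed by an intset; otherwise it creates a set backed by a hash table.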

robj *setTypeCreate(sds value) {
 if (isSdsRepresentableAsLongLong(value,NULL) == C_OK)
     return createIntsetObject();
 return createSetObject();
}
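
The encoding choice above depends on whether a string parses as a 64-bit integer. The following is a minimal sketch of such a check, assuming strtoll as a stand-in for the real parser. The helper name looks_like_long_long is hypothetical, and Redis's own isSdsRepresentableAsLongLong() is likely stricter (for example about leading zeros), so treat this as an approximation only.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Redis: returns 1 if s is entirely a
 * decimal integer that fits in a long long, 0 otherwise. It rejects empty
 * strings, leading whitespace and a leading '+', which strtoll alone
 * would accept. */
int looks_like_long_long(const char *s, long long *out) {
    if (s == NULL || *s == '\0') return 0;
    if (!(*s == '-' || (*s >= '0' && *s <= '9'))) return 0;

    char *end;
    errno = 0;
    long long v = strtoll(s, &end, 10);
    if (errno == ERANGE) return 0;              /* out of 64-bit range */
    if (end == s || *end != '\0') return 0;     /* no digits, or trailing junk */

    if (out) *out = v;
    return 1;
}
```

With a check like this, "42" and "-7" would start an intset-encoded set, while "abc" or "12a" would start a hash-table-encoded one.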

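The two functions that create the differently encoded set objects are shown below, and both are very short. This is only a brief look at how a set object is created; the underlying structures are analyzed in detail later.

object.c#createSetObject() first calls dictCreate() to create a dict, then uses it to create the set object and sets the object's encoding to OBJ_ENCODING_HT.
object.c#createIntsetObject() calls intsetNew() to create an intset, uses it to create the set object, and finally sets the object's encoding to OBJ_ENCODING_INTSET.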

robj *createSetObject(void) {
 dict *d = dictCreate(&setDictType,NULL);
 robj *o = createObject(OBJ_SET,d);
 o->encoding = OBJ_ENCODING_HT;
 return o;
}

robj *createIntsetObject(void) {
 intset *is = intsetNew();
 robj *o = createObject(OBJ_SET,is);
 o->encoding = OBJ_ENCODING_INTSET;
 return o;
}

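With the set object created, we return to t_set.c#setTypeAdd(), the function that adds data to a set. It is a little longer, but the flow is clear:

It first checks the encoding of the target set. If it is OBJ_ENCODING_HT, the underlying structure is a hash table, so it calls dictAddRaw() to insert value as the key with NULL as the value. This can trigger a rehash to grow the table, which is analyzed in detail later.
If the encoding is OBJ_ENCODING_INTSET, the underlying structure is an intset. As when creating the set, it checks whether value can be parsed as an integer. If so, it calls intsetAdd() to add it to the intset and then checks the element count: if it exceeds server.set_max_intset_entries (the set-max-intset-entries option, default 512), it calls setTypeConvert() to convert the set to a hash table. If value is not an integer, the set is converted to a hash table first and the value is then added with dictAdd().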

int setTypeAdd(robj *subject, sds value) {
 long long llval;
 if (subject->encoding == OBJ_ENCODING_HT) {
     dict *ht = subject->ptr;
     dictEntry *de = dictAddRaw(ht,value,NULL);
     if (de) {
         dictSetKey(ht,de,sdsdup(value));
         dictSetVal(ht,de,NULL);
         return 1;
     }
 } else if (subject->encoding == OBJ_ENCODING_INTSET) {
     if (isSdsRepresentableAsLongLong(value,&llval) == C_OK) {
         uint8_t success = 0;
         subject->ptr = intsetAdd(subject->ptr,llval,&success);
         if (success) {
             /* Convert to regular set when the intset contains
              * too many entries. */
             if (intsetLen(subject->ptr) > server.set_max_intset_entries)
                 setTypeConvert(subject,OBJ_ENCODING_HT);
             return 1;
         }
     } else {
         /* Failed to get integer from object, convert to regular set. */
         setTypeConvert(subject,OBJ_ENCODING_HT);

         /* The set *was* an intset and this value is not integer
          * encodable, so dictAdd should always work. */
         serverAssert(dictAdd(subject->ptr,sdsdup(value),NULL) == DICT_OK);
         return 1;
     }
 } else {
     serverPanic("Unknown set encoding");
 }
 return 0;
}
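
The branching in setTypeAdd() amounts to a small encoding-transition rule. The sketch below restates that rule as a pure function; the enum and function names are hypothetical and exist only for illustration, and the function models the decision, not the actual data movement done by setTypeConvert().

```c
#include <stddef.h>

/* Hypothetical names standing in for OBJ_ENCODING_INTSET / OBJ_ENCODING_HT. */
enum set_encoding { ENC_INTSET, ENC_HT };

/* Returns the encoding a set would have after one successful add, following
 * the branches of setTypeAdd(): a hash-table set stays a hash table, and an
 * intset is converted when the value is not an integer or the length after
 * the add exceeds the configured maximum. */
enum set_encoding encoding_after_add(enum set_encoding cur,
                                     int value_is_integer,
                                     size_t len_after_add,
                                     size_t max_intset_entries) {
    if (cur == ENC_HT) return ENC_HT;
    if (!value_is_integer) return ENC_HT;
    if (len_after_add > max_intset_entries) return ENC_HT;
    return ENC_INTSET;
}
```

Note that the comparison is strict: with the default limit of 512, an intset holding exactly 512 integers keeps its encoding, and the 513th integer triggers the conversion.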

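2.2 Storage structure: intset

2.2.1 intset structure definition

The intset structure is defined in intset.h; its key fields are listed below. Internally an intset is an array kept in sorted order, and lookups use binary search.

encoding: the encoding type; depending on the integer width it is one of INTSET_ENC_INT16, INTSET_ENC_INT32 or INTSET_ENC_INT64
length: the number of elements in the set
contents: the array that actually holds the elements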

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;
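
To make the three encodings concrete, here is a small sketch of how an element width could be chosen for a value and how many bytes an intset of a given length would occupy. It assumes, as the multiplication in intsetResize() suggests, that the encoding value is the element width in bytes; the function names value_width and intset_bytes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Width in bytes needed to store v: 2, 4 or 8. */
size_t value_width(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return sizeof(int64_t);
    if (v < INT16_MIN || v > INT16_MAX) return sizeof(int32_t);
    return sizeof(int16_t);
}

/* Total bytes for a header of two uint32_t fields (encoding, length)
 * followed by `length` elements of `width` bytes each. */
size_t intset_bytes(size_t width, uint32_t length) {
    return 2 * sizeof(uint32_t) + width * (size_t)length;
}
```

Under these assumptions, three small integers take 8 + 3 * 2 = 14 bytes, while adding a single value outside the 32-bit range would force every element to 8 bytes.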

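2.2.2 Key intset functions

intset.c#intsetAdd() is the function that best shows what is distinctive about intset. Its main logic is:

Determine which encoding the new element needs and compare it (valenc) with the set's current encoding, is->encoding. If valenc > is->encoding, the set cannot hold the new element at its current width, so intsetUpgradeAndAdd() is called to upgrade the encoding; otherwise no upgrade is needed.
Call intsetSearch() to check whether the element already exists; if it does not, call intsetResize() to grow the set by one slot. If the insert position is less than the intset length, intsetMoveTail() shifts the elements from that position onward back by one slot to make room, and finally _intsetSet() writes the new element at that position.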

intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {
 uint8_t valenc = _intsetValueEncoding(value);
 uint32_t pos;
 if (success) *success = 1;

 /* Upgrade encoding if necessary. If we need to upgrade, we know that
  * this value should be either appended (if > 0) or prepended (if < 0),
  * because it lies outside the range of existing values. */
 if (valenc > intrev32ifbe(is->encoding)) {
     /* This always succeeds, so we don't need to curry *success. */
     return intsetUpgradeAndAdd(is,value);
 } else {
     /* Abort if the value is already present in the set.
      * This call will populate "pos" with the right position to insert
      * the value when it cannot be found. */
     if (intsetSearch(is,value,&pos)) {
         if (success) *success = 0;
         return is;
     }

     is = intsetResize(is,intrev32ifbe(is->length)+1);
     if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1);
 }

 _intsetSet(is,pos,value);
 is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
 return is;
}

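intset.c#intsetUpgradeAndAdd() is short. Its flow is:

It first sets the intset's encoding field to the new value, then calls intsetResize() to compute the space the whole intset needs under the new encoding and reallocates the memory.
Finally it rewrites the existing values in order at the new width, working from back to front so that no value is overwritten before it has been read, and places the new value at the beginning or the end.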

static intset *intsetResize(intset *is, uint32_t len) {
 uint32_t size = len*intrev32ifbe(is->encoding);
 is = zrealloc(is,sizeof(intset)+size);
 return is;
}

static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {
 uint8_t curenc = intrev32ifbe(is->encoding);
 uint8_t newenc = _intsetValueEncoding(value);
 int length = intrev32ifbe(is->length);
 int prepend = value < 0 ? 1 : 0;

 /* First set new encoding and resize */
 is->encoding = intrev32ifbe(newenc);
 is = intsetResize(is,intrev32ifbe(is->length)+1);

 /* Upgrade back-to-front so we don't overwrite values.
  * Note that the "prepend" variable is used to make sure we have an empty
  * space at either the beginning or the end of the intset. */
 while(length--)
     _intsetSet(is,length+prepend,_intsetGetEncoded(is,length,curenc));

 /* Set the value at the beginning or the end. */
 if (prepend)
     _intsetSet(is,0,value);
 else
     _intsetSet(is,intrev32ifbe(is->length),value);
 is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
 return is;
}
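
The back-to-front rewrite is the subtle part of the upgrade. The sketch below isolates it for one case, widening 16-bit values to 32-bit values inside the same buffer. The function name widen_16_to_32 is hypothetical, it uses plain realloc instead of Redis's allocator, and it ignores the byte-order handling (intrev32ifbe) that the real code performs.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: widens n int16_t values stored at the start of buf
 * into int32_t values in the same buffer, reserving one extra slot. When
 * prepend is 1 the free slot is index 0, otherwise it is index n.
 * Returns the (possibly moved) buffer, or NULL if realloc fails. */
void *widen_16_to_32(void *buf, size_t n, int prepend) {
    void *p = realloc(buf, (n + 1) * sizeof(int32_t));
    if (p == NULL) return NULL;

    unsigned char *bytes = p;
    size_t i = n;
    /* Walk from the last element to the first: the wide slot written for
     * element i starts at byte 4*i or later, past every narrow element
     * that has not been read yet. */
    while (i--) {
        int16_t narrow;
        memcpy(&narrow, bytes + i * sizeof(int16_t), sizeof narrow);
        int32_t wide = narrow;   /* sign-extends */
        memcpy(bytes + (i + (size_t)prepend) * sizeof(int32_t), &wide, sizeof wide);
    }
    return p;
}
```

Iterating front to back instead would clobber element 1 while writing the widened element 0, which is why the Redis comment insists on the reverse order.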

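intset.c#intsetSearch() looks up value with binary search and writes its index through the pos pointer; when the value is absent, pos receives the position where it should be inserted. This shows that an intset always stores its data in sorted order.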

static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) {
 int min = 0, max = intrev32ifbe(is->length)-1, mid = -1;
 int64_t cur = -1;

 /* The value can never be found when the set is empty */
 if (intrev32ifbe(is->length) == 0) {
     if (pos) *pos = 0;
     return 0;
 } else {
     /* Check for the case where we know we cannot find the value,
      * but do know the insert position. */
     if (value > _intsetGet(is,max)) {
         if (pos) *pos = intrev32ifbe(is->length);
         return 0;
     } else if (value < _intsetGet(is,0)) {
         if (pos) *pos = 0;
         return 0;
     }
 }

 while(max >= min) {
     mid = ((unsigned int)min + (unsigned int)max) >> 1;
     cur = _intsetGet(is,mid);
     if (value > cur) {
         min = mid+1;
     } else if (value < cur) {
         max = mid-1;
     } else {
         break;
     }
 }

 if (value == cur) {
     if (pos) *pos = mid;
     return 1;
 } else {
     if (pos) *pos = min;
     return 0;
 }
}
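
Search and insert together are what keep the array sorted and free of duplicates. As a minimal sketch of the same idea on a plain int64_t array (fixed width, no encoding upgrade, caller-provided capacity), with hypothetical function names sorted_search and sorted_insert:

```c
#include <stdint.h>
#include <string.h>

/* Binary search in a sorted array. Returns 1 if value is found; *pos gets
 * the index of the match, or the index where value should be inserted. */
int sorted_search(const int64_t *a, size_t n, int64_t value, size_t *pos) {
    size_t lo = 0, hi = n;                 /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < value) lo = mid + 1;
        else if (a[mid] > value) hi = mid;
        else { *pos = mid; return 1; }
    }
    *pos = lo;
    return 0;
}

/* Inserts value into a sorted array that has room for n + 1 elements.
 * Returns the new length, which is unchanged if value was already present. */
size_t sorted_insert(int64_t *a, size_t n, int64_t value) {
    size_t pos;
    if (sorted_search(a, n, value, &pos)) return n;
    memmove(a + pos + 1, a + pos, (n - pos) * sizeof *a);  /* shift the tail */
    a[pos] = value;
    return n + 1;
}
```

The lookup is O(log n), but each insert may move up to n elements, which is one reason an array-backed set like this only makes sense while the set is small.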

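2.3 Storage structure: dict

2.3.1 dict structure definition

The dict structure definition is covered in another post in this series, "Redis 6.0 Source Code Reading Notes (3): Overview of Redis's Key Data Structures and Its 6 Data Types", so it is not repeated here; interested readers can refer to that post.

2.3.2 Key dict functions

The dict implementation closely resembles Java's HashMap: the capacity is always a power of two, and the bucket index is computed as hashcode & (size - 1). One notable difference is how the hash table grows: Redis uses incremental rehashing, which is analyzed below through its key functions.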
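
The index formula works because, for a power-of-two size, masking with size - 1 keeps exactly the low bits that a modulo would produce. A minimal sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Returns 1 when size is a nonzero power of two. */
int is_power_of_two(uint64_t size) {
    return size != 0 && (size & (size - 1)) == 0;
}

/* Bucket index for a table whose size is a power of two; equivalent to
 * hash % size in that case, but computed with a single AND. */
uint64_t bucket_index(uint64_t hash, uint64_t size) {
    return hash & (size - 1);
}
```

For example, a hash of 13 in a table of 8 buckets lands in bucket 5, the same result as 13 % 8. The equivalence does not hold for sizes that are not powers of two.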

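Incremental rehashing: a dict holds two hash tables. Data is initially stored in ht[0], which starts with a size of 4. Once used in ht[0] is greater than or equal to size, that is, the table is full, a new hash table ht[1] of roughly twice the size is created (dictExpand() is called with used*2, which is rounded up to a power of two). The data in ht[0] is not copied into ht[1] all at once; instead it is migrated gradually during subsequent operations (find, set, get, and so on), and newly added elements go into ht[1].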
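
To illustrate the idea, here is a toy integer set that migrates one bucket per add. It is a sketch under simplifying assumptions, not Redis's implementation: every name is hypothetical, the key doubles as its own hash, the table always doubles, and allocation failures are not handled.

```c
#include <stdlib.h>

/* Toy integer set with two chained tables; all names are hypothetical and
 * allocation failures are not handled, to keep the sketch short. */
typedef struct node { unsigned long key; struct node *next; } node;
typedef struct { node **table; unsigned long size, used; } table;
typedef struct { table ht[2]; long rehashidx; /* -1: not rehashing */ } toyset;

void toy_init(toyset *s) {
    s->ht[0].table = calloc(4, sizeof(node *));
    s->ht[0].size = 4;
    s->ht[0].used = 0;
    s->ht[1].table = NULL;
    s->ht[1].size = s->ht[1].used = 0;
    s->rehashidx = -1;
}

/* Migrates one non-empty bucket from ht[0] to ht[1]; when ht[0] is
 * drained, ht[1] becomes the new ht[0] and rehashing ends. */
void toy_rehash_step(toyset *s) {
    table *from = &s->ht[0], *to = &s->ht[1];
    if (from->used > 0) {
        while (from->table[s->rehashidx] == NULL) s->rehashidx++;
        node *n = from->table[s->rehashidx];
        while (n) {
            node *next = n->next;
            unsigned long idx = n->key & (to->size - 1);
            n->next = to->table[idx];
            to->table[idx] = n;
            from->used--;
            to->used++;
            n = next;
        }
        from->table[s->rehashidx++] = NULL;
    }
    if (from->used == 0) {
        free(from->table);
        *from = *to;
        to->table = NULL;
        to->size = to->used = 0;
        s->rehashidx = -1;
    }
}

/* Looks in ht[0], and also in ht[1] while a rehash is in progress. */
int toy_contains(const toyset *s, unsigned long key) {
    for (int t = 0; t <= 1; t++) {
        const table *h = &s->ht[t];
        for (node *n = h->table[key & (h->size - 1)]; n; n = n->next)
            if (n->key == key) return 1;
        if (s->rehashidx < 0) break;   /* only ht[0] is live */
    }
    return 0;
}

/* Returns 1 if key was inserted, 0 if it was already present. */
int toy_add(toyset *s, unsigned long key) {
    if (s->rehashidx >= 0) toy_rehash_step(s);
    if (s->rehashidx < 0 && s->ht[0].used >= s->ht[0].size) {
        s->ht[1].size = s->ht[0].size * 2;          /* start a new rehash */
        s->ht[1].table = calloc(s->ht[1].size, sizeof(node *));
        s->ht[1].used = 0;
        s->rehashidx = 0;
    }
    if (toy_contains(s, key)) return 0;
    table *dst = s->rehashidx >= 0 ? &s->ht[1] : &s->ht[0];
    node *n = malloc(sizeof *n);
    unsigned long idx = key & (dst->size - 1);
    n->key = key;
    n->next = dst->table[idx];
    dst->table[idx] = n;
    dst->used++;
    return 1;
}
```

The point of the design is that no single add pays for moving the whole table: the cost of growing is spread over the operations that follow, at the price of having to consult two tables while the migration is in flight.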

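dict.c#dictAdd() is the entry point for adding an element to a dict. Internally it mainly delegates to dictAddRaw().

dictAddRaw() is short and clear:
It first calls dictIsRehashing() to check whether the dict is in the middle of a rehash; the check is based on the dict->rehashidx field. If a rehash is in progress, it calls _dictRehashStep() to migrate the entries in one bucket of the old table's array to the new hash table.
It calls _dictKeyIndex() to check whether the target key already exists in the hash table; if it does, it returns NULL.
If a rehash is in progress, the element is added to the new hash table created for the rehash; otherwise it is added to ht[0].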

int dictAdd(dict *d, void *key, void *val)
{
 dictEntry *entry = dictAddRaw(d,key,NULL);

 if (!entry) return DICT_ERR;
 dictSetVal(d, entry, val);
 return DICT_OK;
}

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
 long index;
 dictEntry *entry;
 dictht *ht;

 if (dictIsRehashing(d)) _dictRehashStep(d);

 /* Get the index of the new element, or -1 if
  * the element already exists. */
 if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
     return NULL;

 /* Allocate the memory and store the new entry.
  * Insert the element in top, with the assumption that in a database
  * system it is more likely that recently added entries are accessed
  * more frequently. */
 ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
 entry = zmalloc(sizeof(*entry));
 entry->next = ht->table[index];
 ht->table[index] = entry;
 ht->used++;

 /* Set the hash entry fields. */
 dictSetKey(d, entry, key);
 return entry;
}

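dict.c#_dictKeyIndex() computes the bucket index for an element. Its steps are:

It calls _dictExpandIfNeeded() to check whether the table needs to grow.
Because a rehash may be in progress, the lookup iterates over the dict's ht array and searches both hash tables.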

static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
{
 unsigned long idx, table;
 dictEntry *he;
 if (existing) *existing = NULL;

 /* Expand the hash table if needed */
 if (_dictExpandIfNeeded(d) == DICT_ERR)
     return -1;
 for (table = 0; table <= 1; table++) {
     idx = hash & d->ht[table].sizemask;
     /* Search if this slot does not already contain the given key */
     he = d->ht[table].table[idx];
     while(he) {
         if (key==he->key || dictCompareKeys(d, key, he->key)) {
             if (existing) *existing = he;
             return -1;
         }
         he = he->next;
     }
     if (!dictIsRehashing(d)) break;
 }
 return idx;
}

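dict.c#_dictExpandIfNeeded() is the key function that decides whether a rehash is needed. It works as follows:

If the dict's first hash table has a size of 0, it calls dictExpand() directly to initialize the table with a capacity of 4.
Expansion depends on three factors. dictExpand() is called with used*2 when the first condition below holds together with either part of the second:
The number of elements stored in the first hash table is greater than or equal to the size of its bucket array.
dict_can_resize is true, or the load factor of the first hash table is greater than dict_force_resize_ratio.
dictExpand() creates a new dictht hash table and assigns it to dict->ht[1] (or to dict->ht[0] on first initialization).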

#define DICT_HT_INITIAL_SIZE     4
static int dict_can_resize = 1;
static unsigned int dict_force_resize_ratio = 5;

static int _dictExpandIfNeeded(dict *d)
{
 /* Incremental rehashing already in progress. Return. */
 if (dictIsRehashing(d)) return DICT_OK;

 /* If the hash table is empty expand it to the initial size. */
 if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

 /* If we reached the 1:1 ratio, and we are allowed to resize the hash
  * table (global setting) or we should avoid it but the ratio between
  * elements/buckets is over the "safe" threshold, we resize doubling
  * the number of buckets. */
 if (d->ht[0].used >= d->ht[0].size &&
     (dict_can_resize ||
      d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
 {
     return dictExpand(d, d->ht[0].used*2);
 }
 return DICT_OK;
}

int dictExpand(dict *d, unsigned long size)
{
 /* the size is invalid if it is smaller than the number of
  * elements already inside the hash table */
 if (dictIsRehashing(d) || d->ht[0].used > size)
     return DICT_ERR;

 dictht n; /* the new hash table */
 unsigned long realsize = _dictNextPower(size);

 /* Rehashing to the same table size is not useful. */
 if (realsize == d->ht[0].size) return DICT_ERR;

 /* Allocate the new hash table and initialize all pointers to NULL */
 n.size = realsize;
 n.sizemask = realsize-1;
 n.table = zcalloc(realsize*sizeof(dictEntry*));
 n.used = 0;

 /* Is this the first initialization? If so it's not really a rehashing
  * we just set the first hash table so that it can accept keys. */
 if (d->ht[0].table == NULL) {
     d->ht[0] = n;
     return DICT_OK;
 }

 /* Prepare a second hash table for incremental rehashing */
 d->ht[1] = n;
 d->rehashidx = 0;
 return DICT_OK;
}
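
The expansion test and the size rounding can be restated as two small pure functions. This is a sketch with hypothetical names (should_expand, next_power); the rounding follows what _dictNextPower() is understood to do in Redis 6.0, starting from the initial size of 4, and should be read as an approximation of it.

```c
#include <limits.h>

/* Smallest power of two >= size, never below the initial size of 4. */
unsigned long next_power(unsigned long size) {
    unsigned long i = 4;
    if (size >= LONG_MAX) return LONG_MAX + 1LU;   /* avoid overflow */
    while (i < size) i *= 2;
    return i;
}

/* Returns 1 when the table should grow: it is empty, or it is full and
 * either resizing is allowed or the load factor exceeds force_ratio.
 * used / size is integer division, as in the original condition. */
int should_expand(unsigned long used, unsigned long size,
                  int can_resize, unsigned long force_ratio) {
    if (size == 0) return 1;
    return used >= size && (can_resize || used / size > force_ratio);
}
```

With resizing disabled and a force ratio of 5, a table of 4 buckets would not grow at 20 entries (20 / 4 = 5, which is not greater than 5) but would at 24.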
