
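The previous post, on the skiplist underlying Redis data structures, covered the Redis skip list. This post looks at another important Redis data structure: the ziplist (compressed list).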


Source: 3.0/src/ziplist.c (this time the description is not in a .h file but in the comments of the .c file).


First, what a ziplist is for:

/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters. 

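In short, a ziplist is a specially encoded doubly linked list designed to save memory. It can store both string values and integer values, and integers are stored as actual binary integers rather than as a sequence of characters. (With reference to the post "ziplist 结构详解".)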



/* It allows push and pop operations on either side of the list
 * in O(1) time.

It supports push and pop operations at either end of the list in O(1) time.



 * The general layout of the ziplist is as follows:
 * <zlbytes><zltail><zllen><entry><entry><zlend>
 * <zlbytes> is an unsigned integer to hold the number of bytes that the ziplist occupies. 
 * This value needs to be stored to be able to resize the entire structure without the need to traverse it first.
 * <zltail> is the offset to the last entry in the list. 
 * This allows a pop operation on the far side of the list without the need for full traversal.
 * <zllen> is the number of entries.When this value is larger than 2**16-2, 
 * we need to traverse the entire list to know how many items it holds.
 * <zlend> is a single byte special value, equal to 255, which indicates the end of the list.

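A ziplist is laid out as <zlbytes><zltail><zllen><entry>...<entry><zlend>. First, the four fields other than <entry>:

  1. zlbytes: 32 bits. The number of bytes the ziplist currently occupies (it changes as the list changes). Storing it allows the whole structure to be reallocated without first traversing it to compute its size.
  2. zltail: 32 bits. The offset of the last entry from the start address of the ziplist, so the address of the tail entry can be found without traversal.
  3. zllen: 16 bits. The number of entries in the ziplist. When the count is larger than 2^16 - 2, the whole list has to be traversed to find out how many entries it holds.
  4. zlend: 8 bits. Always equal to 255; marks the end of the ziplist.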

The field sizes above can be seen at line 141 of 3.0/src/ziplist.c:

/* Utility macros */
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
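
The macros above simply index into a byte buffer, so the header can also be read by hand. The following is a minimal sketch, not Redis code: `zl_header`, `load_le32`, `store_le32`, `zl_read_header`, and `zl_make_empty` are hypothetical names introduced for illustration. It assumes the header fields are stored little-endian (which is what the `intrev32ifbe` calls suggest) and that an empty ziplist is just the 10-byte header followed by `zlend`, with `zltail` pointing at offset 10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of Redis. */
typedef struct zl_header {
    uint32_t zlbytes; /* total bytes occupied by the ziplist */
    uint32_t zltail;  /* offset of the last entry from the start */
    uint16_t zllen;   /* number of entries */
} zl_header;

/* Read 4 bytes as a little-endian unsigned integer (assumed byte order). */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write v as 4 little-endian bytes. */
void store_le32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Decode the 10-byte header: 4 + 4 + 2 bytes, as in ZIPLIST_HEADER_SIZE. */
zl_header zl_read_header(const unsigned char *zl) {
    assert(zl != NULL);
    zl_header h;
    h.zlbytes = load_le32(zl);
    h.zltail = load_le32(zl + 4);
    h.zllen = (uint16_t)(zl[8] | (zl[9] << 8));
    return h;
}

/* Build the assumed 11-byte image of an empty ziplist: header + zlend. */
void zl_make_empty(unsigned char out[11]) {
    assert(out != NULL);
    store_le32(out, 11);     /* zlbytes: header (10) + zlend (1) */
    store_le32(out + 4, 10); /* zltail: no entries, points past the header */
    out[8] = 0;              /* zllen low byte */
    out[9] = 0;              /* zllen high byte */
    out[10] = 255;           /* zlend */
}
```

Decoding the bytes one at a time keeps the sketch independent of the host byte order, which is the job `intrev32ifbe` does in the real macros.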

Now for the entry in detail:

 * Every entry in the ziplist is prefixed by a header that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the encoding with an
 * optional string length of the entry itself is stored.
 * The length of the previous entry is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte that takes the length as value. When the length is greater than or
 * equal to 254, it will consume 5 bytes. The first byte is set to 254 to
 * indicate a larger value is following. The remaining 4 bytes take the
 * length of the previous entry as value.
 * The other header field of the entry itself depends on the contents of the
 * entry. When the entry is a string, the first 2 bits of this header will hold
 * the type of encoding used to store the length of the string, followed by the
 * actual length of the string. When the entry is an integer the first 2 bits
 * are both set to 1. The following 2 bits are used to specify what kind of
 * integer will be stored after this header. An overview of the different
 * types and encodings is as follows:
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 * |11000000| - 1 byte
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 1 byte
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 1 byte
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 1 byte
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 1 byte
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist.

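Every entry starts with a header that carries two pieces of information:

1. The length of the previous entry, which makes it possible to traverse the list from back to front.

  • If the previous entry occupies fewer than 254 bytes, the length is stored in 1 byte whose value is that byte count.
  • If the previous entry occupies 254 bytes or more, the length takes 5 bytes. The first byte is set to 254 to flag this case, and the following 4 bytes together hold the size of the previous entry.

The marker is 254 rather than 255 because 255 is already used to mark the end of the ziplist.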
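
The two cases can be sketched as a pair of encode and decode helpers. This is an illustration under stated assumptions rather than the Redis implementation: `prevlen_encode`, `prevlen_decode`, and `ZL_BIG_PREVLEN` are hypothetical names, and the 4-byte length is assumed to be stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZL_BIG_PREVLEN 254 /* hypothetical name: marker for the 5-byte form */

/* Write the previous entry's length at p; return bytes used (1 or 5). */
size_t prevlen_encode(unsigned char *p, uint32_t len) {
    assert(p != NULL);
    if (len < ZL_BIG_PREVLEN) {
        p[0] = (unsigned char)len; /* the byte value is the length itself */
        return 1;
    }
    p[0] = ZL_BIG_PREVLEN;         /* flag: a 4-byte length follows */
    p[1] = (unsigned char)(len & 0xff);
    p[2] = (unsigned char)((len >> 8) & 0xff);
    p[3] = (unsigned char)((len >> 16) & 0xff);
    p[4] = (unsigned char)((len >> 24) & 0xff);
    return 5;
}

/* Read the previous entry's length from p; return bytes consumed (1 or 5). */
size_t prevlen_decode(const unsigned char *p, uint32_t *len) {
    assert(p != NULL && len != NULL);
    if (p[0] < ZL_BIG_PREVLEN) {
        *len = p[0];
        return 1;
    }
    *len = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
    return 5;
}
```

A length of 253 still fits in one byte, while 254 already needs the 5-byte form; that 4-byte jump is what the cascade update discussed at the end of this post is about.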


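2. The encoding of the entry itself, which depends on what the entry stores.

  • If the entry is a string, the first 2 bits give the type of encoding used for the string length, followed by the actual length of the string:

1) |00pppppp| - 1 byte: string length less than or equal to 63 bytes (2^6 - 1)
2) |01pppppp|qqqqqqqq| - 2 bytes: string length less than or equal to 16383 bytes (2^14 - 1)
3) |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes: string length greater than or equal to 16384 bytes (2^14)

  • If the entry is an integer, the first 2 bits are both set to 1, and the following 2 bits identify the type of integer stored after the header:

1) |11000000| - 1 byte: int16_t integer (2 bytes)
2) |11010000| - 1 byte: int32_t integer (4 bytes)
3) |11100000| - 1 byte: int64_t integer (8 bytes)
4) |11110000| - 1 byte: 24-bit signed integer (3 bytes)
5) |11111110| - 1 byte: 8-bit signed integer (1 byte)
6) |1111xxxx| - (xxxx from 0001 to 1101) immediate 4-bit integer: the 13 encoded values 1 to 13 hold the data itself (the value, not a length); subtract 1 from the encoded value to get the real value, 0 to 12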
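
The table above can be turned into a small decoder for the encoding field. This is a sketch only: `enc_info` and `enc_decode` are hypothetical names, not Redis functions, and the multi-byte string lengths are assumed to be stored most significant byte first, following the order of the bit patterns in the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for illustration; not part of Redis. */
typedef struct enc_info {
    int is_string;    /* 1 for a string entry, 0 for an integer entry */
    uint32_t lensize; /* bytes used by the encoding field itself */
    uint32_t len;     /* bytes used by the payload that follows */
    int imm;          /* value of a 4-bit immediate integer, else -1 */
} enc_info;

/* Decode the encoding field at p.
 * Returns 0 on success, -1 for the end marker 0xFF or an unknown byte. */
int enc_decode(const unsigned char *p, enc_info *out) {
    assert(p != NULL && out != NULL);
    out->imm = -1;
    switch (p[0] & 0xC0) {  /* the first 2 bits select the string form */
    case 0x00:              /* |00pppppp| */
        out->is_string = 1; out->lensize = 1;
        out->len = p[0] & 0x3F;
        return 0;
    case 0x40:              /* |01pppppp|qqqqqqqq| */
        out->is_string = 1; out->lensize = 2;
        out->len = ((uint32_t)(p[0] & 0x3F) << 8) | p[1];
        return 0;
    case 0x80:              /* |10______| + 4 length bytes */
        out->is_string = 1; out->lensize = 5;
        out->len = ((uint32_t)p[1] << 24) | ((uint32_t)p[2] << 16) |
                   ((uint32_t)p[3] << 8) | (uint32_t)p[4];
        return 0;
    default:
        break;              /* first 2 bits are 11: integer entry */
    }
    out->is_string = 0; out->lensize = 1;
    switch (p[0]) {
    case 0xC0: out->len = 2; return 0; /* int16_t */
    case 0xD0: out->len = 4; return 0; /* int32_t */
    case 0xE0: out->len = 8; return 0; /* int64_t */
    case 0xF0: out->len = 3; return 0; /* 24-bit signed */
    case 0xFE: out->len = 1; return 0; /* 8-bit signed */
    default: break;
    }
    if (p[0] >= 0xF1 && p[0] <= 0xFD) { /* |1111xxxx|, xxxx in 0001..1101 */
        out->len = 0;                   /* the value lives in the header */
        out->imm = (p[0] & 0x0F) - 1;   /* encoded 1..13 maps to 0..12 */
        return 0;
    }
    return -1;
}
```

For example, the single byte 0xF3 decodes to the integer 2 with no payload at all, which is why small integers are so cheap to store in a ziplist.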

Reference: Redis内部数据结构详解 (two figures).

3. Now the struct:

typedef struct zlentry {
    // Bytes used to encode the previous entry's length, and that length
    unsigned int prevrawlensize, prevrawlen;
    // Bytes used to encode the current entry's length, and that length
    unsigned int lensize, len;
    // Size of the header: prevrawlensize + lensize
    unsigned int headersize;
    // Encoding of the current entry
    unsigned char encoding;
    // Pointer to the start of the entry, i.e. its prev-entry-len field
    unsigned char *p;
    unsigned char *p;
} zlentry;

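I won't draw a detailed diagram here: there are a few points I still can't get straight, so for now I'm following the explanation in the post Redis内部数据结构详解. (I searched for a long time, and every source explains an entry as three parts. I know almost no C, so this will have to do for now.)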



I also skimmed 6.0/src/ziplist.c. It does not differ much from 3.0, and while reading 3.0 I relied heavily on the comments in 6.0.


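One drawback of ziplist has to be mentioned: cascade updates.

A cascade update happens when inserting an element forces the entry that now follows it to enlarge its prevlensize (from 1 byte to 5, once the new element occupies 254 bytes or more). After that entry has grown, the entry after it may in turn need a larger prevlensize, and so on down the list.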
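
The chain reaction can be illustrated with a toy model that tracks only entry sizes. This is a simplified sketch, not how Redis implements it: `cascade_count` is a hypothetical function, it models an insert at the head of the list only, and it assumes the old head stored a previous-entry length of 0 in a 1-byte field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: sizes[i] is the total byte size of entry i before the insert.
 * A new entry of new_size bytes is inserted in front of entry 0.
 * Entries whose prevlen field must grow from 1 to 5 bytes get 4 bytes
 * added in place. Returns how many entries had to grow. */
size_t cascade_count(uint32_t *sizes, size_t n, uint32_t new_size) {
    assert(sizes != NULL || n == 0);
    uint32_t old_prev = 0;        /* assumed: the old head stored prevlen 0 */
    uint32_t new_prev = new_size; /* its predecessor is now the new entry */
    size_t grown = 0;
    for (size_t i = 0; i < n; i++) {
        /* The field grows only if the predecessor crossed the 254 limit. */
        if (!(old_prev < 254 && new_prev >= 254)) break;
        old_prev = sizes[i];      /* size the next entry used to record */
        sizes[i] += 4;            /* prevlen field: 1 byte -> 5 bytes */
        new_prev = sizes[i];      /* size the next entry must record now */
        grown++;
    }
    return grown;
}
```

In this model the entries at risk are those of 250 to 253 bytes: adding 4 bytes pushes them to 254 or more, which forces the next entry to grow as well. An entry outside that range still grows once but stops the chain.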