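Because Redis serves commands on a single thread, KEYS * blocks normal business requests for as long as it runs. Do not use it for lookups in production: on a large keyspace it can stall the server and cause an incident.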

SCAN command
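As a rough sketch of why SCAN is the safer alternative: it walks the keyspace in small steps, each call returning a cursor to pass to the next call, and the iteration is complete when the returned cursor is 0. The helper below assumes a standalone redis-py-style client whose `scan(cursor, match, count)` returns a `(next_cursor, keys)` pair. The name `scan_keys` is illustrative and not part of the original script.

```python
def scan_keys(client, pattern, count=2000):
    """Yield keys matching `pattern`, one SCAN call at a time."""
    cursor = 0
    while True:
        # Each call inspects only a slice of the keyspace, so the server
        # is never blocked for the whole traversal as it is with KEYS.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        for key in keys:
            yield key
        # A returned cursor of 0 means the full iteration has finished.
        if cursor == 0:
            break
```

`count` is only a hint for how much work each call does, and SCAN may return the same key more than once, so callers that need uniqueness should de-duplicate.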

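        Redis is built on a request/response protocol: the client sends a command and waits for the reply, and Redis processes each command it receives and then responds. The time spent sending a command plus receiving its result is the round-trip time (RTT). If a client sends a large number of commands, each one has to wait for the previous reply before it is sent, so every command pays a full RTT plus the system I/O calls and network request that go with it. Deleting 100,000 keys this way took 68.7 seconds, which is far too slow for millions or hundreds of millions of keys.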
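To put the measurement above in perspective, the arithmetic can be sketched as follows. The only inputs are the two figures from the text (100,000 keys, 68.7 seconds). The projection to one million keys assumes the per-key cost stays constant, which is an assumption for illustration and not something that was measured.

```python
# Figures quoted in the text
keys_deleted = 100000
elapsed_seconds = 68.7

# Average cost of one sequential DELETE over the whole run
per_key_ms = elapsed_seconds / keys_deleted * 1000

# Assumption: cost scales linearly with the number of keys
million_keys_minutes = per_key_ms * 1000000 / 1000 / 60

print("per key: %.3f ms" % per_key_ms)
print("1,000,000 keys: about %.2f minutes" % million_keys_minutes)
```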

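        Pipelining greatly reduces this overhead. A pipeline packs a group of Redis commands together, sends them to Redis in one transfer, and returns the results to the client in the order the commands were issued, which makes bulk deletion much faster.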
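A minimal sketch of the batching idea, assuming a redis-py-style client that exposes `scan_iter()` and `pipeline()`, and a pipeline object that exposes `delete()` and `execute()`. The function name `delete_matching` and the `batch_size` parameter are illustrative and not from the original script.

```python
def delete_matching(client, pattern, batch_size=2000):
    """Queue a DEL for every key matching `pattern`, flushing in batches.

    Returns the number of DEL commands issued.
    """
    pipe = client.pipeline()
    queued = 0
    for key in client.scan_iter(match=pattern, count=batch_size):
        pipe.delete(key)  # buffered on the client, not sent yet
        queued += 1
        if queued % batch_size == 0:
            pipe.execute()  # send the buffered batch and read all replies
    pipe.execute()  # flush whatever is left from the last partial batch
    return queued
```

With redis-py this might be called as `delete_matching(redis.Redis(host='127.0.0.1', port=6379), 'SN:*')`, where the host and port are placeholders. Flushing every `batch_size` keys keeps the client-side buffer bounded while still replacing thousands of round trips with one per batch.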

Code

# encoding: utf-8
"""
author: yangyi@youzan.com
time: 2018/3/9 8:35 PM
func:
"""
import random
import string
import time

import redis
import rediscluster

# Install the third-party cluster library first:
# pip install redis-py-cluster

# Standalone Redis
# pool = redis.ConnectionPool(host='127.0.0.1', port=6379, db=0, password='xxx')
# r = redis.Redis(connection_pool=pool)

# Cluster master nodes (ports written as integers for consistency)
startup_nodes = [
    {"host": "127.0.0.1", "port": 6379},
    {"host": "127.0.0.1", "port": 6380},
    {"host": "127.0.0.1", "port": 6420},
]
redis_store = rediscluster.RedisCluster(
    startup_nodes=startup_nodes,
    password="xxxxx",
    decode_responses=True,
    max_connections=200,
)

# Generate a random string
# def random_str():
#     return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(7))

# Write random keys
# def init_keys():
#     start_time = time.time()
#     for i in range(0, 20):
#         key_name = 'dba_' + str(i)
#         value_name = random_str()
#         r.set(key_name, value_name)
#     print('keys initialized, elapsed: %s' % (time.time() - start_time))

# Delete keys without a pipeline
# def del_keys_without_pipe():
#     start_time = time.time()
#     result_length = 0
#     for key in r.scan_iter(match='dba_*', count=2000):
#         r.delete(key)
#         result_length += 1
#     print('non-pipeline delete elapsed: %s' % (time.time() - start_time))
#     print('non-pipeline deleted key count: %s' % result_length)

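# Delete keys with a pipeline
def del_keys_with_pipe():
    start_time = time.time()
    result_length = 0
    pipe = redis_store.pipeline()
    for key in redis_store.scan_iter(match='SN:*', count=2000):
        # Queue the DELETE on the client; nothing is sent until execute()
        pipe.delete(key)
        result_length += 1
        # Flush every 2000 queued commands to keep the client buffer bounded
        if result_length % 2000 == 0:
            pipe.execute()
    scan_end_time = time.time()
    print("pipeline scan and batched delete time: %s" % (scan_end_time - start_time))
    # Flush the commands left over from the last partial batch
    pipe.execute()
    print("pipeline final flush time: %s" % (time.time() - scan_end_time))
    print("pipeline deleted key count: %s" % result_length)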


def main():
    del_keys_with_pipe()


if __name__ == '__main__':
    main()