Preface

In day-to-day use of Redis, a few problems come up again and again:

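  1. High availability: how to keep Redis continuously available.
  2. Capacity: the memory of a single Redis instance cannot grow without limit, and performance tends to drop once an instance grows beyond roughly 32 GB.
  3. Concurrency: a single Redis instance is often quoted at about 100,000 operations per second, but that too has a ceiling.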

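The concept of hash slots

A Redis cluster has 16384 built-in hash slots. To place a key-value pair, Redis computes the CRC16 checksum of the key and takes the result modulo 16384, so every key maps to a hash slot numbered 0 to 16383. Each node in the cluster is responsible for a subset of the slots, and Redis spreads the slots roughly evenly across the nodes according to how many nodes there are. Redis Cluster does not use consistent hashing; it uses hash slots instead.

The benefit of hash slots is that nodes are easy to add and remove. To add a node, move some hash slots from the existing nodes to the new one; to remove a node, first move its hash slots to the remaining nodes. Slots can be moved while the cluster keeps running, so adding, removing, or changing a node does not require stopping the Redis services and does not make the cluster unavailable.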
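
The key-to-slot mapping described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0) named in the Redis Cluster specification. The function names crc16_xmodem and key_slot are made up for this example, and hash tags (the {...} syntax) are deliberately left out.

```python
HASH_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, XMODEM variant (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a hash slot: CRC16(key) mod 16384 (hash tags ignored)."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


if __name__ == "__main__":
    print(key_slot("a"))
```

For the key a this gives slot 15495, the same slot that appears in the MOVED reply in the client section of this article.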

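Advantages of redis-cluster

  1. Officially recommended: it is the clustering solution shipped with Redis itself.
  2. Decentralized: a cluster can grow to about 1000 nodes, and throughput scales close to linearly as nodes are added.
  3. Easy to manage: nodes can be added or removed later, slots can be moved between nodes, and so on.
  4. Simple and easy to get started with.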

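I. Redis cluster configuration parameters

cluster-enabled <yes/no>: If yes, the instance starts in cluster mode; otherwise it starts as a standalone instance.
cluster-config-file <filename>: Optional. This is not a user-editable configuration file. It is the file in which a cluster node automatically persists the cluster configuration every time it changes, so that the node can read it back at startup.
cluster-node-timeout <milliseconds>: The maximum time a cluster node may be unreachable. If a master cannot be reached for longer than this, it is considered to have failed. Note that any node that cannot reach the majority of masters for this long stops accepting queries.
cluster-slave-validity-factor <factor>: If set to 0, a slave always tries to fail over its master, no matter how long the link to the master has been down. If set to a positive number, the maximum allowed disconnection time is the node timeout multiplied by this factor; a slave whose link to its master has been down for longer does not attempt a failover.
cluster-migration-barrier <count>: The minimum number of slaves a master must keep before one of its slaves is allowed to migrate to another master that has no slave left. It governs slave migration, not which slave is eligible to be promoted.
cluster-require-full-coverage <yes/no>: If yes (the default), the cluster stops accepting writes when part of the key space is not covered by any node. If no, the cluster keeps serving queries even when only a subset of the keys can be processed.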

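II. Creating the Redis nodes

(1) Create a directory and configuration file for each node

Under the redis directory, create six node directories named 8000 through 8005; each node gets its own redis.conf.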

cd redis
mkdir 8000 8001 8002 8003 8004 8005

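Copy redis.conf from the redis directory into redis/8000/, then do the same for the other ports:

cd redis
cp redis.conf 8000/

(2) The redis.conf configuration file

Port 8000 is used as the example; for the other nodes, only the port number (and the file names that contain it) changes.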

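# Disable protected mode (acceptable only for a local test setup)
protected-mode no
port 8000
# Run this instance as a cluster node
cluster-enabled yes
# Maintained by Redis itself; the name must be unique per node
cluster-config-file nodes-8000.conf
# Milliseconds a node may be unreachable before it is considered failed
cluster-node-timeout 5000
daemonize yes
pidfile /var/run/redis_8000.pid
logfile "8000.log"
# Working directory; it must exist before the server starts
dir /redis/data
bind 127.0.0.1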
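
Because the six configuration files differ only in the port number, they can be generated rather than edited by hand. The following is a small sketch, not part of the original procedure: the function name write_node_configs and the idea of passing the base directory as a parameter are assumptions made for illustration, and the template simply repeats the settings listed above.

```python
import tempfile
from pathlib import Path

# Same settings as the port-8000 example, with the port as a placeholder.
CONFIG_TEMPLATE = """\
protected-mode no
port {port}
cluster-enabled yes
cluster-config-file nodes-{port}.conf
cluster-node-timeout 5000
daemonize yes
pidfile /var/run/redis_{port}.pid
logfile "{port}.log"
dir /redis/data
bind 127.0.0.1
"""


def write_node_configs(base_dir, ports):
    """Create <base_dir>/<port>/redis.conf for every port and return the paths."""
    written = []
    for port in ports:
        node_dir = Path(base_dir) / str(port)
        node_dir.mkdir(parents=True, exist_ok=True)
        conf_path = node_dir / "redis.conf"
        conf_path.write_text(CONFIG_TEMPLATE.format(port=port))
        written.append(conf_path)
    return written


if __name__ == "__main__":
    # Demo in a throwaway directory; point base_dir at the redis directory for real use.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_node_configs(tmp, range(8000, 8006)):
            print(path)
```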

(3) Start each instance in turn

cd redis
cd 8000
redis-server redis.conf

Repeat for the other ports, then check that the processes are running:

[root@localhost 8005]# ps -ef|grep redis
root       3140      1  0 18:11 ?        00:00:00 redis-server 127.0.0.1:8000 [cluster]
root       3153      1  0 18:12 ?        00:00:00 redis-server 127.0.0.1:8001 [cluster]
root       3158      1  0 18:12 ?        00:00:00 redis-server 127.0.0.1:8002 [cluster]
root       3163      1  0 18:12 ?        00:00:00 redis-server 127.0.0.1:8003 [cluster]
root       3168      1  0 18:12 ?        00:00:00 redis-server 127.0.0.1:8004 [cluster]
root       3173      1  0 18:12 ?        00:00:00 redis-server 127.0.0.1:8005 [cluster]
root       3178   2043  0 18:12 pts/0    00:00:00 grep --color=auto redis

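III. Building the cluster

(1) Install Ruby

The easiest way is the redis-trib tool, which lives in the src directory. It is a Ruby program, so Ruby has to be installed first.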

yum install ruby
yum install rubygems
gem install redis

The last command may fail as follows:

[root@localhost /]# gem install redis
Fetching: redis-4.0.3.gem (100%)
ERROR:  Error installing redis:
	redis requires Ruby version >= 2.2.2.

(2) Upgrade Ruby

(1) Install curl

sudo yum install curl

(2) Import the GPG keys

gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB

(3) Download and install RVM

curl -sSL https://get.rvm.io | bash -s stable

(4) Locate the RVM profile script

find / -name rvm.sh

(5) Load the script into the current shell

source /etc/profile.d/rvm.sh

(6) Install RVM's dependencies

rvm requirements

(7) List the Ruby versions known to RVM

rvm list known

(8) Install a specific Ruby version

rvm install ruby-2.4.4

(9) Make that version the default

rvm use 2.4.4 --default

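(10) Install the redis gem

gem install redis

3. Other related operations

(1) Uninstall rubygems
If the installation failed and rubygems needs to be removed, run:

yum remove rubygems -y

(2) Uninstall the Ruby redis gem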

gem uninstall redis --version 3.3.3

(3) Find the Redis paths

find / -name "redis"

(4) Show the gem source (mirror) address

$ gem sources -l
*** CURRENT SOURCES ***

https://rubygems.org/

(5) Remove that source and add a new mirror
Remove the original gem source:

gem sources --remove https://rubygems.org/

Add a mirror hosted in China:

gem sources -a https://gems.ruby-china.com

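(3) Create the cluster

As of Redis 5.0 the cluster management tool redis-trib.rb is deprecated, so Ruby no longer needs to be installed; the Ruby setup above applies only to Redis versions before 5.0. The functionality of redis-trib.rb is now built into redis-cli, which can also operate on clusters that require authentication. Run ./redis-cli --cluster help to see the usage.
Help output for the redis-cli cluster commands: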

[root@localhost src]# redis-cli --cluster help
Cluster Manager Commands:
  create         host1:port1 ... hostN:portN
                 --cluster-replicas <arg>
  check          host:port
  info           host:port
  fix            host:port
  reshard        host:port
                 --cluster-from <arg>
                 --cluster-to <arg>
                 --cluster-slots <arg>
                 --cluster-yes
                 --cluster-timeout <arg>
                 --cluster-pipeline <arg>
  rebalance      host:port
                 --cluster-weight <node1=w1...nodeN=wN>
                 --cluster-use-empty-masters
                 --cluster-timeout <arg>
                 --cluster-simulate
                 --cluster-pipeline <arg>
                 --cluster-threshold <arg>
  add-node       new_host:new_port existing_host:existing_port
                 --cluster-slave
                 --cluster-master-id <arg>
  del-node       host:port node_id
  call           host:port command arg arg .. arg
  set-timeout    host:port milliseconds
  import         host:port
                 --cluster-from <arg>
                 --cluster-copy
                 --cluster-replace
  help

Create the cluster as follows:

redis-cli --cluster create   127.0.0.1:8000 127.0.0.1:8001 127.0.0.1:8002 127.0.0.1:8003 127.0.0.1:8004 127.0.0.1:8005  --cluster-replicas 1
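
When the node list is produced by a script, it helps to build the command programmatically and check the node count first, since Redis Cluster needs at least three masters. The helper below is a hypothetical sketch (build_create_command is not part of redis-cli); it only assembles the argument list for the command shown above and does not run it.

```python
def build_create_command(nodes, replicas=1):
    """Return the argv for 'redis-cli --cluster create' after a sanity check.

    nodes is a list of "host:port" strings; replicas is the value passed
    to --cluster-replicas (replicas per master).
    """
    masters = len(nodes) // (replicas + 1)
    if masters < 3:
        raise ValueError(
            f"{len(nodes)} nodes with {replicas} replica(s) per master "
            f"gives only {masters} master(s); at least 3 are required"
        )
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas)]


if __name__ == "__main__":
    nodes = [f"127.0.0.1:{port}" for port in range(8000, 8006)]
    print(" ".join(build_create_command(nodes, replicas=1)))
```

The resulting list can be passed to subprocess.run on a machine where redis-cli is installed.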

If the following error appears:

[ERR] Node 127.0.0.1:8000 is not empty. Either the node already knows other nodes (check with CLUSTE

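Delete the generated cluster configuration file (nodes.conf, or nodes-8000.conf with the configuration above). If that does not help, the node still holds data from an old cluster: delete the Redis persistence files, such as appendonly.aof and dump.rdb, and restart Redis.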
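
The cleanup just described can be scripted. The sketch below is only an illustration under stated assumptions: the helper names are invented, the data directory is passed in by the caller (/redis/data in the configuration above), and the Redis instances are assumed to be stopped before anything is removed.

```python
import tempfile
from pathlib import Path

# File name patterns taken from the text: cluster config and persistence files.
STALE_PATTERNS = ("nodes*.conf", "appendonly.aof", "dump.rdb")


def find_stale_files(data_dir):
    """List leftover cluster-config and persistence files in data_dir."""
    found = []
    for pattern in STALE_PATTERNS:
        found.extend(sorted(Path(data_dir).glob(pattern)))
    return found


def remove_stale_files(data_dir):
    """Delete the files reported by find_stale_files and return their paths."""
    removed = find_stale_files(data_dir)
    for path in removed:
        path.unlink()
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory so nothing real is deleted.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("nodes-8000.conf", "dump.rdb", "8000.log"):
            (Path(tmp) / name).write_text("placeholder")
        print([p.name for p in remove_stale_files(tmp)])
```

Calling find_stale_files first gives a dry run of what would be deleted.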

Then run the command again:

[root@localhost 8005]# redis-cli --cluster create   127.0.0.1:8000 127.0.0.1:8001 127.0.0.1:8002 127.0.0.1:8003 127.0.0.1:8004 127.0.0.1:8005  --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 127.0.0.1:8003 to 127.0.0.1:8000
Adding replica 127.0.0.1:8004 to 127.0.0.1:8001
Adding replica 127.0.0.1:8005 to 127.0.0.1:8002
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: f70c555de0ab6863247271b03570dcb017748a1d 127.0.0.1:8000
   slots:[0-5460] (5461 slots) master
M: 1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001
   slots:[5461-10922] (5462 slots) master
M: 9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002
   slots:[10923-16383] (5461 slots) master
S: 298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003
   replicates 9f103d9ecc5506b095919154dfb80bc8cfbb414e
S: e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004
   replicates f70c555de0ab6863247271b03570dcb017748a1d
S: 4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005
   replicates 1c2dd486c7f885f86501932c144da131c68fdcad
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 127.0.0.1:8000)
M: f70c555de0ab6863247271b03570dcb017748a1d 127.0.0.1:8000
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004
   slots: (0 slots) slave
   replicates f70c555de0ab6863247271b03570dcb017748a1d
M: 1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003
   slots: (0 slots) slave
   replicates 9f103d9ecc5506b095919154dfb80bc8cfbb414e
M: 9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005
   slots: (0 slots) slave
   replicates 1c2dd486c7f885f86501932c144da131c68fdcad
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@localhost 8005]#
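
The slot ranges in the output above (0-5460, 5461-10922, 10923-16383) come from dividing 16384 slots among three masters. The sketch below reproduces those ranges; the rounding scheme is an assumption chosen to match the printed allocation and is not taken from the redis-cli source, and allocate_slots is a made-up name.

```python
def allocate_slots(master_count, total_slots=16384):
    """Split total_slots into contiguous (first, last) ranges, one per master."""
    slots_per_node = total_slots / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        last = round(cursor + slots_per_node - 1)
        # The final master always takes whatever is left.
        if index == master_count - 1 or last > total_slots - 1:
            last = total_slots - 1
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


if __name__ == "__main__":
    for number, (first, last) in enumerate(allocate_slots(3)):
        print(f"Master[{number}] -> Slots {first} - {last}")
```

Note that the middle master ends up with 5462 slots and the other two with 5461, as in the transcript.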

IV. Cluster operations

(1) View cluster information and node information

View the cluster information:

127.0.0.1:8000> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:2902
cluster_stats_messages_pong_sent:1422
cluster_stats_messages_fail_sent:4
cluster_stats_messages_sent:4328
cluster_stats_messages_ping_received:1422
cluster_stats_messages_pong_received:1446
cluster_stats_messages_fail_received:7
cluster_stats_messages_received:2875
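
For monitoring scripts, the key:value lines of cluster info are easy to turn into a dictionary. The following is a minimal sketch; the function names and the particular health criteria (state ok, all 16384 slots ok, no failed slots) are choices made for this example rather than an official definition.

```python
def parse_cluster_info(text):
    """Parse 'key:value' lines from CLUSTER INFO into a dict (ints where possible)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = int(value) if value.isdigit() else value
    return info


def is_healthy(info, total_slots=16384):
    """Treat the cluster as healthy when every slot is assigned and ok."""
    return (
        info.get("cluster_state") == "ok"
        and info.get("cluster_slots_ok") == total_slots
        and info.get("cluster_slots_fail") == 0
    )


if __name__ == "__main__":
    sample = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3"""
    info = parse_cluster_info(sample)
    print(info["cluster_known_nodes"], is_healthy(info))
```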

View the node information:

127.0.0.1:8002> cluster nodes
298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003@18003 slave 9f103d9ecc5506b095919154dfb80bc8cfbb414e 0 1544402618000 4 connected
1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001@18001 master - 0 1544402617072 2 connected 5461-10922
9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002@18002 myself,master - 0 1544402616000 3 connected 10923-16383
e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004@18004 master - 0 1544402618478 7 connected 0-5460
4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005@18005 slave 1c2dd486c7f885f86501932c144da131c68fdcad 0 1544402617473 6 connected
f70c555de0ab6863247271b03570dcb017748a1d 127.0.0.1:8000@18000 slave e57e483ae48a85d95e1374bf4ba486e7a4df256b 0 1544402617000 7 connected
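
Each line of cluster nodes has the fields: node ID, address, flags, master ID (or - for a master), ping-sent, pong-received, config epoch, link state, and then the slot ranges. A small parser makes the output easier to work with. This is a sketch with invented names (ClusterNode, parse_cluster_nodes) that handles only the fields seen in the output above and skips the bracketed markers used for slots under migration.

```python
from dataclasses import dataclass


@dataclass
class ClusterNode:
    node_id: str
    address: str      # host:port, without the @cluster-bus-port suffix
    flags: list
    master_id: object  # None for masters, otherwise the master's node ID
    link_state: str
    slots: list       # list of (first, last) tuples


def parse_cluster_nodes(text):
    """Parse the output of CLUSTER NODES into ClusterNode records."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # prompt lines, blank lines
        slots = []
        for token in parts[8:]:
            if token.startswith("["):
                continue  # migrating/importing markers are ignored here
            first, _, last = token.partition("-")
            slots.append((int(first), int(last or first)))
        nodes.append(ClusterNode(
            node_id=parts[0],
            address=parts[1].split("@")[0],
            flags=parts[2].split(","),
            master_id=None if parts[3] == "-" else parts[3],
            link_state=parts[7],
            slots=slots,
        ))
    return nodes


if __name__ == "__main__":
    sample = (
        "298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003@18003 slave "
        "9f103d9ecc5506b095919154dfb80bc8cfbb414e 0 1544402618000 4 connected\n"
        "1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001@18001 master "
        "- 0 1544402617072 2 connected 5461-10922\n"
    )
    for node in parse_cluster_nodes(sample):
        role = "master" if node.master_id is None else "slave"
        print(node.address, role, node.slots)
```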

(2) Check the cluster through a node

For example, check the cluster through the node on port 8001:

[root@localhost src]# redis-cli --cluster check 127.0.0.1:8001
127.0.0.1:8001 (1c2dd486...) -> 1 keys | 5462 slots | 1 slaves.
127.0.0.1:8002 (9f103d9e...) -> 1 keys | 5461 slots | 1 slaves.
127.0.0.1:8004 (e57e483a...) -> 0 keys | 5461 slots | 0 slaves.
[OK] 2 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 127.0.0.1:8001)
M: 1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
M: e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004
   slots:[0-5460] (5461 slots) master
S: 298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003
   slots: (0 slots) slave
   replicates 9f103d9ecc5506b095919154dfb80bc8cfbb414e
S: 4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005
   slots: (0 slots) slave
   replicates 1c2dd486c7f885f86501932c144da131c68fdcad
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

The check confirms that the cluster is healthy.

(3) Operating on the cluster from a client

redis-cli -p 8001

[root@localhost src]# redis-cli -p 8001
127.0.0.1:8001> get a
(error) MOVED 15495 127.0.0.1:8002
127.0.0.1:8001>

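The reply is a MOVED error, which means the key's slot is served by another node. Connect with the -c option (cluster mode) so that redis-cli follows the redirection: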

[root@localhost src]# redis-cli -c -p 8001
127.0.0.1:8001> get a
-> Redirected to slot [15495] located at 127.0.0.1:8002
"aaaa"
127.0.0.1:8002>

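As shown, with the -c option the client is redirected from 8001 to 8002 when it stores or fetches the key; this is called client redirection. It happens because each master in Redis Cluster is responsible for a subset of the slots. For every read or write, the key is hashed to find the slot it belongs to. If the current master owns that slot, the command is executed directly; otherwise the node replies with the address of the master that owns the slot, and the client repeats the command there. With -c, redis-cli follows the redirection automatically, so the process is transparent to the user. See the discussion of cluster partitioning later on.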
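
What a cluster-aware client does with a MOVED reply can be sketched as follows. This is an illustration only: parse_moved and get_with_redirect are invented names, and fake_send is a toy stand-in for a real network connection that mimics the two nodes from the transcript above.

```python
import re

MOVED_RE = re.compile(r"^MOVED (\d+) (\S+):(\d+)$")


def parse_moved(message):
    """Return (slot, host, port) for a MOVED error, or None for anything else."""
    match = MOVED_RE.match(message.strip())
    if match is None:
        return None
    slot, host, port = match.groups()
    return int(slot), host, int(port)


def get_with_redirect(send, address, key, max_redirects=5):
    """Issue GET via send(address, key), following MOVED replies.

    send returns ("ok", value) or ("error", message).
    Returns (address_that_answered, value).
    """
    for _ in range(max_redirects + 1):
        status, payload = send(address, key)
        if status == "ok":
            return address, payload
        moved = parse_moved(payload)
        if moved is None:
            raise RuntimeError(payload)
        _, host, port = moved
        address = f"{host}:{port}"
    raise RuntimeError("too many redirects")


def fake_send(address, key):
    # Toy cluster: only 127.0.0.1:8002 owns the slot of key "a".
    if address == "127.0.0.1:8002":
        return "ok", "aaaa"
    return "error", "MOVED 15495 127.0.0.1:8002"


if __name__ == "__main__":
    print(get_with_redirect(fake_send, "127.0.0.1:8001", "a"))
```

Real client libraries typically also cache the slot-to-node map so that later commands go straight to the right node.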

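(4) Delete a node

Delete the node on port 8000. The command takes the address of a node in the cluster followed by the ID of the node to remove: redis-cli --cluster del-node host:port node_id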

[root@localhost src]# redis-cli --cluster del-node 127.0.0.1:8000 f70c555de0ab6863247271b03570dcb017748a1d
>>> Removing node f70c555de0ab6863247271b03570dcb017748a1d from cluster 127.0.0.1:8000
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

After the deletion, list the nodes again:

127.0.0.1:8001> cluster nodes
9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002@18002 master - 0 1544406706289 3 connected 10923-16383
1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001@18001 myself,master - 0 1544406705000 2 connected 5461-10922
e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004@18004 master - 0 1544406705000 7 connected 0-5460
298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003@18003 slave 9f103d9ecc5506b095919154dfb80bc8cfbb414e 0 1544406705284 4 connected
4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005@18005 slave 1c2dd486c7f885f86501932c144da131c68fdcad 0 1544406706000 6 connected

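The node on port 8000 is no longer listed.

(5) Add a node

Add the node on port 8000 that was just deleted back into the cluster, using the master on port 8001 as the entry point (restart the instance first, because del-node shut it down):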

[root@localhost 8005]# redis-cli --cluster add-node 127.0.0.1:8000 127.0.0.1:8001
>>> Adding node 127.0.0.1:8000 to cluster 127.0.0.1:8001
>>> Performing Cluster Check (using node 127.0.0.1:8001)
M: 1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004
   slots:[0-5460] (5461 slots) master
M: 9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005
   slots: (0 slots) slave
   replicates 1c2dd486c7f885f86501932c144da131c68fdcad
S: 298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003
   slots: (0 slots) slave
   replicates 9f103d9ecc5506b095919154dfb80bc8cfbb414e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:8000 to make it join the cluster.
[OK] New node added correctly.
[root@localhost 8005]# redis-cli -c -p 8000
127.0.0.1:8000> cluster nodes
298c94aa55f1cdcf53918fd9301f617b6dca37f7 127.0.0.1:8003@18003 slave 9f103d9ecc5506b095919154dfb80bc8cfbb414e 0 1544411418529 3 connected
e57e483ae48a85d95e1374bf4ba486e7a4df256b 127.0.0.1:8004@18004 master - 0 1544411418000 7 connected 0-5460
ef3394bfe7574d617960422a9f3b7009cd2923eb 127.0.0.1:8000@18000 myself,master - 0 1544411417000 0 connected
1c2dd486c7f885f86501932c144da131c68fdcad 127.0.0.1:8001@18001 master - 0 1544411416519 2 connected 5461-10922
9f103d9ecc5506b095919154dfb80bc8cfbb414e 127.0.0.1:8002@18002 master - 0 1544411418328 3 connected 10923-16383
4c61c720fa60a3d507fe355ae27218b0ac096f8a 127.0.0.1:8005@18005 slave 1c2dd486c7f885f86501932c144da131c68fdcad 0 1544411416820 2 connected
127.0.0.1:8000>

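Listing the nodes confirms that the node was added; it joined as a master with no slots assigned.

(6) Rebalance the slots across the nodes

[root@localhost 8005]# redis-cli --cluster rebalance --cluster-threshold 1 127.0.0.1:8000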
>>> Performing Cluster Check (using node 127.0.0.1:8000)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
*** No rebalancing needed! All nodes are within the 1.00% threshold.

The output reports that no rebalancing is needed.
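
One plausible reading of this result is that the freshly added node on port 8000 holds no slots and is left out of the calculation unless --cluster-use-empty-masters (listed in the help output earlier) is passed. The sketch below illustrates that reading. The formula (each master's deviation from the average slot count, as a percentage) and the name needs_rebalance are assumptions made for illustration, not the redis-cli implementation.

```python
def needs_rebalance(slot_counts, threshold_pct=1.0, use_empty_masters=False):
    """Return True if any considered master deviates from the average slot
    count by more than threshold_pct percent.

    slot_counts maps a master's address to the number of slots it holds.
    Masters with zero slots are ignored unless use_empty_masters is True.
    """
    counts = [c for c in slot_counts.values() if use_empty_masters or c > 0]
    if not counts:
        return False
    expected = sum(counts) / len(counts)
    return any(
        abs(count - expected) / expected * 100 > threshold_pct
        for count in counts
    )


if __name__ == "__main__":
    # Slot counts after add-node: 8000 has joined as a master without slots.
    counts = {
        "127.0.0.1:8000": 0,
        "127.0.0.1:8001": 5462,
        "127.0.0.1:8002": 5461,
        "127.0.0.1:8004": 5461,
    }
    print(needs_rebalance(counts))
    print(needs_rebalance(counts, use_empty_masters=True))
```

Under these assumptions the three populated masters are well within the 1% threshold, while including the empty master would call for moving slots to it.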