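The project's servers are spread across Chongqing, Shanghai, Taipei, and Houston, so we need geographically distributed disaster recovery. At present MySQL, Redis Cluster, and Elasticsearch all run in Chongqing, so a power outage there would take down the whole application.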

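As a first step we are planning disaster recovery between Chongqing and Shanghai. A rough test showed about 13 MB/s between the Chongqing servers, which corresponds to a 100 Mbit/s LAN link, while Chongqing to Shanghai reached only about 1.2 MB/s, roughly a 10 Mbit/s link.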
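The bandwidth figures above come from multiplying the measured MB/s by 8. Here is a minimal sketch of that arithmetic, plus a rough estimate of how long a full copy over the Chongqing to Shanghai link might take. The 2048 MB dataset size is a made-up figure for illustration, not a measurement from this project.

```shell
#!/bin/sh
# Convert a measured transfer rate in MB/s to link bandwidth in Mbit/s.
mbytes_to_mbits() {
    awk -v r="$1" 'BEGIN { printf "%.1f\n", r * 8 }'
}

# Estimate the seconds needed to copy $1 megabytes at $2 MB/s.
transfer_seconds() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%.0f\n", s / r }'
}

mbytes_to_mbits 13          # rate between the Chongqing servers
mbytes_to_mbits 1.2         # rate from Chongqing to Shanghai
transfer_seconds 2048 1.2   # hypothetical 2 GB dataset over the slow link
```

The two conversions print 104.0 and 9.6, which is why the links are described as roughly 100 Mbit/s and 10 Mbit/s.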

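The first option is the simple one: MySQL master-master replication between Chongqing and Shanghai; Redis Cluster with all master nodes placed on the Chongqing servers by default and all slaves on the Shanghai server; and Elasticsearch with its primary shards in Chongqing and all replica shards in Shanghai.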

 

Below is how to expand the Redis cluster and migrate its data.

The trialrun environment has 3 servers in total: 15.99.72.164 and 15.99.72.165 are in Chongqing, and 15.15.181.147 is in Shanghai.

 

[root@sha-147 7005]# bin/redis-cli -c -h 15.15.181.147 -p 7006
15.15.181.147:7006> cluster nodes
c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164:7001@17001 slave 421123bf7fb3a4061e34cab830530d87b21148ee 0 1577089232000 7 connected
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006@17006 myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577089230000 6 connected
31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164:7002@17002 slave 763a88d5328ab0ce07a312e726d78bb2141b5813 0 1577089234988 5 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003@17003 master - 0 1577089235796 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004@17004 master - 0 1577089234000 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005@17005 master - 0 1577089232733 5 connected 10923-16383 
 
[root@cq-165 src]# /root/tools/redis-4.0.11/src/redis-trib.rb info 15.99.72.165:7003
15.99.72.165:7003 (f452a661...) -> 53254 keys | 5462 slots | 1 slaves.
15.15.181.147:7005 (763a88d5...) -> 53174 keys | 5461 slots | 1 slaves.
15.99.72.165:7004 (421123bf...) -> 53050 keys | 5461 slots | 1 slaves.
[OK] 159478 keys in 3 masters.
9.73 keys per slot on average.
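The `cluster nodes` output above shows that one master (7005) currently sits on the Shanghai server, which conflicts with the plan to keep all masters in Chongqing. Here is a small sketch for spotting such nodes. It assumes that Chongqing addresses start with `15.99.`, which holds for the three servers listed here but is otherwise an assumption. The function name `masters_outside` is hypothetical.

```shell
#!/bin/sh
# List master nodes whose address does not start with the given prefix.
# Reads `cluster nodes` output on stdin.
masters_outside() {
    awk -v prefix="$1" '
        $3 ~ /master/ && index($2, prefix) != 1 {
            sub(/@.*/, "", $2)   # drop the @cluster-bus-port suffix
            print $2
        }'
}

# Three of the lines from the output above, used as sample input.
masters_outside "15.99." <<'EOF'
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003@17003 master - 0 1577089235796 3 connected 5461-10922
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006@17006 myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577089230000 6 connected
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005@17005 master - 0 1577089232733 5 connected 10923-16383
EOF
```

Against a live cluster the same function could be fed with `bin/redis-cli -h 15.15.181.147 -p 7006 cluster nodes | masters_outside "15.99."`.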

 

 

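The original installation was three masters and three slaves. Now I need to install a new master node on port 7007 on 165, add it to the existing cluster, and then migrate all slots of the master 15.15.181.147:7005@17005 to node 7007 on 165.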

 

1. First, on 165:  mkdir -p /usr/local/redis-cluster/7007

Since other nodes were already installed on 165, simply  cd /usr/local/redis-ii/

cp -r bin /usr/local/redis-cluster/7007

Then go to the previously installed 7004 node:  cd /usr/local/redis-cluster/7004

cp redis.conf ../7007/

Then edit the relevant settings for 7007:

 

bind 15.99.72.165
protected-mode no
port 7007
daemonize yes
cluster-enabled yes
cluster-node-timeout 15000
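Copying 7004's redis.conf brings along every setting that mentions port 7004, not only `port`. The full original config is not shown, so the sketch below assumes it may also name the port in paths such as a pid file or cluster config file. The three sample lines are hypothetical. It rewrites every occurrence of the old port in one pass, which is only safe if the number 7004 does not appear for unrelated reasons.

```shell
#!/bin/sh
# Copy a node's redis.conf and replace every occurrence of the old port.
# $1 = source conf, $2 = destination conf, $3 = old port, $4 = new port
retarget_conf() {
    sed "s/$3/$4/g" "$1" > "$2"
}

# Demo on a throwaway file; the three lines below are hypothetical examples.
tmp=$(mktemp -d)
printf '%s\n' 'port 7004' 'pidfile /var/run/redis_7004.pid' \
    'cluster-config-file nodes-7004.conf' > "$tmp/7004.conf"
retarget_conf "$tmp/7004.conf" "$tmp/7007.conf" 7004 7007
cat "$tmp/7007.conf"
```

It is worth reviewing the resulting file by hand before starting the node, since two nodes sharing a cluster config file or pid file would conflict.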

 

After saving the configuration, start node 7007 (from /usr/local/redis-cluster/7007):  bin/redis-server ./redis.conf

 

Then add node 165:7007 to the existing cluster:

 

[root@cq-165 tools]# /root/tools/redis-4.0.11/src/redis-trib.rb add-node 15.99.72.165:7007 15.99.72.165:7003
>>> Adding node 15.99.72.165:7007 to cluster 15.99.72.165:7003
>>> Performing Cluster Check (using node 15.99.72.165:7003)
M: f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: 421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006
   slots: (0 slots) slave
   replicates f452a66121e1e9c02b0ed28cafe03aaddb327c36
S: 31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164:7002
   slots: (0 slots) slave
   replicates 763a88d5328ab0ce07a312e726d78bb2141b5813
S: c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164:7001
   slots: (0 slots) slave
   replicates 421123bf7fb3a4061e34cab830530d87b21148ee
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 15.99.72.165:7007 to make it join the cluster.
[OK] New node added correctly.

 

 

Run cluster nodes again to view the current nodes: 7007 has joined the Redis cluster, but it holds 0 slots.

 

15.15.181.147:7006> cluster nodes
8e134e67e4e83a613b90f67cc6e6b8d71c208886 15.99.72.165:7007@17007 master - 0 1577095695760 0 connected
c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164:7001@17001 slave 421123bf7fb3a4061e34cab830530d87b21148ee 0 1577095693561 7 connected
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006@17006 myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577095691000 6 connected
31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164:7002@17002 slave 763a88d5328ab0ce07a312e726d78bb2141b5813 0 1577095695000 5 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003@17003 master - 0 1577095694000 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004@17004 master - 0 1577095694763 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005@17005 master - 0 1577095691699 5 connected 10923-16383
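To confirm the slot counts without reading the ranges by eye, here is a small sketch that sums the slot ranges for each master in `cluster nodes` output. It assumes slot fields start at column 9 and that bracketed entries mark slots being migrated or imported, which matches the output shown here. The function name `slots_per_master` is hypothetical.

```shell
#!/bin/sh
# Print "address slot-count" for every master in `cluster nodes` output (stdin).
slots_per_master() {
    awk '$3 ~ /master/ {
        n = 0
        for (i = 9; i <= NF; i++) {
            if ($i ~ /^\[/) continue              # skip migrating/importing markers
            k = split($i, r, "-")
            n += (k == 2 ? r[2] - r[1] + 1 : 1)   # a range "a-b" or a single slot
        }
        sub(/@.*/, "", $2)                        # drop the @cluster-bus-port suffix
        print $2, n
    }'
}

# The four master lines from the output above, used as sample input.
slots_per_master <<'EOF'
8e134e67e4e83a613b90f67cc6e6b8d71c208886 15.99.72.165:7007@17007 master - 0 1577095695760 0 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003@17003 master - 0 1577095694000 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004@17004 master - 0 1577095694763 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005@17005 master - 0 1577095691699 5 connected 10923-16383
EOF
```

For the sample input this reports 0 slots for 7007 and 5462, 5461, and 5461 for the other three masters, matching the redis-trib.rb info output earlier.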

 

Next, migrate all slots of the master 15.15.181.147:7005@17005 to the master 15.99.72.165:7007@17007.
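As far as I know, the redis-trib.rb shipped with Redis 4.0.x also accepts `--from`, `--to`, `--slots`, and `--yes` for `reshard`, which avoids the interactive prompts. Check the option table in your own copy of the script before relying on this. The sketch below only builds and prints the command for this migration so it can be reviewed first; the helper name `reshard_cmd` is hypothetical, and the node IDs are the ones from the cluster nodes output above.

```shell
#!/bin/sh
# Path to redis-trib.rb as used earlier in this post.
TRIB=/root/tools/redis-4.0.11/src/redis-trib.rb

# Print (not run) a non-interactive reshard command.
# $1 = source node ID, $2 = target node ID, $3 = slot count, $4 = any cluster node
reshard_cmd() {
    echo "$TRIB reshard --from $1 --to $2 --slots $3 --yes $4"
}

# Move all 5461 slots of 15.15.181.147:7005 to 15.99.72.165:7007.
reshard_cmd 763a88d5328ab0ce07a312e726d78bb2141b5813 \
            8e134e67e4e83a613b90f67cc6e6b8d71c208886 \
            5461 15.99.72.165:7003
```

Printing the command instead of running it keeps the step reviewable, which seems prudent for a migration across the slow inter-city link.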

 

 

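The migration follows the example below. My own run printed too much output to paste here; apart from the IPs, ports, and similar details, it was essentially the same as what follows.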

Reassigning slots between master nodes

Assign all 5461 slots of 192.168.1.116:7000 to 192.168.1.117:7000


[root@localhost redis-cluster]# ./redis-4.0.6/src/redis-trib.rb reshard 192.168.1.117:7000
How many slots do you want to move (from 1 to 16384)? 5461      # number of slots to move
What is the receiving node ID? a6d7dacd679a96fd79b7de552428a63610d620e6   # node that receives those slots; here, the ID of 192.168.1.117:7000
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:0607089e5bb3192563bd8082ff230b0eb27fbfeb # node the slots are taken from; here, the ID of 192.168.1.116:7000. Entering all instead takes the requested number of slots from all existing master nodes.
Source node #2:done
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 0 from 192.168.1.116:7000 to 192.168.1.117:7000: 
[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)

Fixing the error


[root@localhost redis-cluster]# cp redis-4.0.6/src/redis-trib.rb redis-4.0.6/src/redis-trib.rb.bak

In redis-trib.rb, change the original lines
                source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:keys,*keys])
                    source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
to
                source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
                    source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])

[root@localhost redis-cluster]# cat redis-4.0.6/src/redis-trib.rb |grep  source.r.call      
                source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
                    source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
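The server's reply mentions CLIENT, which suggests that `source.r.client.call` is being sent as a Redis CLIENT command. This is likely because newer versions of the Ruby redis gem turned `client` into a command method, though that explanation is an assumption worth checking against your installed gem version. Here is a minimal sketch of scripting the edit with sed. Unlike the manual edit above, it only removes `.client` and does not add the "replace" argument. The function name `patch_trib` is hypothetical.

```shell
#!/bin/sh
# Back up redis-trib.rb, then rewrite source.r.client.call to source.r.call.
# $1 = path to redis-trib.rb
patch_trib() {
    cp "$1" "$1.bak" &&
    sed 's/source\.r\.client\.call/source.r.call/' "$1.bak" > "$1"
}

# Demo on a throwaway file holding one of the lines quoted above.
demo=$(mktemp)
printf '%s\n' 'source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:keys,*keys])' > "$demo"
patch_trib "$demo"
cat "$demo"
```

On the real script the call would be `patch_trib redis-4.0.6/src/redis-trib.rb`, after which the grep shown above should list both rewritten lines.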

# After the change, it still reports errors
[root@localhost redis-cluster]# ./redis-4.0.6/src/redis-trib.rb reshard 192.168.1.117:7000
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.117:7000 has slots in importing state (0).
[WARNING] Node 192.168.1.116:7000 has slots in migrating state (0).
[WARNING] The following slots are open: 0
>>>