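Environment:
Source: mongo-single.mongo-single.svc.cluster.local:27017
Target: mongo-1.mongo.mongo.svc.cluster.local:27017
The accounts used on the source and the target need the readWrite permission, and their passwords must not contain the @ character.
MongoShake client (any machine will do): start a pod anywhere in the cluster and install MongoShake in it.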
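
Because a password containing @ would break the connection string, it can help to build the URI with a small guard. The following is only a sketch: build_mongo_uri is a hypothetical helper name, not part of MongoShake, and it assumes the mongodb://user:password@host[,host...] form used in this article.

```shell
# Hypothetical helper (not part of MongoShake): build a MongoDB connection URI
# suitable for mongo_urls or tunnel.address.
# Usage: build_mongo_uri USER PASSWORD HOST:PORT [HOST:PORT ...]
build_mongo_uri() {
  if [ "$#" -lt 3 ]; then
    echo "usage: build_mongo_uri USER PASSWORD HOST:PORT [HOST:PORT ...]" >&2
    return 2
  fi
  uri_user="$1"
  uri_pass="$2"
  shift 2
  # Reject passwords that contain '@', as required by the environment notes.
  case "$uri_pass" in
    *@*)
      echo "password must not contain '@'" >&2
      return 1
      ;;
  esac
  # Join the remaining arguments (the hosts) with commas.
  uri_hosts="$1"
  shift
  for uri_host in "$@"; do
    uri_hosts="$uri_hosts,$uri_host"
  done
  printf 'mongodb://%s:%s@%s\n' "$uri_user" "$uri_pass" "$uri_hosts"
}
```

For example, build_mongo_uri root 123456 mongo-single.mongo-single.svc.cluster.local:27017 prints the source address in the form used for mongo_urls.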

I. Two deployment options:
1> Deploy inside the cluster:

kubectl create deployment mongo-shake --image=duanshuaixing02/tools:mongo-shake-v2.6.4

2> Deploy MongoShake manually:

wget https://github.com/alibaba/MongoShake/releases/download/release-v2.6.4-20210414/mongo-shake-v2.6.4_2.tar.gz

2. Edit the MongoShake configuration file

tar -xf mongo-shake-v2.6.4_2.tar.gz
cd mongo-shake-v2.6.4/

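Parameters to change:

Source (PRIMARY node): mongo_urls
Target (must be started with --replSet=rs0): tunnel.address
Sync mode: sync_mode
all means full plus incremental sync, full means full sync only, incr means incremental sync only.
Whitelist of databases to sync, separated by semicolons: filter.namespace.white
Checkpoint collection: checkpoint.storage.collection. This is the name of the collection that stores the checkpoint (mongoshake.ckpt_default by default). If several MongoShake instances pull from the same source, change this name to avoid conflicts. If a checkpoint already exists and you still want to run a full sync, manually delete the checkpoint in mongoshake.ckpt_default and drop the previously synced databases on the target.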

mongo_urls = mongodb://root:123456@mongo-single.mongo-single.svc.cluster.local:27017
tunnel.address = mongodb://root:shantanubansal@mongo-1.mongo.mongo.svc.cluster.local:27017
sync_mode = all
filter.namespace.white =duanshuaixing-src-db1;duanshuaixing-src-db2
checkpoint.storage.collection = duanshuaixing-sync-1
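
Editing these keys by hand is error-prone when the same change has to be repeated on several pods. The sketch below shows one way to set a key in a "key = value" file such as collector.conf. set_conf is a hypothetical helper, not something shipped with MongoShake, and it assumes each setting sits on its own line.

```shell
# Hypothetical helper (not shipped with MongoShake): set "KEY = VALUE" in a
# config file, replacing an existing line for KEY or appending a new one.
# Usage: set_conf KEY VALUE FILE
set_conf() {
  conf_file="$3"
  # Pass key and value through the environment so awk does not interpret
  # backslashes or other special characters in them.
  CONF_KEY="$1" CONF_VALUE="$2" awk '
    {
      # The key is everything before the first "=", with whitespace trimmed.
      name = $0
      sub(/=.*/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      if (index($0, "=") > 0 && name == ENVIRON["CONF_KEY"]) {
        print ENVIRON["CONF_KEY"] " = " ENVIRON["CONF_VALUE"]
        found = 1
      } else {
        print
      }
    }
    END {
      # Append the setting if the key was not present in the file.
      if (!found) print ENVIRON["CONF_KEY"] " = " ENVIRON["CONF_VALUE"]
    }
  ' "$conf_file" > "$conf_file.tmp" && mv "$conf_file.tmp" "$conf_file"
}
```

A call such as set_conf sync_mode all collector.conf would then rewrite only the sync_mode line and leave comments and other keys untouched.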

3. Run the sync

cd mongo-shake-v2.6.4/
./collector.linux -conf=collector.conf -verbose 1 
Or run it in the background with the bundled script:
bash start.sh ./collector.conf
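# A migration normally stays in incremental sync until it is finished and the access address has been switched over
To stop a background run, kill the related processes. In this example the image's keep-alive process is sleep; every other process can be killed.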

4. Open another terminal to watch the migration progress

cd mongo-shake-v2.6.4/
./mongoshake-stat --port=9100

Detailed configuration reference: https://help.aliyun.com/document_detail/122621.html

II. Single-node mongo started without --replSet=rs0
1. For a mongo pod deployed through a Deployment, change the container command startup arguments (example manifest below)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-single
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: "50Gi"
  volumeName: 
  storageClassName: nfs-storage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-single
spec:
  replicas: 1
  selector:
    matchLabels:
      name: mongo-single
  template:
    metadata:
      labels:
        name: mongo-single
    spec:
      containers:
      - name: mongo-single
        image: mongo:4
        imagePullPolicy: IfNotPresent
        command:
          - mongod
          - "--replSet"
          - rs0
          - "--bind_ip"
          - 0.0.0.0
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-pvc
          mountPath: /data/db
      volumes:
       - name: mongo-pvc
         persistentVolumeClaim:
           claimName: mongo-single
---
kind: Service
apiVersion: v1
metadata:
  name: mongo-single
spec:
  type: NodePort
  ports:
  - name: mongo
    port: 27017
    nodePort: 30075
    targetPort: 27017
    protocol: TCP
  selector:
    name: mongo-single

2. Create users; for more on permissions, see the article on MongoDB user creation and access control

Create the root user
use admin
db.createUser({user:"root",pwd:"123456",roles:[{role:"root",db:"admin" }]})

Create a regular user and a database
use testdb
db.createUser({user:"testdb_user",pwd:"123456",roles:[{role:"readWrite",db:"testdb" }]})
db.testdb.insert({"blogURL":""})

3. The container restarts after the change; log in to the container and initialize the replica set

use admin
db.auth('root','123456')
rs.initiate({ _id: "rs0", members: [ { _id: 0, host : "mongo-single.mongo-single.svc.cluster.local:27017" } ] } )
rs.config()
rs.status()

FAQ:
1. After the startup arguments are changed, the pod fails to start with the following error

"msg":"DBException in initAndListen, terminating","attr":{"error":"DBPathInUse: Unable to lock the lock file: /data/db/mongod.lock (Resource temporarily unavailable). Another mongod instance is already running on the /data/db directory"}}

Fix:
From a healthy pod, or directly on the PVC, delete /data/db/mongod.lock, then restart the pod that failed to start.
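
The cleanup step can be sketched as a small function. remove_stale_lock is a hypothetical name, and the sketch assumes the data directory is /data/db unless another path is passed. It is only safe when no mongod process is still using that directory, since the lock exists to protect a running instance.

```shell
# Hypothetical helper: remove a leftover mongod.lock from a data directory.
# Assumption: no mongod process is currently using this directory.
# Usage: remove_stale_lock [DATA_DIR]   (DATA_DIR defaults to /data/db)
remove_stale_lock() {
  lock_file="${1:-/data/db}/mongod.lock"
  if [ -f "$lock_file" ]; then
    rm -f "$lock_file" && echo "removed $lock_file"
  else
    echo "no lock file at $lock_file"
  fi
}
```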

III. Sync example
1. Environment

Node role               IP address            Cluster mode
mongodb-src-rook04      192.168.86.39:27017   rs
mongodb-dest-1-rook01   192.168.86.36:27017   rs
mongodb-dest-2-rook02   192.168.86.37:27017   rs
mongodb-dest-3-rook03   192.168.86.38:27017   rs

2. Prepare the source data: 3 databases and 4 collections in total

kubectl exec -it mongodb-src-rook04 -- bash
mongo 192.168.86.39:27017
use admin
db.auth('useradmin','adminPassw0rd')
use duanshuaixing-src-db1
for(i=1; i<=500000;i++){   db.user.insert( {name:'db1user'+i, age:i} ) }
use duanshuaixing-src-db2
for(i=1; i<=200000;i++){   db.user.insert( {name:'db2user'+i, age:i} ) }
for(i=1; i<=200000;i++){   db.user2.insert( {name:'db2user2'+i, age:i} ) }
use duanshuaixing-src-db3
for(i=1; i<=200000;i++){   db.user3.insert( {name:'db2user3'+i, age:i} ) }

3. Deploy mongo-shake

kubectl create deployment mongo-shake --image=duanshuaixing02/tools:mongo-shake-v2.6.4

4. Configure mongoshake to sync duanshuaixing-src-db1 and duanshuaixing-src-db2

kubectl exec -it mongo-shake-84977c55f8-tq8ks -- bash
cp collector.conf collector.conf.bak

Set mongo_urls to the address of the source primary node
mongo_urls = mongodb://useradmin:adminPassw0rd@192.168.86.39:27017

Set tunnel.address to the username, password, and connection address of the target database
tunnel.address = mongodb://useradmin:adminPassw0rd@192.168.86.36:27017,192.168.86.37:27017,192.168.86.38:27017

Set the sync mode to full plus incremental
sync_mode = all

Set the whitelist of databases to sync (database names are separated by semicolons!)
filter.namespace.white =duanshuaixing-src-db1;duanshuaixing-src-db2

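Name of the collection that stores the checkpoint. If several mongoshake instances pull from the same source, change this name to avoid conflicts. To redo a complete sync, drop the previously synced databases on the target and change the checkpoint name.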
checkpoint.storage.collection = duanshuaixing-sync-1
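
Before starting the collector it can be worth checking that the edited file still carries a valid sync mode. The sketch below reads a key from a "key = value" file and accepts only the three modes described in this article (all, full, incr). get_conf and check_sync_mode are hypothetical helper names, not MongoShake commands.

```shell
# Hypothetical helper: print the value of KEY from a "key = value" config file.
# Usage: get_conf KEY FILE
get_conf() {
  CONF_KEY="$1" awk '
    index($0, "=") > 0 {
      # Key: text before the first "=", whitespace trimmed.
      name = $0
      sub(/=.*/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      if (name == ENVIRON["CONF_KEY"]) {
        # Value: text after the first "=", whitespace trimmed.
        value = $0
        sub(/^[^=]*=/, "", value)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        print value
        exit
      }
    }
  ' "$2"
}

# Hypothetical check: succeed only if sync_mode is all, full, or incr.
# Usage: check_sync_mode FILE
check_sync_mode() {
  mode="$(get_conf sync_mode "$1")"
  case "$mode" in
    all|full|incr)
      echo "sync_mode ok: $mode"
      ;;
    *)
      echo "unexpected sync_mode: '$mode'" >&2
      return 1
      ;;
  esac
}
```

Running check_sync_mode ./collector.conf before start.sh gives an early failure if the value was mistyped.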

5. Start and stop the sync

sh start.sh ./collector.conf
ps -ef | grep '[c]ollector.conf' | awk '{print $2}' | xargs kill -9

6. Insert data into the source while the sync is running

rs0:PRIMARY> use duanshuaixing-src-db2
switched to db duanshuaixing-src-db2
rs0:PRIMARY> show tables;
user
user2
rs0:PRIMARY> db.user.count()
200000
rs0:PRIMARY> db.user2.count()
200000

At this point both collections hold 200,000 documents on the source and on the target, and the target is syncing normally
use duanshuaixing-src-db2
for(i=1; i<=200000;i++){   db.user.insert( {name:'db2userage'+i, age:i} ) }

A collection newly created on the source is also created automatically on the target, and its data syncs normally
for(i=1; i<=200000;i++){   db.user3.insert( {name:'db2user3'+i, age:i} ) }
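
One way to confirm that the target has caught up is to compare document counts on both sides. The sketch below assumes the legacy mongo shell is on PATH and accepts a connection URI as its first argument. count_docs and compare_counts are hypothetical helper names, and the URIs in the usage note are the ones from this example environment.

```shell
# Hypothetical helper: print the document count of DB.COLLECTION at URI.
# Assumption: the legacy "mongo" shell is installed and on PATH.
# Usage: count_docs URI DB COLLECTION
count_docs() {
  mongo "$1" --quiet --eval \
    "print(db.getSiblingDB('$2').getCollection('$3').count())"
}

# Hypothetical helper: compare one collection between source and target.
# Usage: compare_counts SRC_URI DST_URI DB COLLECTION
compare_counts() {
  src_count="$(count_docs "$1" "$3" "$4")"
  dst_count="$(count_docs "$2" "$3" "$4")"
  if [ "$src_count" = "$dst_count" ]; then
    echo "match: $3.$4 = $src_count"
  else
    echo "mismatch: $3.$4 source=$src_count target=$dst_count"
    return 1
  fi
}
```

For instance, compare_counts mongodb://useradmin:adminPassw0rd@192.168.86.39:27017 mongodb://useradmin:adminPassw0rd@192.168.86.36:27017 duanshuaixing-src-db2 user would report whether the user collection has the same size on both sides. While inserts are still running on the source, a temporary mismatch is expected.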

7. Creating a collection and inserting data on the target, or inserting into an existing collection on the target, does not modify the source

use duanshuaixing-src-db2
for(i=1; i<=200000;i++){   db.user4.insert( {name:'db2user4'+i, age:i} ) }

for(i=1; i<=200000;i++){   db.user3.insert( {name:'db2user3-dest'+i, age:i} ) }

IV. Monitoring and managing mongoshake
Reference: How to monitor and manage the running state of MongoShake
1. Show the configuration

curl -s  http://127.0.0.1:9101/conf | python -m json.tool

2. Show the full sync progress

curl -s  http://127.0.0.1:9101/progress | python -m json.tool

3. View and add a rate limit

# View the current rate limit configuration; the default 0 means no limit
curl -s 127.0.0.1:9101/sentinel | python -m json.tool
# Add a rate limit
curl -X POST --data '{"TPS":1000}' 127.0.0.1:9101/sentinel/options
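
When the limit is changed often, wrapping the curl call shown above avoids hand-written JSON. This is only a sketch: tps_payload and set_tps_limit are hypothetical names, the only key used is the TPS key that appears in this article, and the endpoint is assumed to be 127.0.0.1:9101 as above.

```shell
# Hypothetical helper: build the JSON body for the /sentinel/options request.
# Usage: tps_payload N   (N is a non-negative integer)
tps_payload() {
  case "$1" in
    ''|*[!0-9]*)
      echo "TPS must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"TPS":%s}\n' "$1"
}

# Hypothetical wrapper around the curl call from this article.
# Usage: set_tps_limit N
set_tps_limit() {
  # Stop before calling curl if the value is not a valid integer.
  body="$(tps_payload "$1")" || return 1
  curl -s -X POST --data "$body" 127.0.0.1:9101/sentinel/options
}
```

Calling set_tps_limit 1000 sends the same request as the command above.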

VI. FAQ
Frequently asked questions