I. Converting a primary node to a secondary node

1. Setting priority

  • Every member's priority defaults to 1.
// Set the priority on the primary node
cfg = rs.conf()

// When a member's priority is raised to the highest in the set, rs.reconfig() triggers an election and that member becomes primary
cfg.members[1].priority = 10
rs1:PRIMARY> rs.reconfig(cfg)
{
        "ok" : 1,
        "operationTime" : Timestamp(1643180119, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643180119, 1),
                "signature" : {
                        "hash" : BinData(0,"pro15qiUTk5MzcyGRJ+g9e5Sc30="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}
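
The snippet above addresses the member by array position (cfg.members[1]), which is not necessarily the member's _id. As a minimal sketch, the same change can be made by host name. The helper name withPriority and the sample config are assumptions made for illustration; only the hosts come from this article.

```javascript
// Hypothetical helper: return a copy of a replica set config in which the
// member with the given host has the given priority. The input is not modified.
function withPriority(cfg, host, priority) {
  var found = false;
  var members = cfg.members.map(function (m) {
    if (m.host !== host) return m;
    found = true;
    return Object.assign({}, m, { priority: priority });
  });
  if (!found) throw new Error("no member with host " + host);
  return Object.assign({}, cfg, { members: members });
}

// Stand-in for the document returned by rs.conf() (assumed shape and values).
var sampleCfg = {
  _id: "rs1",
  version: 8,
  members: [
    { _id: 0, host: "172.16.0.103:27017", priority: 1 },
    { _id: 1, host: "172.16.0.104:27017", priority: 1 },
    { _id: 2, host: "172.16.0.105:27017", priority: 1 }
  ]
};

var newCfg = withPriority(sampleCfg, "172.16.0.104:27017", 10);
```

In the shell, on the primary, this would be used as rs.reconfig(withPriority(rs.conf(), "172.16.0.104:27017", 10)).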

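2. Forcing the primary to step down to secondary

  • rs.stepDown(stepDownSecs, secondaryCatchUpPeriodSecs): for stepDownSecs seconds (60 by default), this instance will not stand for election as primary.
  • When all members are healthy and their priorities differ, the forced step-down succeeds, but the set then re-evaluates and elects the healthy member with the highest priority as primary again. In other words, the primary role automatically switches back.
  • When the electable members all have the same priority, running rs.stepDown() does not cause an automatic switch back.
  • rs.stepDown() cannot name the next primary; combine it with rs.freeze(seconds) to control which member is elected.
// Priorities: 10, 5, 3
// Run rs.stepDown() on the current primary, which has priority 10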
rs1:PRIMARY> rs.stepDown(10)
2022-01-27T09:45:21.007+0800 E QUERY    [js] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '172.16.0.104:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:185:1
rs.stepDown@src/mongo/shell/utils.js:1433:12
@(shell):1:1
2022-01-27T09:45:21.009+0800 I NETWORK  [js] trying reconnect to 172.16.0.104:27017 failed
2022-01-27T09:45:21.009+0800 I NETWORK  [js] reconnect 172.16.0.104:27017 ok
rs1:SECONDARY>

// Checking rs.status() again shows that the primary role has switched back
{
                        "_id" : 1,
                        "name" : "172.16.0.104:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 87875,
                        "optime" : {
                                "ts" : Timestamp(1643247953, 1),
                                "t" : NumberLong(14)
                        },
                        "optimeDate" : ISODate("2022-01-27T01:45:53Z"),
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "electionTime" : Timestamp(1643247931, 1),
                        "electionDate" : ISODate("2022-01-27T01:45:31Z"),
                        "configVersion" : 9,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
}
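
The switch-back seen above can be pictured with a deliberately simplified model: among healthy members with a non-zero priority, the one with the highest priority eventually ends up as primary. This sketch ignores votes, oplog freshness and election timing, so it is not the real election algorithm. The function name preferredPrimary, the healthy field, and the assignment of priorities 5 and 3 to particular hosts are assumptions.

```javascript
// Simplified model: return the host of the healthy, electable member with the
// highest priority, or null when priority alone does not single one out.
function preferredPrimary(members) {
  var candidates = members.filter(function (m) {
    return m.healthy && m.priority > 0;
  });
  if (candidates.length === 0) return null;
  var top = Math.max.apply(null, candidates.map(function (m) {
    return m.priority;
  }));
  var best = candidates.filter(function (m) {
    return m.priority === top;
  });
  return best.length === 1 ? best[0].host : null;
}

// Priorities 10, 5, 3 as in the example above (host assignment for 5 and 3 is assumed).
var differing = [
  { host: "172.16.0.103:27017", priority: 5, healthy: true },
  { host: "172.16.0.104:27017", priority: 10, healthy: true },
  { host: "172.16.0.105:27017", priority: 3, healthy: true }
];

// All priorities equal: priority gives no reason to switch back.
var equal = differing.map(function (m) {
  return { host: m.host, priority: 1, healthy: true };
});
```

With differing priorities the model returns 172.16.0.104:27017, which matches the member that became primary again after rs.stepDown(10). With equal priorities it returns null, matching the case where no automatic switch back occurs.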

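// Use rs.freeze() to make a member ineligible to become primary for the given number of seconds
// All members currently have priority 1; the goal is to make secondary node C the primary

// 1) Run on secondary node A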
rs1:SECONDARY> rs.freeze(300)
{
        "ok" : 1,
        "operationTime" : Timestamp(1643248827, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643248827, 1),
                "signature" : {
                        "hash" : BinData(0,"ISguX2OearE9kZfo+2CHdIF+75c="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// 2) Run on primary node B
rs1:PRIMARY> rs.stepDown()
2022-01-27T10:00:40.991+0800 E QUERY    [js] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '172.16.0.103:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:185:1
rs.stepDown@src/mongo/shell/utils.js:1433:12
@(shell):1:1
2022-01-27T10:00:40.994+0800 I NETWORK  [js] trying reconnect to 172.16.0.103:27017 failed
2022-01-27T10:00:40.994+0800 I NETWORK  [js] reconnect 172.16.0.103:27017 ok
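
With more than three members, several secondaries have to be frozen before the step-down. As a rough sketch, the list can be derived from rs.status() output: every secondary except the intended target. The helper name hostsToFreeze is hypothetical, and which of the two secondaries plays the role of node A or node C is an assumption, since the transcript does not show it.

```javascript
// Hypothetical helper: given rs.status()-like output and the host that should
// become primary, list the secondaries on which rs.freeze() must be run.
function hostsToFreeze(status, targetHost) {
  return status.members
    .filter(function (m) {
      return m.stateStr === "SECONDARY" && m.name !== targetHost;
    })
    .map(function (m) {
      return m.name;
    });
}

// Trimmed stand-in for rs.status(); node B (the primary) is 172.16.0.103.
var statusBefore = {
  members: [
    { name: "172.16.0.103:27017", stateStr: "PRIMARY" },
    { name: "172.16.0.104:27017", stateStr: "SECONDARY" },
    { name: "172.16.0.105:27017", stateStr: "SECONDARY" }
  ]
};

// Assume node C is 172.16.0.105; node A (172.16.0.104) is then the one to freeze.
var frozen = hostsToFreeze(statusBefore, "172.16.0.105:27017");
```

The freeze period should be longer than the time needed to run rs.stepDown() and complete the election; the 300 seconds used above leaves ample margin.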

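3. Forcing a secondary into maintenance mode

  • After a secondary enters maintenance mode its state changes to RECOVERING. Clients do not send read requests to a member in this state, and it cannot serve as a replication source.
  • Maintenance mode can be triggered automatically or manually.
// Manual trigger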
rs1:SECONDARY> db.adminCommand({"replSetMaintenance":true})
{
        "ok" : 1,
        "operationTime" : Timestamp(1643262934, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643262934, 1),
                "signature" : {
                        "hash" : BinData(0,"h27nqPg/e02vLq3I/JokDrvwMXM="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// The member is now in the RECOVERING state
rs1:RECOVERING> use test
switched to db test

rs1:RECOVERING> db.blog.find()
Error: error: {
        "operationTime" : Timestamp(1643262954, 1),
        "ok" : 0,
        "errmsg" : "not master and slaveOk=false",
        "code" : 13435,
        "codeName" : "NotMasterNoSlaveOk",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643262954, 1),
                "signature" : {
                        "hash" : BinData(0,"lyYKWv91hsfUWLCvdqdSDB0BNZw="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}
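
To see at a glance which members can still serve reads while one is in maintenance, the state codes in rs.status() can be filtered. This is only a sketch: the function name readableMembers and the sample status are assumptions, while the state codes (1 PRIMARY, 2 SECONDARY, 3 RECOVERING) are the ones MongoDB reports.

```javascript
// Replica set member state codes as reported in rs.status().members[n].state.
var MEMBER_STATES = {
  0: "STARTUP", 1: "PRIMARY", 2: "SECONDARY", 3: "RECOVERING",
  5: "STARTUP2", 6: "UNKNOWN", 7: "ARBITER", 8: "DOWN",
  9: "ROLLBACK", 10: "REMOVED"
};

// Hypothetical helper: names of members that can currently serve reads,
// i.e. those in PRIMARY (1) or SECONDARY (2) state.
function readableMembers(status) {
  return status.members
    .filter(function (m) {
      return m.state === 1 || m.state === 2;
    })
    .map(function (m) {
      return m.name;
    });
}

// Assumed snapshot: 172.16.0.104 has been put into maintenance mode.
var maintenanceStatus = {
  members: [
    { name: "172.16.0.103:27017", state: 1 },
    { name: "172.16.0.104:27017", state: 3 },
    { name: "172.16.0.105:27017", state: 2 }
  ]
};

var readable = readableMembers(maintenanceStatus);
```

To leave maintenance mode, run db.adminCommand({"replSetMaintenance": false}) on the same member; it then returns to SECONDARY.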

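II. MongoDB replica set synchronization

1. Forcing the current member to sync from a specified member

  • By default, data is synced from the primary to each secondary. The syncingTo field in rs.status() shows each member's sync source.
  • rs.syncFrom("host:port") sets the sync source for the current member.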
rs1:SECONDARY> rs.syncFrom("172.16.0.103:27017")
{
        "syncFromRequested" : "172.16.0.103:27017",
        "prevSyncTarget" : "172.16.0.105:27017",
        "ok" : 1,
        "operationTime" : Timestamp(1643263354, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643263354, 1),
                "signature" : {
                        "hash" : BinData(0,"W/DndMPHNfHr1nyZiTFLOhKk1gs="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// Check the sync source
rs1:SECONDARY> rs.status().members[1].syncingTo
172.16.0.103:27017
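
The check above relies on the array index members[1]. A small sketch that reports the sync source of every member by name avoids that dependency. The helper name syncSources and the sample status are assumptions; syncSourceHost and syncingTo are the fields visible in the rs.status() output in this article.

```javascript
// Hypothetical helper: map each member name to its sync source, or null when
// the member has none (for example the primary).
function syncSources(status) {
  var out = {};
  status.members.forEach(function (m) {
    out[m.name] = m.syncSourceHost || m.syncingTo || null;
  });
  return out;
}

// Assumed snapshot after rs.syncFrom("172.16.0.103:27017") was run on 172.16.0.104.
var syncStatus = {
  members: [
    { name: "172.16.0.103:27017", syncingTo: "", syncSourceHost: "" },
    { name: "172.16.0.104:27017", syncingTo: "172.16.0.103:27017", syncSourceHost: "172.16.0.103:27017" },
    { name: "172.16.0.105:27017", syncingTo: "172.16.0.103:27017", syncSourceHost: "172.16.0.103:27017" }
  ]
};

var sources = syncSources(syncStatus);
```

In the shell this would be called as syncSources(rs.status()).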

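2. Disabling chained replication

  • Chained replication is supported from version 2.0 and is enabled by default. A secondary chooses its sync source based on the ping time between members, preferring the one that is nearest in network distance.
  • It reduces resource consumption and load on the primary, but can increase replication lag between members.
// Enable or disable chained replication
cfg = rs.config()
cfg.settings.chainingAllowed = false  // set to true to allow chained replication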
rs.reconfig(cfg)
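
As a minimal sketch, the toggle can be wrapped in a function that also copes with a config document that has no settings sub-document yet. The helper name withChaining and the sample config are assumptions.

```javascript
// Hypothetical helper: return a copy of the config with
// settings.chainingAllowed set, creating settings when it is absent.
function withChaining(cfg, allowed) {
  var settings = Object.assign({}, cfg.settings || {}, {
    chainingAllowed: Boolean(allowed)
  });
  return Object.assign({}, cfg, { settings: settings });
}

// Assumed stand-in for rs.config(); heartbeatIntervalMillis matches the value
// shown in the rs.status() output in this article.
var chainCfg = {
  _id: "rs1",
  settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000 }
};

var noChainCfg = withChaining(chainCfg, false);
```

In the shell: rs.reconfig(withChaining(rs.config(), false)).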

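3. Data synchronization methods

  • initial sync: initialization, which can also be thought of as a full sync.
  • initial sync builds all collection indexes while copying each collection's documents; before version 3.4, only the _id index was built in this phase.
    From 3.4 on, initial sync also stores newly added oplog records locally while it copies the data.
    A member performs an initial sync when any of the following holds:
    1) the oplog is empty, for example on a newly added member
    2) the _initialSyncFlag field in the local.replset.minvalid collection is set to true (used to handle a failed initial sync)
    3) the in-memory flag initialSyncRequest is set to true (used by the resync command; resync applies only to master/slave deployments and cannot be used with replica sets)
  • replication: sync oplog, continuously replaying the primary's oplog to apply incremental changes.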
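  • producer thread: continuously pulls oplog entries from the sync source and puts them into a BlockQueue. The BlockQueue holds at most 256 MB of oplog data (reported as maxSizeBytes in metrics.repl.buffer); once this threshold is exceeded, the producer must wait for replBatcher to consume entries before it can pull more.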
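  • replBatcher thread: takes oplog entries one by one from the producer thread's queue and puts them into its own queue. This queue allows at most 5000 elements with a total size of no more than 512 MB; when it is full, the thread waits for oplogApplication to consume it.
  • oplogApplication: takes all elements currently in the replBatcher thread's queue and distributes them by docId (or by collection name if the storage engine does not support document-level locking) to different replWriter threads. The replWriter threads apply the oplog entries to the local data; once all entries have been applied, the oplogApplication thread writes them in order to the local.oplog.rs collection.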
// Query statistics for the producer buffer and the apply threads
db.serverStatus().metrics.repl
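
The batching and fan-out steps described above can be illustrated with a toy model. This is not MongoDB's implementation: the function names takeBatch and partitionByDocId, the op shape ({docId, bytes}) and the string hash are all assumptions chosen to show the idea that a batch is bounded by both count and size, and that all operations on one document go to the same writer in their original order.

```javascript
// Take ops from the front of the queue until either the op-count limit or the
// byte limit would be exceeded. A single oversized op is still taken alone.
function takeBatch(queue, maxOps, maxBytes) {
  var batch = [];
  var bytes = 0;
  while (queue.length > 0 && batch.length < maxOps) {
    var next = queue[0];
    if (batch.length > 0 && bytes + next.bytes > maxBytes) break;
    batch.push(queue.shift());
    bytes += next.bytes;
  }
  return batch;
}

// Spread a batch over nWriters buckets so that all ops for one document land
// in the same bucket and keep their relative order.
function partitionByDocId(batch, nWriters) {
  var buckets = [];
  for (var i = 0; i < nWriters; i++) buckets.push([]);
  batch.forEach(function (op) {
    var key = String(op.docId);
    var h = 0;
    for (var j = 0; j < key.length; j++) {
      h = (h * 31 + key.charCodeAt(j)) >>> 0; // simple 32-bit string hash
    }
    buckets[h % nWriters].push(op);
  });
  return buckets;
}

// Toy oplog queue: four 40-byte ops, two of them on document "a".
var opQueue = [
  { seq: 1, docId: "a", bytes: 40 },
  { seq: 2, docId: "b", bytes: 40 },
  { seq: 3, docId: "a", bytes: 40 },
  { seq: 4, docId: "c", bytes: 40 }
];

var firstBatch = takeBatch(opQueue, 5000, 100); // byte limit allows two ops
var secondBatch = takeBatch(opQueue, 5000, 100);
var writerBuckets = partitionByDocId(firstBatch.concat(secondBatch), 4);
```

Keeping every operation on a given document in one bucket is what allows the writers to run in parallel without reordering updates to the same document.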

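III. After machine failures leave only one secondary, forcing that secondary to become primary

  • In a three-member replica set, if any single machine fails the cluster fails over automatically and service is not affected.
  • In a three-member replica set, if any two members fail the set loses its majority, cannot elect a primary, and must be handled manually.
// One secondary and the primary are down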
rs1:SECONDARY> rs.status()
{
        "set" : "rs1",
        "date" : ISODate("2022-01-28T08:20:27.447Z"),
        "myState" : 2,
        "term" : NumberLong(16),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
                "lastCommittedOpTime" : {
                        "ts" : Timestamp(1643357433, 1),
                        "t" : NumberLong(16)
                },
                "readConcernMajorityOpTime" : {
                        "ts" : Timestamp(1643357433, 1),
                        "t" : NumberLong(16)
                },
                "appliedOpTime" : {
                        "ts" : Timestamp(1643357433, 1),
                        "t" : NumberLong(16)
                },
                "durableOpTime" : {
                        "ts" : Timestamp(1643357433, 1),
                        "t" : NumberLong(16)
                }
        },
        "lastStableCheckpointTimestamp" : Timestamp(1643357433, 1),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "172.16.0.103:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 197952,
                        "optime" : {
                                "ts" : Timestamp(1643357433, 1),
                                "t" : NumberLong(16)
                        },
                        "optimeDate" : ISODate("2022-01-28T08:10:33Z"),
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "could not find member to sync from",
                        "configVersion" : 10,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                },
                {
                        "_id" : 1,
                        "name" : "172.16.0.104:27017",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2022-01-28T08:20:27.409Z"),
                        "lastHeartbeatRecv" : ISODate("2022-01-28T08:09:12.995Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "Error connecting to 172.16.0.104:27017 :: caused by :: Connection refused",
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "configVersion" : -1
                },
                {
                        "_id" : 2,
                        "name" : "172.16.0.105:27017",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2022-01-28T08:20:27.409Z"),
                        "lastHeartbeatRecv" : ISODate("2022-01-28T08:10:35.054Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "Error connecting to 172.16.0.105:27017 :: caused by :: Connection refused",
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "configVersion" : -1
                }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1643357433, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643357433, 1),
                "signature" : {
                        "hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// The set as a whole is unavailable: reads work but writes fail
rs1:SECONDARY> db.blog.find()
{ "_id" : ObjectId("61efbd0c91b49e592d29811c"), "title" : "My First Post", "content" : "hello word!", "date" : ISODate("2022-01-25T09:03:48.287Z") }
rs1:SECONDARY> db.blog.drop()
2022-01-28T16:28:00.104+0800 E QUERY    [js] Error: drop failed: {
        "operationTime" : Timestamp(1643357433, 1),
        "ok" : 0,
        "errmsg" : "not master",
        "code" : 10107,
        "codeName" : "NotMaster",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643357433, 1),
                "signature" : {
                        "hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCollection.prototype.drop@src/mongo/shell/collection.js:707:1
@(shell):1:1
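
The surviving member stays SECONDARY because electing a primary needs a strict majority of voting members. A small sketch of that arithmetic follows, assuming every member has exactly one vote; the function names majorityNeeded and canElectPrimary and the trimmed status object are hypothetical.

```javascript
// Votes required for a strict majority of the given number of voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Assumes each member listed in rs.status() has one vote.
function canElectPrimary(status) {
  var total = status.members.length;
  var healthy = status.members.filter(function (m) {
    return m.health === 1;
  }).length;
  return healthy >= majorityNeeded(total);
}

// Trimmed version of the rs.status() output above: one healthy member, two down.
var degradedStatus = {
  members: [
    { name: "172.16.0.103:27017", health: 1 },
    { name: "172.16.0.104:27017", health: 0 },
    { name: "172.16.0.105:27017", health: 0 }
  ]
};

var electable = canElectPrimary(degradedStatus);
```

For three members the majority is two, so one healthy member is not enough. This is why a forced reconfiguration is needed.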

// Manually force a reconfiguration
rs1:SECONDARY> cfg = {_id:"rs1",members:[{_id:0,host:"172.16.0.103:27017"}]}
{
        "_id" : "rs1",
        "members" : [
                {
                        "_id" : 0,
                        "host" : "172.16.0.103:27017"
                }
        ]
}


// This fails because of the version: cfg must include the protocolVersion field
rs1:SECONDARY> rs.reconfig(cfg,{force:true})
{
        "operationTime" : Timestamp(1643357433, 1),
        "ok" : 0,
        "errmsg" : "Support for replication protocol version 0 was removed in MongoDB 4.0. Downgrade to MongoDB version 3.6 and upgrade your protocol version to 1 before upgrading your MongoDB version",
        "code" : 103,
        "codeName" : "NewReplicaSetConfigurationIncompatible",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643357433, 1),
                "signature" : {
                        "hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// Reconfigure with protocolVersion set
rs1:SECONDARY> cfg = {_id:"rs1","protocolVersion":1,members:[{_id:0,host:"172.16.0.103:27017"}]}
{
        "_id" : "rs1",
        "protocolVersion" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "172.16.0.103:27017"
                }
        ]
}

rs1:SECONDARY> rs.reconfig(cfg,{force:true})
{
        "ok" : 1,
        "operationTime" : Timestamp(1643357433, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643357433, 1),
                "signature" : {
                        "hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}

// Check the replica set again: the other two members have been removed
rs1:PRIMARY> rs.status()
{
        "set" : "rs1",
        "date" : ISODate("2022-01-28T08:48:34.026Z"),
        "myState" : 1,
        "term" : NumberLong(17),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
                "lastCommittedOpTime" : {
                        "ts" : Timestamp(1643359710, 1),
                        "t" : NumberLong(17)
                },
                "readConcernMajorityOpTime" : {
                        "ts" : Timestamp(1643359710, 1),
                        "t" : NumberLong(17)
                },
                "appliedOpTime" : {
                        "ts" : Timestamp(1643359710, 1),
                        "t" : NumberLong(17)
                },
                "durableOpTime" : {
                        "ts" : Timestamp(1643359710, 1),
                        "t" : NumberLong(17)
                }
        },
        "lastStableCheckpointTimestamp" : Timestamp(1643359670, 1),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "172.16.0.103:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 199639,
                        "optime" : {
                                "ts" : Timestamp(1643359710, 1),
                                "t" : NumberLong(17)
                        },
                        "optimeDate" : ISODate("2022-01-28T08:48:30Z"),
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "electionTime" : Timestamp(1643359588, 1),
                        "electionDate" : ISODate("2022-01-28T08:46:28Z"),
                        "configVersion" : 68772,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1643359710, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1643359710, 1),
                "signature" : {
                        "hash" : BinData(0,"qrRiYIjIE4ORu5biebPC9m0HzDA="),
                        "keyId" : NumberLong("7057052573554966530")
                }
        }
}
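
The protocolVersion error in the forced reconfiguration came from writing the config document by hand. One way to avoid it, sketched here under assumptions, is to start from the existing config and keep only the surviving members, so that protocolVersion and the other fields carry over. The helper name survivorConfig and the sample config values are hypothetical.

```javascript
// Hypothetical helper: return a copy of the config that keeps only the members
// whose host is in aliveHosts. All other fields are carried over unchanged.
function survivorConfig(cfg, aliveHosts) {
  var members = cfg.members.filter(function (m) {
    return aliveHosts.indexOf(m.host) !== -1;
  });
  if (members.length === 0) {
    throw new Error("no surviving members found in config");
  }
  return Object.assign({}, cfg, { members: members });
}

// Assumed stand-in for rs.conf() on the surviving member.
var fullCfg = {
  _id: "rs1",
  version: 10,
  protocolVersion: 1,
  members: [
    { _id: 0, host: "172.16.0.103:27017" },
    { _id: 1, host: "172.16.0.104:27017" },
    { _id: 2, host: "172.16.0.105:27017" }
  ]
};

var forcedCfg = survivorConfig(fullCfg, ["172.16.0.103:27017"]);
```

In the shell, on the surviving member: rs.reconfig(survivorConfig(rs.conf(), ["172.16.0.103:27017"]), {force: true}). A forced reconfiguration should be reserved for this kind of emergency, since it bypasses the majority that normally protects the set.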