I. Converting a primary node to a secondary node
1. Setting priorities
- Each node's priority defaults to 1
// Set the priority on the primary node
cfg = rs.conf()
// When a member's priority is raised to the highest value in the set, rs.reconfig() triggers an election and that member automatically becomes primary
cfg.members[1].priority = 10
rs1:PRIMARY> rs.reconfig(cfg)
{
"ok" : 1,
"operationTime" : Timestamp(1643180119, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643180119, 1),
"signature" : {
"hash" : BinData(0,"pro15qiUTk5MzcyGRJ+g9e5Sc30="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
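The reconfig above is a read-modify-write on the configuration document. A minimal standalone sketch of that manipulation (plain JavaScript, no server required; the config values are illustrative):

```javascript
// Sketch: raise one member's priority in a replica set config document.
// The shape mirrors rs.conf(); hosts and version numbers are examples only.
function raisePriority(cfg, memberIndex, priority) {
  // deep-copy so the original config object is left untouched
  const next = JSON.parse(JSON.stringify(cfg));
  next.members[memberIndex].priority = priority;
  next.version += 1; // rs.reconfig() also bumps the config version
  return next;
}

const cfg = {
  _id: "rs1",
  version: 8,
  members: [
    { _id: 0, host: "172.16.0.103:27017", priority: 1 },
    { _id: 1, host: "172.16.0.104:27017", priority: 1 },
  ],
};
const next = raisePriority(cfg, 1, 10);
```

In the shell the result would be passed to rs.reconfig(next), after which the high-priority member stands for election.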
2. Forcibly demoting the primary to a secondary
- rs.stepDown(stepDownSecs, secondaryCatchUpPeriodSecs): for stepDownSecs seconds, this instance will not stand for election as primary
- If all nodes are healthy and their priorities differ, then after the forced demotion succeeds the set re-evaluates and re-elects the healthy node with the highest priority as primary; in other words, it automatically fails back
- If the nodes eligible for election all have the same priority, running rs.stepDown() does not fail back automatically
- rs.stepDown() cannot designate a specific successor; combine it with rs.freeze(seconds) to control which node gets elected primary
-- Priorities: 10, 5, 3
-- Run rs.stepDown() on the current primary, which has priority 10
rs1:PRIMARY> rs.stepDown(10)
2022-01-27T09:45:21.007+0800 E QUERY [js] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '172.16.0.104:27017' :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:185:1
rs.stepDown@src/mongo/shell/utils.js:1433:12
@(shell):1:1
2022-01-27T09:45:21.009+0800 I NETWORK [js] trying reconnect to 172.16.0.104:27017 failed
2022-01-27T09:45:21.009+0800 I NETWORK [js] reconnect 172.16.0.104:27017 ok
rs1:SECONDARY>
-- Checking rs.status() again shows the primary has switched back
{
"_id" : 1,
"name" : "172.16.0.104:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 87875,
"optime" : {
"ts" : Timestamp(1643247953, 1),
"t" : NumberLong(14)
},
"optimeDate" : ISODate("2022-01-27T01:45:53Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1643247931, 1),
"electionDate" : ISODate("2022-01-27T01:45:31Z"),
"configVersion" : 9,
"self" : true,
"lastHeartbeatMessage" : ""
}
// Use rs.freeze() to prevent a node from being elected primary for the specified number of seconds
// All nodes currently have priority 1; the goal is to make secondary node C the primary
// 1) Run on secondary node A
rs1:SECONDARY> rs.freeze(300)
{
"ok" : 1,
"operationTime" : Timestamp(1643248827, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643248827, 1),
"signature" : {
"hash" : BinData(0,"ISguX2OearE9kZfo+2CHdIF+75c="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
// 2) Run on primary node B
rs1:PRIMARY> rs.stepDown()
2022-01-27T10:00:40.991+0800 E QUERY [js] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '172.16.0.103:27017' :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:185:1
rs.stepDown@src/mongo/shell/utils.js:1433:12
@(shell):1:1
2022-01-27T10:00:40.994+0800 I NETWORK [js] trying reconnect to 172.16.0.103:27017 failed
2022-01-27T10:00:40.994+0800 I NETWORK [js] reconnect 172.16.0.103:27017 ok
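The freeze-plus-stepDown walkthrough above follows a simple eligibility rule: frozen members and the stepped-down member cannot stand, and among eligible members the highest priority wins. A standalone sketch of that rule (plain JavaScript; field names like frozenUntil are illustrative, and real elections also weigh oplog recency and votes):

```javascript
// Pick the next primary from healthy members that are neither frozen
// (rs.freeze) nor still in their stepDown period, preferring higher priority.
function electPrimary(members, now) {
  const eligible = members.filter(
    (m) =>
      m.health === 1 &&
      (m.frozenUntil || 0) <= now &&
      (m.steppedDownUntil || 0) <= now
  );
  if (eligible.length === 0) return null;
  return eligible.reduce((a, b) => (b.priority > a.priority ? b : a));
}

const now = 1000;
const members = [
  { name: "A", health: 1, priority: 1, frozenUntil: now + 300 }, // rs.freeze(300)
  { name: "B", health: 1, priority: 1, steppedDownUntil: now + 60 }, // rs.stepDown()
  { name: "C", health: 1, priority: 1 },
];
// C is the only eligible member left, so C becomes primary
```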
3. Forcing a secondary into maintenance mode
- Once a secondary enters maintenance mode, its state changes to RECOVERING; a node in this state receives no client read requests and cannot serve as a sync source
- Maintenance mode can be triggered automatically or manually
// Manual trigger
rs1:SECONDARY> db.adminCommand({"replSetMaintenance":true})
{
"ok" : 1,
"operationTime" : Timestamp(1643262934, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643262934, 1),
"signature" : {
"hash" : BinData(0,"h27nqPg/e02vLq3I/JokDrvwMXM="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
// Now in the RECOVERING state
rs1:RECOVERING> use test
switched to db test
rs1:RECOVERING> db.blog.find()
Error: error: {
"operationTime" : Timestamp(1643262954, 1),
"ok" : 0,
"errmsg" : "not master and slaveOk=false",
"code" : 13435,
"codeName" : "NotMasterNoSlaveOk",
"$clusterTime" : {
"clusterTime" : Timestamp(1643262954, 1),
"signature" : {
"hash" : BinData(0,"lyYKWv91hsfUWLCvdqdSDB0BNZw="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
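The RECOVERING behavior described above can be sketched as a filter over member states (plain JavaScript; the member objects mirror rs.status() output but are illustrative):

```javascript
// Members in PRIMARY or SECONDARY state can serve reads and act as sync
// sources; a RECOVERING member (e.g. one in maintenance mode) is skipped
// for both roles.
const SERVABLE_STATES = ["PRIMARY", "SECONDARY"];

function servableMembers(members) {
  return members.filter((m) => SERVABLE_STATES.includes(m.stateStr));
}

const members = [
  { name: "172.16.0.103:27017", stateStr: "PRIMARY" },
  { name: "172.16.0.104:27017", stateStr: "SECONDARY" },
  { name: "172.16.0.105:27017", stateStr: "RECOVERING" }, // replSetMaintenance: true
];
// only .103 and .104 remain servable
```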
II. Replica set synchronization in MongoDB
1. Forcing the current member to sync from a specified member
- By default secondaries sync from the primary; the syncingTo field in rs.status() shows each node's sync source
- rs.syncFrom("host:port") directs the current node to sync from the specified source
rs1:SECONDARY> rs.syncFrom("172.16.0.103:27017")
{
"syncFromRequested" : "172.16.0.103:27017",
"prevSyncTarget" : "172.16.0.105:27017",
"ok" : 1,
"operationTime" : Timestamp(1643263354, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643263354, 1),
"signature" : {
"hash" : BinData(0,"W/DndMPHNfHr1nyZiTFLOhKk1gs="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
// Check the sync source
rs1:SECONDARY> rs.status().members[1].syncingTo
172.16.0.103:27017
2. Disabling chained replication
- Chained replication has been supported since version 2.0 and is enabled by default; a secondary chooses the nearest secondary (by ping time / network distance) as its sync source
- This reduces resource consumption and load on the primary, but increases replication lag between nodes
// Enable or disable chained replication
cfg = rs.config()
cfg.settings.chainingAllowed = false   // or true
rs.reconfig(cfg)
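The sync-source rule above can be sketched as follows (plain JavaScript; picking the minimum pingMs is a simplification of the server's actual sync-source selection, which also checks oplog freshness):

```javascript
// With chaining allowed, a secondary picks the closest healthy member by
// ping time; with chaining disabled it must sync from the primary.
function pickSyncSource(selfName, members, chainingAllowed) {
  const candidates = members.filter((m) => m.name !== selfName && m.health === 1);
  if (candidates.length === 0) return null;
  if (!chainingAllowed) {
    return candidates.find((m) => m.stateStr === "PRIMARY") || null;
  }
  return candidates.reduce((a, b) => (b.pingMs < a.pingMs ? b : a));
}

const members = [
  { name: "172.16.0.103:27017", stateStr: "PRIMARY", health: 1, pingMs: 40 },
  { name: "172.16.0.104:27017", stateStr: "SECONDARY", health: 1, pingMs: 2 },
  { name: "172.16.0.105:27017", stateStr: "SECONDARY", health: 1, pingMs: 8 },
];
// as .105: chaining on -> nearest secondary (.104); chaining off -> primary (.103)
```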
3. How data synchronization works
- initial sync: initialization, which can be understood as a full data copy.
- initial sync builds all collection indexes while copying each collection's documents; before version 3.4 only the _id index was built during this phase
- since 3.4, oplog entries generated while initial sync is copying data are also stored locally
- initial sync is triggered when:
1) the oplog is empty, e.g. a newly joined node
2) the _initialSyncFlag field of the local.replset.minvalid collection is set to true (used to recover from a failed initial sync)
3) the in-memory flag initialSyncRequested is set to true (used by the resync command; resync applies only to master/slave deployments and cannot be used on replica sets)
- replication: sync the oplog, continuously replaying the primary's oplog to apply incremental changes
- producer thread: continuously pulls oplog entries from the sync source and appends them to a blocking queue (BlockQueue) that holds at most 240MB of oplog data; once that threshold is reached, the producer must wait for replBatcher to consume entries before pulling more
- replBatcher thread: takes oplog entries one by one from the producer's queue and moves them into its own queue, which allows at most 5000 elements with a total size of no more than 512MB; when the queue is full, it waits for oplogApplication to consume entries
- oplogApplication: takes all elements currently in the replBatcher queue and distributes them by docId (or by collection name if the storage engine does not support document-level locking) across multiple replWriter threads; each replWriter applies its share of the oplog, and once all entries are applied, the oplogApplication thread writes them in order to the local.oplog.rs collection
// Query statistics for the producer buffer and the apply threads
db.serverStatus().metrics.repl
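The backpressure thresholds above can be modeled standalone (plain JavaScript; entry sizes are made up, and only the documented limits of a 240MB producer buffer and a 5000-entry / 512MB batch queue are taken from the text):

```javascript
// Toy model of the three-stage pipeline's backpressure rules.
const PRODUCER_CAP_BYTES = 240 * 1024 * 1024; // producer BlockQueue cap
const BATCH_MAX_ENTRIES = 5000;               // replBatcher queue entry cap
const BATCH_MAX_BYTES = 512 * 1024 * 1024;    // replBatcher queue size cap

// The producer must wait for replBatcher once the 240MB cap would be exceeded.
function producerCanPull(bufferedBytes, entryBytes) {
  return bufferedBytes + entryBytes <= PRODUCER_CAP_BYTES;
}

// replBatcher must wait for oplogApplication once either limit is hit.
function batcherCanTake(queue, entryBytes) {
  const total = queue.reduce((sum, e) => sum + e.bytes, 0);
  return queue.length < BATCH_MAX_ENTRIES && total + entryBytes <= BATCH_MAX_BYTES;
}
```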
III. Forcing the sole surviving secondary to become primary after machine failures
- In a three-node replica set, the failure of any single node triggers automatic failover and service is unaffected
- If any two of the three nodes fail, the set becomes unavailable and must be recovered manually
// One secondary is down and the primary is unreachable
rs1:SECONDARY> rs.status()
{
"set" : "rs1",
"date" : ISODate("2022-01-28T08:20:27.447Z"),
"myState" : 2,
"term" : NumberLong(16),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1643357433, 1),
"t" : NumberLong(16)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1643357433, 1),
"t" : NumberLong(16)
},
"appliedOpTime" : {
"ts" : Timestamp(1643357433, 1),
"t" : NumberLong(16)
},
"durableOpTime" : {
"ts" : Timestamp(1643357433, 1),
"t" : NumberLong(16)
}
},
"lastStableCheckpointTimestamp" : Timestamp(1643357433, 1),
"members" : [
{
"_id" : 0,
"name" : "172.16.0.103:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 197952,
"optime" : {
"ts" : Timestamp(1643357433, 1),
"t" : NumberLong(16)
},
"optimeDate" : ISODate("2022-01-28T08:10:33Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "could not find member to sync from",
"configVersion" : 10,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "172.16.0.104:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2022-01-28T08:20:27.409Z"),
"lastHeartbeatRecv" : ISODate("2022-01-28T08:09:12.995Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "Error connecting to 172.16.0.104:27017 :: caused by :: Connection refused",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : -1
},
{
"_id" : 2,
"name" : "172.16.0.105:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2022-01-28T08:20:27.409Z"),
"lastHeartbeatRecv" : ISODate("2022-01-28T08:10:35.054Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "Error connecting to 172.16.0.105:27017 :: caused by :: Connection refused",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : -1
}
],
"ok" : 1,
"operationTime" : Timestamp(1643357433, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643357433, 1),
"signature" : {
"hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
// The whole set is unavailable for writes: reads still work, writes fail
rs1:SECONDARY> db.blog.find()
{ "_id" : ObjectId("61efbd0c91b49e592d29811c"), "title" : "My First Post", "content" : "hello word!", "date" : ISODate("2022-01-25T09:03:48.287Z") }
rs1:SECONDARY> db.blog.drop()
2022-01-28T16:28:00.104+0800 E QUERY [js] Error: drop failed: {
"operationTime" : Timestamp(1643357433, 1),
"ok" : 0,
"errmsg" : "not master",
"code" : 10107,
"codeName" : "NotMaster",
"$clusterTime" : {
"clusterTime" : Timestamp(1643357433, 1),
"signature" : {
"hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
"keyId" : NumberLong("7057052573554966530")
}
}
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCollection.prototype.drop@src/mongo/shell/collection.js:707:1
@(shell):1:1
// Manual forced recovery
rs1:SECONDARY> cfg = {_id:"rs1",members:[{_id:0,host:"172.16.0.103:27017"}]}
{
"_id" : "rs1",
"members" : [
{
"_id" : 0,
"host" : "172.16.0.103:27017"
}
]
}
// Due to the server version this errors out: cfg must include the protocolVersion field
rs1:SECONDARY> rs.reconfig(cfg,{force:true})
{
"operationTime" : Timestamp(1643357433, 1),
"ok" : 0,
"errmsg" : "Support for replication protocol version 0 was removed in MongoDB 4.0. Downgrade to MongoDB version 3.6 and upgrade your protocol version to 1 before upgrading your MongoDB version",
"code" : 103,
"codeName" : "NewReplicaSetConfigurationIncompatible",
"$clusterTime" : {
"clusterTime" : Timestamp(1643357433, 1),
"signature" : {
"hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
// Reconfigure with protocolVersion set
rs1:SECONDARY> cfg = {_id:"rs1","protocolVersion":1,members:[{_id:0,host:"172.16.0.103:27017"}]}
{
"_id" : "rs1",
"protocolVersion" : 1,
"members" : [
{
"_id" : 0,
"host" : "172.16.0.103:27017"
}
]
}
rs1:SECONDARY> rs.reconfig(cfg,{force:true})
{
"ok" : 1,
"operationTime" : Timestamp(1643357433, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643357433, 1),
"signature" : {
"hash" : BinData(0,"B1BMoumB41pS7YqPnSSjxkeyVus="),
"keyId" : NumberLong("7057052573554966530")
}
}
}
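The forced reconfig above boils down to building a minimal one-member config document; a sketch as a pure function (the helper name is made up):

```javascript
// Build a single-member replica set config. protocolVersion: 1 is required
// on MongoDB 4.0+, as the earlier error message showed.
function singleNodeConfig(setName, host) {
  return { _id: setName, protocolVersion: 1, members: [{ _id: 0, host: host }] };
}

// In the shell this would be applied as:
//   rs.reconfig(singleNodeConfig("rs1", "172.16.0.103:27017"), { force: true })
```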
// Checking the replica set again shows the other two members have been removed
rs1:PRIMARY> rs.status()
{
"set" : "rs1",
"date" : ISODate("2022-01-28T08:48:34.026Z"),
"myState" : 1,
"term" : NumberLong(17),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1643359710, 1),
"t" : NumberLong(17)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1643359710, 1),
"t" : NumberLong(17)
},
"appliedOpTime" : {
"ts" : Timestamp(1643359710, 1),
"t" : NumberLong(17)
},
"durableOpTime" : {
"ts" : Timestamp(1643359710, 1),
"t" : NumberLong(17)
}
},
"lastStableCheckpointTimestamp" : Timestamp(1643359670, 1),
"members" : [
{
"_id" : 0,
"name" : "172.16.0.103:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 199639,
"optime" : {
"ts" : Timestamp(1643359710, 1),
"t" : NumberLong(17)
},
"optimeDate" : ISODate("2022-01-28T08:48:30Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1643359588, 1),
"electionDate" : ISODate("2022-01-28T08:46:28Z"),
"configVersion" : 68772,
"self" : true,
"lastHeartbeatMessage" : ""
}
],
"ok" : 1,
"operationTime" : Timestamp(1643359710, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1643359710, 1),
"signature" : {
"hash" : BinData(0,"qrRiYIjIE4ORu5biebPC9m0HzDA="),
"keyId" : NumberLong("7057052573554966530")
}
}
}