Introduction to replica sets:
• In MongoDB, a replica set is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and they are the basis for all production deployments.
• Replication provides redundancy and increases data availability. With multiple copies of the data on different database servers, replication provides a level of fault tolerance against the loss of a single database server.
• A replica set can also support higher read throughput, because clients can send read operations to different servers, and members can be placed in different data centers for disaster recovery, reporting, or backups.
• A replica set can have up to 50 members, of which at most 7 can vote in elections. Replica sets also provide multi-datacenter disaster tolerance, automatic failover and recovery, and rolling upgrades.
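As an illustration of the read-scaling point above, a client can list the data-bearing members and the set name in its connection string and opt in to secondary reads. The addresses below are the ones used later in this article; `replicaSet` and `readPreference` are standard MongoDB connection URI options:

```
mongodb://10.1.1.159:27020,10.1.1.77:27020/?replicaSet=rs02&readPreference=secondaryPreferred
```

With `secondaryPreferred`, reads are served by a secondary when one is available and fall back to the primary otherwise; writes always go to the primary.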
Common replica set architectures
The most common production architecture is a replica set, which can be thought of as one primary with multiple secondaries.
Figure below: one primary, two secondaries
Figure below: one primary, one secondary, one arbiter
Server information:
Three machines with identical specs: 2 CPU cores, 16 GB RAM, 100 GB storage
"host" : "10.1.1.159:27020" "host" : "10.1.1.77:27020" "host" : "10.1.1.178:27020
1. Set up one of the machines first:
[root@10-1-1-159 ~]# wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-rhel70-4.2.1.tgz
[root@10-1-1-159 ~]# tar -zxvf mongodb-linux-x86_64-rhel70-4.2.1.tgz -C /data/
[root@10-1-1-159 ~]# mkdir -p /data/mongodb/{data,logs,pid,conf}
Configuration file:
[root@10-1-1-159 ~]# cat /data/mongodb/conf/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/logs/mongod.log
storage:
  dbPath: /data/mongodb/data
  journal:
    enabled: true
  directoryPerDB: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8  # With one instance per machine this can be commented out to use the default; with several instances on one machine, set it explicitly so they do not compete for memory
      directoryForIndexes: true
processManagement:
  fork: true
  pidFilePath: /data/mongodb/pid/mongod.pid
net:
  port: 27020
  bindIp: 10.1.1.159,localhost  # change to this machine's own IP
  maxIncomingConnections: 5000
#security:
#  keyFile: /data/mongodb/conf/keyfile
#  authorization: enabled
replication:
#  oplogSizeMB: 1024
  replSetName: rs02
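For reference, `cacheSizeGB: 8` is close to what WiredTiger would pick by default on these 16 GB machines: MongoDB's documented default is the larger of 50% of (RAM − 1 GB) and 256 MB. A small sketch of that arithmetic (the 16 GB figure comes from the server specs above):

```shell
#!/usr/bin/env bash
# Reproduce WiredTiger's default cache size: max(50% of (RAM - 1 GB), 256 MB).
ram_mb=$((16 * 1024))                           # 16 GB machine, in MB
half_mb=$(( (ram_mb - 1024) / 2 ))              # 50% of (RAM - 1 GB)
cache_mb=$(( half_mb > 256 ? half_mb : 256 ))   # floor at 256 MB
echo "default WiredTiger cache: ${cache_mb} MB" # 7680 MB, i.e. 7.5 GB
```

So the explicit `8` only slightly exceeds the 7.5 GB default; the setting mainly matters when several mongod instances share one machine.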
2. Copy the configuration to the other machines:
[root@10-1-1-159 ~]# scp -r /data/* root@10.1.1.77:/data/
[root@10-1-1-159 ~]# scp -r /data/* root@10.1.1.178:/data/
Directory structure:
[root@10-1-1-178 data]# tree mongodb
mongodb
├── conf
│   └── mongodb.conf
├── data
├── logs
└── pid
3. Run the following on each of the three machines:
groupadd mongod
useradd -g mongod mongod
yum install -y libcurl openssl glibc
cd /data
ln -s mongodb-linux-x86_64-rhel70-4.2.1 mongodb-4.2.1
chown -R mongod.mongod /data
sudo -u mongod /data/mongodb-4.2.1/bin/mongod -f /data/mongodb/conf/mongodb.conf
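Instead of launching mongod by hand with `sudo -u mongod`, process supervision could be handed to systemd. The unit below is a sketch, not part of the original deployment; the paths, user, and PID file mirror the configuration above, and `Type=forking` matches `fork: true`:

```
# /etc/systemd/system/mongod.service  (hypothetical unit, adjust as needed)
[Unit]
Description=MongoDB 4.2 replica set member
After=network.target

[Service]
Type=forking
User=mongod
Group=mongod
PIDFile=/data/mongodb/pid/mongod.pid
ExecStart=/data/mongodb-4.2.1/bin/mongod -f /data/mongodb/conf/mongodb.conf
LimitNOFILE=64000

[Install]
WantedBy=multi-user.target
```

After installing the unit: `systemctl daemon-reload && systemctl enable --now mongod`.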
Replica set configuration: # the set name rs02 must match replSetName in the configuration file
config = { _id:"rs02", members:[
    {_id:0,host:"10.1.1.159:27020",priority:90},
    {_id:1,host:"10.1.1.77:27020",priority:90},
    {_id:2,host:"10.1.1.178:27020",arbiterOnly:true}
] }
# Initialize
rs.initiate(config);
4. Connect from one of the machines and run:
[root@10-1-1-159 ~]# /data/mongodb-4.2.1/bin/mongo 10.1.1.159:27020
> use admin
switched to db admin
> config = { _id:"rs02", members:[
... {_id:0,host:"10.1.1.159:27020",priority:90},
... {_id:1,host:"10.1.1.77:27020",priority:90},
... {_id:2,host:"10.1.1.178:27020",arbiterOnly:true}
... ]
... }
{
"_id" : "rs02",
"members" : [
{
"_id" : 0,
"host" : "10.1.1.159:27020",
"priority" : 90
},
{
"_id" : 1,
"host" : "10.1.1.77:27020",
"priority" : 90
},
{
"_id" : 2,
"host" : "10.1.1.178:27020",
"arbiterOnly" : true
}
]
}
>
> rs.initiate(config);    # initialize the replica set
{
"ok" : 1,
"operationTime" : Timestamp(1583907929, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1583907929, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
5. Check member status:
rs02:PRIMARY> rs.status()
{
"set" : "rs02",
"date" : ISODate("2020-03-13T07:11:09.427Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
}
},
"members" : [
{
"_id" : 0,
"name" : "10.1.1.159:27020",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY", #主节点
"uptime" : 185477,
"optime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-13T07:11:05Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1583907939, 1),
"electionDate" : ISODate("2020-03-11T06:25:39Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "10.1.1.77:27020",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", #从节点
"uptime" : 175540,
"optime" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1584083465, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-13T07:11:05Z"),
"optimeDurableDate" : ISODate("2020-03-13T07:11:05Z"),
"lastHeartbeat" : ISODate("2020-03-13T07:11:08.712Z"),
"lastHeartbeatRecv" : ISODate("2020-03-13T07:11:08.711Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "10.1.1.159:27020",
"syncSourceHost" : "10.1.1.159:27020",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "10.1.1.178:27020",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER", #仲裁节点
"uptime" : 175540,
"lastHeartbeat" : ISODate("2020-03-13T07:11:08.712Z"),
"lastHeartbeatRecv" : ISODate("2020-03-13T07:11:08.711Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1584083465, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1584083465, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
rs02:PRIMARY>
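Besides `rs.status()`, replication lag can be checked with a shell helper; in this 4.2-era shell it is named `rs.printSlaveReplicationInfo()` (renamed `rs.printSecondaryReplicationInfo()` in later versions). A session sketch, to be run against the live primary:

```
// Run on the primary: prints, for each secondary, its sync source and
// how many seconds its last applied oplog entry is behind the primary.
rs02:PRIMARY> rs.printSlaveReplicationInfo()
```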
6. Current replica set state:
10.1.1.178:27020  ARBITER    (arbiter)
10.1.1.77:27020   SECONDARY  (secondary)
10.1.1.159:27020  PRIMARY    (primary)
We insert some test data and then stop the primary.
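The failover test described above can be sketched as a mongo shell session (the database name `testdb` and the document are made up for illustration; this is a sketch, not captured output):

```
// On the current primary, 10.1.1.159:27020: write a test document.
rs02:PRIMARY> use testdb
rs02:PRIMARY> db.t1.insert({msg: "failover test"})

// Shut the primary down cleanly (must be run against the admin database):
rs02:PRIMARY> use admin
rs02:PRIMARY> db.shutdownServer()

// Within a few seconds an election promotes 10.1.1.77; its prompt
// changes to rs02:PRIMARY> and db.t1.find() there still returns the document.
```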
In the arbiter's log we can see that after node 10.1.1.159 went down, a new election was held: Member 10.1.1.77:27010 is now in state PRIMARY
2020-03-18T14:34:53.636+0800 I NETWORK [conn9] end connection 10.1.1.159:49160 (1 connection now open)
2020-03-18T14:34:54.465+0800 I CONNPOOL [Replication] dropping unhealthy pooled connection to 10.1.1.159:27010
2020-03-18T14:34:54.465+0800 I CONNPOOL [Replication] after drop, pool was empty, going to spawn some connections
2020-03-18T14:34:54.465+0800 I ASIO [Replication] Connecting to 10.1.1.159:27010
......
2020-03-18T14:35:02.473+0800 I ASIO [Replication] Failed to connect to 10.1.1.159:27010 - HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:02.473+0800 I CONNPOOL [Replication] Dropping all pooled connections to 10.1.1.159:27010 due to HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:02.473+0800 I REPL_HB [replexec-8] Error in heartbeat (requestId: 662) to 10.1.1.159:27010, response status: HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:04.463+0800 I REPL [replexec-5] Member 10.1.1.77:27010 is now in state PRIMARY
2020-03-18T14:35:04.473+0800 I ASIO [Replication] Connecting to 10.1.1.159:27010
2020-03-18T14:35:04.473+0800 I ASIO [Replication] Failed to connect to 10.1.1.159:27010 - HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:04.473+0800 I CONNPOOL [Replication] Dropping all pooled connections to 10.1.1.159:27010 due to HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
The architecture then becomes the diagram below:
The replica set is now up, and we have verified that with at least three nodes, the failure of a single node does not affect normal reads and writes. In production we still need to enable authentication; in the next chapter we will add users and turn on authentication and authorization.
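As preparation for that next chapter, the commented-out `security` block in the configuration file above references `/data/mongodb/conf/keyfile`. Generating such a keyfile typically looks like this (the local `./mongodb-keyfile` path is used here for illustration; the file must then be copied to `/data/mongodb/conf/keyfile` on all three members and be identical everywhere):

```shell
#!/usr/bin/env bash
# Generate a MongoDB internal-auth keyfile: 756 random bytes, base64-encoded.
openssl rand -base64 756 > ./mongodb-keyfile
# mongod refuses to start if the keyfile is readable by group or others,
# so permissions must be owner-read-only.
chmod 400 ./mongodb-keyfile
```

On each member the file should then be owned by the mongod user (`chown mongod.mongod`), after which the `security` block can be uncommented and the members restarted one at a time.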