Introduction to Replica Sets:

•A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.
•Replication provides redundancy and increases data availability. With multiple copies of the data on different database servers, replication offers a level of fault tolerance against the loss of a single database server.
•Replica sets can also scale read capacity, since clients can send read operations to different servers, and members can be placed in different data centers for disaster recovery, reporting, or backups.
•A replica set can hold up to 50 members, of which at most 7 can vote in elections. Replica sets support multi-data-center disaster tolerance, automatic recovery, and rolling upgrades.
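The "at most 7 voting members" limit matters because a primary election requires a strict majority of votes. A quick sketch of the arithmetic (this is just the majority rule, not MongoDB code):

```shell
# A primary needs a strict majority of voting members: floor(voters/2) + 1.
# With 7 voters, up to 3 members can fail and the set can still elect a primary.
for voters in 3 5 7; do
  majority=$(( voters / 2 + 1 ))
  echo "voters=$voters majority=$majority tolerated_failures=$(( voters - majority ))"
done
```

This is also why production replica sets use an odd number of voters: an even count raises cost without raising the number of tolerated failures.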

Common replica set architectures

The common architecture in production is a replica set, which can be understood as one primary with multiple secondaries.

Figure: one primary, two secondaries

Figure: one primary, one secondary, one arbiter

Server information:

Three machines with identical specs: 2 CPU cores, 16 GB RAM, 100 GB storage.

"host" : "10.1.1.159:27020"
"host" : "10.1.1.77:27020"
"host" : "10.1.1.178:27020"

1. Configure on one of the machines:

[root@10-1-1-159 ~]# wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-rhel70-4.2.1.tgz
[root@10-1-1-159 ~]# tar -zxvf mongodb-linux-x86_64-rhel70-4.2.1.tgz -C /data/
[root@10-1-1-159 ~]# mkdir /data/mongodb/{data,logs,pid,conf} -p

Configuration file:

[root@10-1-1-159 ~]# cat /data/mongodb/conf/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/logs/mongod.log

storage:
  dbPath: /data/mongodb/data
  journal:
    enabled: true
  directoryPerDB: true
  wiredTiger:
     engineConfig:
        cacheSizeGB: 8                    # With one instance per machine, comment this out to use the default; with multiple instances per machine, set an explicit size so they do not compete for memory
        directoryForIndexes: true

processManagement:
  fork: true
  pidFilePath: /data/mongodb/pid/mongod.pid

net:
  port: 27020
  bindIp: 10.1.1.159,localhost      # change to this machine's IP address
  maxIncomingConnections: 5000

#security:
  #keyFile: /data/mongodb/conf/keyfile
  #authorization: enabled
replication:
#   oplogSizeMB: 1024
   replSetName: rs02
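For reference, when cacheSizeGB is left unset, WiredTiger defaults its internal cache to the larger of 50% of (RAM - 1 GB) or 256 MB, so on these 16 GB machines the explicit 8 GB above is close to the default anyway. A quick sketch of that formula:

```shell
# Default WiredTiger cache: max(50% of (RAM - 1 GB), 256 MB).
ram_gb=16                               # matches the 16 GB machines above
half_mb=$(( (ram_gb - 1) * 1024 / 2 ))  # 50% of (RAM - 1 GB), in MB
default_mb=$(( half_mb > 256 ? half_mb : 256 ))
echo "default cache for ${ram_gb} GB RAM: ${default_mb} MB"
```

The explicit setting mainly matters when several mongod instances share one host, as the config comment notes.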

2. Copy the configuration to the other machines:

[root@10-1-1-159 ~]# scp -r /data/* root@10.1.1.77:/data/
[root@10-1-1-159 ~]# scp -r /data/* root@10.1.1.178:/data/

Directory structure:

[root@10-1-1-178 data]# tree mongodb
mongodb
├── conf
│   └── mongodb.conf
├── data
├── logs
└── pid

3. Run on each of the three machines:

groupadd mongod
useradd -g mongod mongod
yum install -y libcurl openssl glibc
cd /data
ln -s mongodb-linux-x86_64-rhel70-4.2.1 mongodb-4.2.1
chown -R mongod.mongod /data
sudo -u mongod /data/mongodb-4.2.1/bin/mongod -f /data/mongodb/conf/mongodb.conf
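Starting mongod by hand works for a walkthrough, but for unattended restarts a systemd unit is more robust. A minimal sketch, assuming the paths and mongod user created above (the unit name mongod.service is our own choice, not part of the original setup):

```ini
# /etc/systemd/system/mongod.service  (hypothetical unit, paths from this guide)
[Unit]
Description=MongoDB replica set member
After=network.target

[Service]
Type=forking
User=mongod
Group=mongod
ExecStart=/data/mongodb-4.2.1/bin/mongod -f /data/mongodb/conf/mongodb.conf
PIDFile=/data/mongodb/pid/mongod.pid
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Type=forking and PIDFile match the fork: true and pidFilePath settings in mongodb.conf; after `systemctl daemon-reload`, the service can be started with `systemctl start mongod`.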

Configure the replica set: # the replica set name rs02 must match replSetName in the configuration file

config = { _id:"rs02", members:[
                     {_id:0,host:"10.1.1.159:27020",priority:90},
                     {_id:1,host:"10.1.1.77:27020",priority:90},
                     {_id:2,host:"10.1.1.178:27020",arbiterOnly:true}
              ]
}

# Initialize

rs.initiate(config); 

4. Run on one of the machines:

[root@10-1-1-159 ~]# /data/mongodb-4.2.1/bin/mongo  10.1.1.159:27020
> use admin
switched to db admin
> config = { _id:"rs02", members:[
...                      {_id:0,host:"10.1.1.159:27020",priority:90},
...                      {_id:1,host:"10.1.1.77:27020",priority:90},
...                      {_id:2,host:"10.1.1.178:27020",arbiterOnly:true}
...               ]
... }
{
	"_id" : "rs02",
	"members" : [
		{
			"_id" : 0,
			"host" : "10.1.1.159:27020",
			"priority" : 90
		},
		{
			"_id" : 1,
			"host" : "10.1.1.77:27020",
			"priority" : 90
		},
		{
			"_id" : 2,
			"host" : "10.1.1.178:27020",
			"arbiterOnly" : true
		}
	]
}
>
> rs.initiate(config);        # initialize the replica set
{
	"ok" : 1,
	"operationTime" : Timestamp(1583907929, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1583907929, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

5. Check node status

rs02:PRIMARY> rs.status()
{
	"set" : "rs02",
	"date" : ISODate("2020-03-13T07:11:09.427Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"syncingTo" : "",
	"syncSourceHost" : "",
	"syncSourceId" : -1,
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(1584083465, 1),
			"t" : NumberLong(1)
		},
		"readConcernMajorityOpTime" : {
			"ts" : Timestamp(1584083465, 1),
			"t" : NumberLong(1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1584083465, 1),
			"t" : NumberLong(1)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1584083465, 1),
			"t" : NumberLong(1)
		}
	},
	"members" : [
		{
			"_id" : 0,
			"name" : "10.1.1.159:27020",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",    #主节点
			"uptime" : 185477,
			"optime" : {
				"ts" : Timestamp(1584083465, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2020-03-13T07:11:05Z"),
			"syncingTo" : "",
			"syncSourceHost" : "",
			"syncSourceId" : -1,
			"infoMessage" : "",
			"electionTime" : Timestamp(1583907939, 1),
			"electionDate" : ISODate("2020-03-11T06:25:39Z"),
			"configVersion" : 1,
			"self" : true,
			"lastHeartbeatMessage" : ""
		},
		{
			"_id" : 1,
			"name" : "10.1.1.77:27020",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",       #从节点
			"uptime" : 175540,
			"optime" : {
				"ts" : Timestamp(1584083465, 1),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1584083465, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2020-03-13T07:11:05Z"),
			"optimeDurableDate" : ISODate("2020-03-13T07:11:05Z"),
			"lastHeartbeat" : ISODate("2020-03-13T07:11:08.712Z"),
			"lastHeartbeatRecv" : ISODate("2020-03-13T07:11:08.711Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "",
			"syncingTo" : "10.1.1.159:27020",
			"syncSourceHost" : "10.1.1.159:27020",
			"syncSourceId" : 0,
			"infoMessage" : "",
			"configVersion" : 1
		},
		{
			"_id" : 2,
			"name" : "10.1.1.178:27020",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",     #仲裁节点
			"uptime" : 175540,
			"lastHeartbeat" : ISODate("2020-03-13T07:11:08.712Z"),
			"lastHeartbeatRecv" : ISODate("2020-03-13T07:11:08.711Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "",
			"syncingTo" : "",
			"syncSourceHost" : "",
			"syncSourceId" : -1,
			"infoMessage" : "",
			"configVersion" : 1
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1584083465, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1584083465, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
rs02:PRIMARY>
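The numeric `state` fields in the output correspond to the `stateStr` values. A small lookup for the three codes seen here (MongoDB defines more, e.g. 0 STARTUP and 8 DOWN):

```shell
# Map the member "state" codes from the rs.status() output above to their names.
declare -A state_names=( [1]=PRIMARY [2]=SECONDARY [7]=ARBITER )
for code in 1 2 7; do
  echo "state=$code -> ${state_names[$code]}"
done
```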

6. Current replica set status:

10.1.1.178:27020  ARBITER    arbiter node
10.1.1.77:27020   SECONDARY  secondary node
10.1.1.159:27020  PRIMARY    primary node

We insert some data, then stop the primary node.

From the arbiter node's log we can see that after node 10.1.1.159 went down, a new election took place: Member 10.1.1.77:27010 is now in state PRIMARY

2020-03-18T14:34:53.636+0800 I NETWORK  [conn9] end connection 10.1.1.159:49160 (1 connection now open)
2020-03-18T14:34:54.465+0800 I CONNPOOL [Replication] dropping unhealthy pooled connection to 10.1.1.159:27010
2020-03-18T14:34:54.465+0800 I CONNPOOL [Replication] after drop, pool was empty, going to spawn some connections
2020-03-18T14:34:54.465+0800 I ASIO     [Replication] Connecting to 10.1.1.159:27010
......
2020-03-18T14:35:02.473+0800 I ASIO     [Replication] Failed to connect to 10.1.1.159:27010 - HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:02.473+0800 I CONNPOOL [Replication] Dropping all pooled connections to 10.1.1.159:27010 due to HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:02.473+0800 I REPL_HB  [replexec-8] Error in heartbeat (requestId: 662) to 10.1.1.159:27010, response status: HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:04.463+0800 I REPL     [replexec-5] Member 10.1.1.77:27010 is now in state PRIMARY
2020-03-18T14:35:04.473+0800 I ASIO     [Replication] Connecting to 10.1.1.159:27010
2020-03-18T14:35:04.473+0800 I ASIO     [Replication] Failed to connect to 10.1.1.159:27010 - HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
2020-03-18T14:35:04.473+0800 I CONNPOOL [Replication] Dropping all pooled connections to 10.1.1.159:27010 due to HostUnreachable: Error connecting to 10.1.1.159:27010 :: caused by :: Connection refused
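The log shows the old primary's connections dropping at 14:34:54 and the secondary stepping up at 14:35:04. That roughly 10-second gap is consistent with the default `settings.electionTimeoutMillis` of 10000 ms (the 2000 ms heartbeat interval also appeared in the `rs.status()` output earlier). A quick check of the timestamps (GNU date assumed):

```shell
# Timestamps taken from the arbiter log above.
lost=$(date -d "14:34:54" +%s)     # connection to the primary dropped
elected=$(date -d "14:35:04" +%s)  # 10.1.1.77 reported as PRIMARY
echo "failover window: $(( elected - lost )) s"
```

Applications should therefore expect writes to fail for on the order of ten seconds during a failover, and retry accordingly.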

The architecture then becomes the figure below:

The replica set is now up, and we have verified that with at least three nodes, the failure of a single node does not affect normal reads and writes. In a production environment we need to enable authentication; in the next chapter we will add users and turn on authentication and authorization.
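As a preview, the commented-out security section in mongodb.conf expects a keyFile shared by all members for internal authentication. A minimal sketch of generating one (the openssl command follows common practice; here it writes to the local directory, while in production it would go to /data/mongodb/conf/keyfile on every node, owned by mongod):

```shell
# Generate a random keyFile for internal replica-set authentication:
# 756 random bytes, base64-encoded (~1 KB on disk), readable only by the owner.
openssl rand -base64 756 > keyfile
chmod 400 keyfile
wc -c < keyfile
```

The same file must be copied to all three members before uncommenting security.keyFile and restarting them.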