1、MongoDB 主从复制

MongoDB复制是将数据同步在多个服务器的过程。

复制提供了数据的冗余备份,并在多个服务器上存储数据副本,提高了数据的可用性, 并可以保证数据的安全性。

复制还允许您从硬件故障和服务中断中恢复数据。

官方文档 https://docs.mongodb.com/manual/replication/

1.1 什么是复制?

保障数据的安全性
数据高可用性 (24*7)
灾难恢复
无需停机维护(如备份,重建索引,压缩)
分布式读取数据

1.2 MongoDB复制原理

mongodb的复制至少需要两个节点。其中一个是主节点,负责处理客户端请求,其余的都是从节点,负责复制主节点上的数据。

mongodb各个节点常见的搭配方式为:一主一从、一主多从。

主节点记录在其上的所有操作oplog,从节点定期轮询主节点获取这些操作,然后对自己的数据副本执行这些操作,从而保证从节点的数据与主节点一致。

MongoDB复制结构图如下所示:

MonggoBD架构主从 mongodb主从同步原理_服务器

以上结构图中,客户端从主节点读取数据,在客户端写入数据到主节点时, 主节点与从节点进行数据交互保障数据的一致性。

1.3 副本集特征

N 个节点的集群
任何节点可作为主节点
所有写入操作都在主节点上
自动故障转移
自动恢复

1.4 MongoDB副本集设置

1、关闭正在运行的MongoDB服务器。

service mongod stop

2.节点建点

首先需要去你选择的mongodb数据文件存放的文件夹新建三个数据库,用来模拟三台不通的机器,博主的路径如下

mkdir -p /data/db/node1
mkdir -p /data/db/node2
mkdir -p /data/db/node3

3.启动三个数据库(dbpath),并且端口(--port 1000x),集群名称(--replSet gabriel),关闭日志选项(--nojournal),守护进程方式启动,会自动拉起(--fork),日志目录(--logpath)

mongod --dbpath /data/db/node1 --port 10001 --replSet gabriel --nojournal --fork --logpath /data/db/node1.log
mongod --dbpath /data/db/node2 --port 10002 --replSet gabriel --nojournal --fork --logpath /data/db/node2.log
mongod --dbpath /data/db/node3 --port 10003 --replSet gabriel --nojournal --fork --logpath /data/db/node3.log

4.顺便连接一个服务器,做初始化操作,这里博主连入10001端口

终端下进入

mongo localhost:10001

进入后输入初始化方法

MongoDB Enterprise gabriel:OTHER> rs.initiate({_id:"gabriel",members:[
{_id:1,host:"localhost:10001"},
{_id:2,host:"localhost:10002"},
{_id:3,host:"localhost:10003"},
]})

收到如下信息就成功了。

{
	"ok" : 1,
	"operationTime" : Timestamp(1517221411, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1517221411, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
MongoDB Enterprise gabriel:OTHER>

此时会发现终端上的输出已经有了变化。

//从单个一个
>
//变成了
gabriel:OTHER>

5.查询状态

MongoDB Enterprise gabriel:OTHER> rs.status()
{
	"set" : "gabriel",
	"date" : ISODate("2018-01-29T10:33:21.227Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(1517221984, 1),
			"t" : NumberLong(1)
		},
		"readConcernMajorityOpTime" : {
			"ts" : Timestamp(1517221984, 1),
			"t" : NumberLong(1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1517221994, 1),
			"t" : NumberLong(1)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1517221994, 1),
			"t" : NumberLong(1)
		}
	},
	"members" : [
		{
			"_id" : 1,
			"name" : "localhost:10001",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 659,
			"optime" : {
				"ts" : Timestamp(1517221994, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2018-01-29T10:33:14Z"),
			"electionTime" : Timestamp(1517221422, 1),
			"electionDate" : ISODate("2018-01-29T10:23:42Z"),
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "localhost:10002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 589,
			"optime" : {
				"ts" : Timestamp(1517221994, 1),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1517221984, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2018-01-29T10:33:14Z"),
			"optimeDurableDate" : ISODate("2018-01-29T10:33:04Z"),
			"lastHeartbeat" : ISODate("2018-01-29T10:33:20.972Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-29T10:33:19.923Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "localhost:10001",
			"configVersion" : 1
		},
		{
			"_id" : 3,
			"name" : "localhost:10003",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 589,
			"optime" : {
				"ts" : Timestamp(1517221994, 1),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1517221984, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2018-01-29T10:33:14Z"),
			"optimeDurableDate" : ISODate("2018-01-29T10:33:04Z"),
			"lastHeartbeat" : ISODate("2018-01-29T10:33:20.972Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-29T10:33:19.921Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "localhost:10001",
			"configVersion" : 1
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1517221994, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1517221994, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

在返回中,参数set后面为集群名称,每个members下面可以看到他们各自的情况,其中stateStr是角色,主节点为(PRIMARY)。

6.进入主节点插入数据,进入从节点查看数据

博主主节点在10001接口

mongo localhost:10001

插入数据

MongoDB Enterprise gabriel:PRIMARY> use test
switched to db test
db.col.insert({title: 'MongoDB 教程', 
    description: 'MongoDB 是一个 Nosql 数据库',
    by: '小罗技术笔记-专注于开发技术的研究与知识分享',
    url: 'http://www.yuwowugua.com',
    tags: ['mongodb', 'database', 'NoSQL'],
    likes: 100
})

MongoDB Enterprise gabriel:PRIMARY> db.col.find()
{ "_id" : ObjectId("5a6ef998525d903d07a00cdf"), "title" : "MongoDB 教程", "description" : "MongoDB 是一个 Nosql 数据库", "by" : "小罗技术笔记-专注于开发技术的研究与知识分享", "url" : "http://www.yuwowugua.com", "tags" : [ "mongodb", "database", "NoSQL" ], "likes" : 100 }
MongoDB Enterprise gabriel:PRIMARY>

博主切换从节点10002

mongo localhost:10002

切换到从节点,你会发现使用show dbs 会报错,是因为还没有开启权限,输入rs.slaveOk();就可以顺利访问了。

MongoDB Enterprise gabriel:SECONDARY> show dbs
2018-01-29T10:40:37.362 0000 E QUERY    [thread1] Error: listDatabases failed:{
	"operationTime" : Timestamp(1517222434, 1),
	"ok" : 0,
	"errmsg" : "not master and slaveOk=false",
	"code" : 13435,
	"codeName" : "NotMasterNoSlaveOk",
	"$clusterTime" : {
		"clusterTime" : Timestamp(1517222434, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs@src/mongo/shell/mongo.js:65:1
shellHelper.show@src/mongo/shell/utils.js:813:19
shellHelper@src/mongo/shell/utils.js:703:15
@(shellhelp2):1:1
MongoDB Enterprise gabriel:SECONDARY>
MongoDB Enterprise gabriel:SECONDARY> rs.slaveOk()

再次查看

MongoDB Enterprise gabriel:SECONDARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
test    0.000GB
MongoDB Enterprise gabriel:SECONDARY>

切到test 库,查看数据已经同步过来了

MongoDB Enterprise gabriel:SECONDARY> use test
switched to db test
MongoDB Enterprise gabriel:SECONDARY> db.col.find()
{ "_id" : ObjectId("5a6ef998525d903d07a00cdf"), "title" : "MongoDB 教程", "description" : "MongoDB 是一个 Nosql 数据库", "by" : "小罗技术笔记-专注于开发技术的研究与知识分享", "url" : "http://www.yowowugua.com", "tags" : [ "mongodb", "database", "NoSQL" ], "likes" : 100 }
MongoDB Enterprise gabriel:SECONDARY>

以上就是简单的主从复制建立过程,现在已经可以在从服务器看到主服务器插入的数据了。

切换从节点10003 一样的问题

删除从节点

rs.remove('ip:port')

关闭主服务器后,再重新启动,会发现原来的从服务器变为了从服务器,新启动的服务器(原来的从服务器)变为了从服务器

2、 MongoDB 自动故障转移

首先通过 rs.status() 查看,可以看到主节点是10001,主节点"name" :

“localhost:10001”, “stateStr” : “PRIMARY” 接下来停止 10001 主节点,测试故障切换

MongoDB Enterprise gabriel:PRIMARY>  rs.status()
{
	"set" : "gabriel",
	"date" : ISODate("2018-01-30T02:39:58.468Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(1517279986, 1),
			"t" : NumberLong(1)
		},
		"readConcernMajorityOpTime" : {
			"ts" : Timestamp(1517279986, 1),
			"t" : NumberLong(1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1517279996, 1),
			"t" : NumberLong(1)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1517279996, 1),
			"t" : NumberLong(1)
		}
	},
	"members" : [
		{
			"_id" : 1,
			"name" : "localhost:10001",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 58656,
			"optime" : {
				"ts" : Timestamp(1517279996, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2018-01-30T02:39:56Z"),
			"electionTime" : Timestamp(1517221422, 1),
			"electionDate" : ISODate("2018-01-29T10:23:42Z"),
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "localhost:10002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 58586,
			"optime" : {
				"ts" : Timestamp(1517279996, 1),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1517279986, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2018-01-30T02:39:56Z"),
			"optimeDurableDate" : ISODate("2018-01-30T02:39:46Z"),
			"lastHeartbeat" : ISODate("2018-01-30T02:39:58.289Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-30T02:39:57.220Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "localhost:10001",
			"configVersion" : 1
		},
		{
			"_id" : 3,
			"name" : "localhost:10003",
			"health" : 0,
			"state" : 8,
			"stateStr" : "(not reachable/healthy)",
			"uptime" : 0,
			"optime" : {
				"ts" : Timestamp(0, 0),
				"t" : NumberLong(-1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(0, 0),
				"t" : NumberLong(-1)
			},
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2018-01-30T02:39:58.304Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-30T02:39:21.208Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "Connection refused",
			"configVersion" : -1
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1517279996, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1517279996, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
MongoDB Enterprise gabriel:PRIMARY>

连接到主节点

mongo localhost:10001

显示所有数据库

MongoDB Enterprise gabriel:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
test    0.000GB

切换到admin

MongoDB Enterprise gabriel:PRIMARY> use admin
switched to db admin

停止数据库,必须进入 admin 库

MongoDB Enterprise gabriel:PRIMARY> db.shutdownServer()

响应

2018-01-30T02:51:34.503 0000 I NETWORK  [thread1] trying reconnect to localhost:10001 (127.0.0.1) failed
2018-01-30T02:51:35.398 0000 I NETWORK  [thread1] Socket recv() Connection reset by peer 127.0.0.1:10001
2018-01-30T02:51:35.398 0000 I NETWORK  [thread1] SocketException: remote: (NONE):0 error: SocketException socket exception [RECV_ERROR] server [127.0.0.1:10001] 
2018-01-30T02:51:35.399 0000 I NETWORK  [thread1] reconnect localhost:10001 (127.0.0.1) failed failed 
2018-01-30T02:51:35.404 0000 I NETWORK  [thread1] trying reconnect to localhost:10001 (127.0.0.1) failed
2018-01-30T02:51:35.404 0000 W NETWORK  [thread1] Failed to connect to 127.0.0.1:10001, in(checking socket for error after poll), reason: Connection refused
2018-01-30T02:51:35.404 0000 I NETWORK  [thread1] reconnect localhost:10001 (127.0.0.1) failed failed 
MongoDB Enterprise >

查看是否真的停止了,发现已经没有10001 节点进程了

root@admin-2:# ps -ef | grep mongo
root      5554     1  0 Jan29 ?        00:03:34 mongod --dbpath /data/db/node2 --port 10002 --replSet gabriel --nojournal --fork --logpath /data/db/node2.log
root     12284     1  0 02:43 ?        00:00:02 mongod --dbpath /data/db/node3 --port 10003 --replSet gabriel --nojournal --fork --logpath /data/db/node3.log
root     12436  5132  0 02:53 pts/1    00:00:00 grep --color=auto mongo
root@admin-2:# netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:10002         0.0.0.0:*               LISTEN      5554/mongod     
tcp        0      0 127.0.0.1:10003         0.0.0.0:*               LISTEN      12284/mongod          
root@admin-2:/data/db#

查看是否故障专业

root@admin-2:# mongo localhost:10001

查看主从状态

MongoDB Enterprise gabriel:SECONDARY> rs.status()
{
	"set" : "gabriel",
	"date" : ISODate("2018-01-30T02:56:48.074Z"),
	"myState" : 2,
	"term" : NumberLong(2),
	"syncingTo" : "localhost:10003",
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(1517280995, 1),
			"t" : NumberLong(2)
		},
		"readConcernMajorityOpTime" : {
			"ts" : Timestamp(1517280995, 1),
			"t" : NumberLong(2)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1517281005, 1),
			"t" : NumberLong(2)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1517280995, 1),
			"t" : NumberLong(2)
		}
	},
	"members" : [
		{
			"_id" : 1,
			"name" : "localhost:10001",
			"health" : 0,
			"state" : 8,
			"stateStr" : "(not reachable/healthy)",
			"uptime" : 0,
			"optime" : {
				"ts" : Timestamp(0, 0),
				"t" : NumberLong(-1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(0, 0),
				"t" : NumberLong(-1)
			},
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2018-01-30T02:56:47.605Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-30T02:51:34.519Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "Connection refused",
			"configVersion" : -1
		},
		{
			"_id" : 2,
			"name" : "localhost:10002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 59660,
			"optime" : {
				"ts" : Timestamp(1517281005, 1),
				"t" : NumberLong(2)
			},
			"optimeDate" : ISODate("2018-01-30T02:56:45Z"),
			"syncingTo" : "localhost:10003",
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 3,
			"name" : "localhost:10003",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 784,
			"optime" : {
				"ts" : Timestamp(1517281005, 1),
				"t" : NumberLong(2)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1517281005, 1),
				"t" : NumberLong(2)
			},
			"optimeDate" : ISODate("2018-01-30T02:56:45Z"),
			"optimeDurableDate" : ISODate("2018-01-30T02:56:45Z"),
			"lastHeartbeat" : ISODate("2018-01-30T02:56:46.486Z"),
			"lastHeartbeatRecv" : ISODate("2018-01-30T02:56:47.147Z"),
			"pingMs" : NumberLong(0),
			"electionTime" : Timestamp(1517280703, 1),
			"electionDate" : ISODate("2018-01-30T02:51:43Z"),
			"configVersion" : 1
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1517281005, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1517281005, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

MongoDB Enterprise gabriel:SECONDARY>

发现 “name” : “localhost:10001”,”stateStr” : “(not reachable/healthy)”, 健康状态已经是“无法访问状态了”

主节点已经切换成 10003 节点了

"_id" : 3,
"name" : "localhost:10003",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",

重启节点10001

mongod --dbpath /data/db/node1 --port 10001 --replSet gabriel --nojournal --fork --logpath /data/db/node1.log