In the previous article 《MongoDB 3.4 High-Availability Cluster Setup (1): Master-Slave Mode》, several problems were left unsolved.

  • When the primary node goes down, can connections switch over automatically? At this point it still requires a manual switch.
  • How do we relieve the read/write pressure on the primary node?
  • Every slave node holds a full copy of the database. Will the slaves be overloaded?
  • When the data outgrows what the machines can handle, can the cluster scale out automatically?

By the end of this article these problems will be taken care of. NoSQL arose precisely to deliver large data volumes, easy scalability, high performance, flexible data models, and high availability. A plain master-slave architecture falls far short of those goals, so MongoDB designed the replica set and sharding features. This article focuses on replica sets.

MongoDB officially no longer recommends master-slave mode; the recommended replacement is the replica set mode (see the official docs), as shown below:

[figure: official notice recommending replica sets over master-slave replication]

So what is a replica set? If you play World of Warcraft you talk about running "instances", and the two concepts are roughly the same idea. In the game, when too many players crowd into one zone at peak time there are far more players than monsters, so to protect the player experience the developers spin up an identical copy of the zone, with the same monsters, for each batch of players. Each copied zone is an instance, and however many players there are, each group plays in its own instance without affecting the others. MongoDB replicas work the same way. Master-slave mode is essentially a single-replica deployment, with poor scalability and fault tolerance. A replica set keeps multiple replicas for fault tolerance: even if one replica dies, plenty of others remain, and it solves the first problem above, because when the primary goes down the cluster switches over automatically. No wonder MongoDB officially recommends this mode. Let's look at the architecture of a MongoDB replica set:

[figure: MongoDB replica set architecture]

As the figure shows, the client connects to the replica set as a whole and does not care whether any particular machine is down. The primary server handles all reads and writes for the set, and the replicas sync the data periodically as backups. Once the primary goes down, the replica nodes elect a new primary, all without the application server having to care. Here is the architecture after the primary fails:

[figure: replica set architecture after the primary goes down]

When the primary dies, the replica nodes detect it through the heartbeat mechanism and kick off an election inside the cluster, automatically choosing a new primary. Sounds impressive, so let's get it deployed!

The officially recommended minimum for a replica set is three members, so we will configure and test with that number.

1. Prepare three machines: 10.202.11.117, 10.202.11.118, 10.202.37.75. 10.202.11.117 will serve as the replica set primary; 10.202.11.118 and 10.202.37.75 will be the replica (secondary) nodes.

2. On each machine, create a test directory for the replica set

Under the mongodb-3.4.2 directory create replset/data

3. Download the MongoDB installation package

See the 3.4.2 installation notes in 《MongoDB Basics (2): Installing and Starting on Windows/Linux》


4. Start mongod on each machine

./mongod --dbpath=/home/appdeploy/dev/mongodb/mongodb-3.4.2/replset/data --port=27017 --fork --logpath=/home/appdeploy/dev/mongodb/mongodb-3.4.2/logs/mongodb.log --httpinterface --rest --replSet repset

The key parameter is: --replSet repset
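As an aside, the same startup options can be kept in a mongod config file instead of a long command line. A minimal sketch, reusing the paths above (the file name and location are just an example, adjust to your own layout):

```yaml
# replset.conf (hypothetical name) -- start with: ./mongod -f replset.conf
storage:
  dbPath: /home/appdeploy/dev/mongodb/mongodb-3.4.2/replset/data
systemLog:
  destination: file
  path: /home/appdeploy/dev/mongodb/mongodb-3.4.2/logs/mongodb.log
  logAppend: true
net:
  port: 27017
processManagement:
  fork: true
replication:
  replSetName: repset        # must match the _id used in rs.initiate()
```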


The console output shows that the replica set has not yet been initialized with a configuration:



2017-03-29T10:57:59.878+0800 I REPL     [initandlisten] Did not find local voted for document at startup.
2017-03-29T10:57:59.878+0800 I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument: Did not find replica set configuration document in local.system.replset


 


5. Initialize the replica set

Log in to mongodb on any one of the three machines





#use the admin database
use admin

#define the replica set config variable; the _id:"repset" here must match the "--replSet repset" command-line parameter above

> config = { _id:"repset", members:[
... {_id:0,host:"10.202.11.117:27017"},
... {_id:1,host:"10.202.11.118:27017"},
... {_id:2,host:"10.202.37.75:27017"}]
... }
{
"_id" : "repset",
"members" : [
{
"_id" : 0,
"host" : "10.202.11.117:27017"
},
{
"_id" : 1,
"host" : "10.202.11.118:27017"
},
{
"_id" : 2,
"host" : "10.202.37.75:27017"
}
]
}


 



#initialize the replica set configuration
> rs.initiate(config);

{ "ok" : 1 }

repset:OTHER>


Note the change above: the shell prompt switches from ">" to "repset:OTHER>".

 

#check the logs; after the replica set starts up, 117 becomes the PRIMARY node while 118 and 75 become SECONDARY nodes





017-03-29T11:25:07.678+0800 I REPL     [rsSync] transition to SECONDARY
2017-03-29T11:25:09.470+0800 I NETWORK [thread1] connection accepted from 10.202.11.118:28836 #6 (5 connections now open)
2017-03-29T11:25:09.470+0800 I - [conn6] end connection 10.202.11.118:28836 (5 connections now open)
2017-03-29T11:25:09.483+0800 I NETWORK [thread1] connection accepted from 10.202.37.75:46334 #7 (5 connections now open)
2017-03-29T11:25:09.483+0800 I - [conn7] end connection 10.202.37.75:46334 (5 connections now open)
2017-03-29T11:25:09.775+0800 I NETWORK [thread1] connection accepted from 10.202.11.118:28839 #8 (5 connections now open)
2017-03-29T11:25:09.775+0800 I NETWORK [conn8] received client metadata from 10.202.11.118:28839 conn8: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "CentOS release 6.6 (Final)", architecture: "x86_64", version: "Kernel 2.6.32-504.el6.x86_64" } }
2017-03-29T11:25:09.777+0800 I NETWORK [thread1] connection accepted from 10.202.11.118:28840 #9 (6 connections now open)
2017-03-29T11:25:09.778+0800 I NETWORK [conn9] received client metadata from 10.202.11.118:28840 conn9: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "CentOS release 6.6 (Final)", architecture: "x86_64", version: "Kernel 2.6.32-504.el6.x86_64" } }
2017-03-29T11:25:10.153+0800 I NETWORK [thread1] connection accepted from 10.202.37.75:46337 #10 (7 connections now open)
2017-03-29T11:25:10.153+0800 I NETWORK [conn10] received client metadata from 10.202.37.75:46337 conn10: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "CentOS release 6.6 (Final)", architecture: "x86_64", version: "Kernel 2.6.32-504.el6.x86_64" } }
2017-03-29T11:25:10.156+0800 I NETWORK [thread1] connection accepted from 10.202.37.75:46338 #11 (8 connections now open)
2017-03-29T11:25:10.157+0800 I NETWORK [conn11] received client metadata from 10.202.37.75:46338 conn11: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "CentOS release 6.6 (Final)", architecture: "x86_64", version: "Kernel 2.6.32-504.el6.x86_64" } }
2017-03-29T11:25:12.676+0800 I REPL [ReplicationExecutor] Member 10.202.11.118:27017 is now in state SECONDARY
2017-03-29T11:25:12.677+0800 I REPL [ReplicationExecutor] Member 10.202.37.75:27017 is now in state SECONDARY
2017-03-29T11:25:14.778+0800 I - [conn9] end connection 10.202.11.118:28840 (8 connections now open)
2017-03-29T11:25:15.159+0800 I - [conn11] end connection 10.202.37.75:46338 (7 connections now open)
2017-03-29T11:25:17.996+0800 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2017-03-29T11:25:17.996+0800 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
2017-03-29T11:25:17.996+0800 I REPL [ReplicationExecutor] VoteRequester(term 0 dry run) received a yes vote from 10.202.11.118:27017; response message: { term: 0, voteGranted: true, reason: "", ok: 1.0 }
2017-03-29T11:25:17.997+0800 I REPL [ReplicationExecutor] dry election run succeeded, running for election
2017-03-29T11:25:18.039+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 10.202.37.75:27017
2017-03-29T11:25:18.044+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 10.202.37.75:27017
2017-03-29T11:25:18.100+0800 I REPL [ReplicationExecutor] VoteRequester(term 1) received a yes vote from 10.202.11.118:27017; response message: { term: 1, voteGranted: true, reason: "", ok: 1.0 }


#check the status of the replica set members

repset:SECONDARY> rs.status()
{
"set" : "repset",
"date" : ISODate("2017-03-29T03:33:14.286Z"),
"myState" : 1,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
}
},
"members" : [
{
"_id" : 0,
"name" : "10.202.11.117:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 644,
"optime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2017-03-29T03:33:08Z"),
"electionTime" : Timestamp(1490757918, 1),
"electionDate" : ISODate("2017-03-29T03:25:18Z"),
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "10.202.11.118:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 486,
"optime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2017-03-29T03:33:08Z"),
"optimeDurableDate" : ISODate("2017-03-29T03:33:08Z"),
"lastHeartbeat" : ISODate("2017-03-29T03:33:14.282Z"),
"lastHeartbeatRecv" : ISODate("2017-03-29T03:33:13.087Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "10.202.11.117:27017",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "10.202.37.75:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 486,
"optime" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1490758388, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2017-03-29T03:33:08Z"),
"optimeDurableDate" : ISODate("2017-03-29T03:33:08Z"),
"lastHeartbeat" : ISODate("2017-03-29T03:33:12.377Z"),
"lastHeartbeatRecv" : ISODate("2017-03-29T03:33:13.665Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "10.202.11.118:27017",
"configVersion" : 1
}
],
"ok" : 1
}
repset:PRIMARY>


 


The whole replica set has now been set up successfully.

6. Test the replica set's data replication



#connect to a shell on the primary node 10.202.11.117:
#create the tong database.
#insert a document into the testdb collection.
repset:PRIMARY> use tong
switched to db tong
repset:PRIMARY> show collections;
repset:PRIMARY> db.testdb.insert({"name":"shenzhen","addr":"nanshan"})
WriteResult({ "nInserted" : 1 })
repset:PRIMARY>

#connect to mongodb on the replica nodes 10.202.11.118 and 10.202.37.75 and check whether the data was replicated.
./mongo
#switch to the tong database.
repset:SECONDARY> use tong
switched to db tong
repset:SECONDARY> db.testdb.find()
Error: error: {
"ok" : 0,
"errmsg" : "not master and slaveOk=false",
"code" : 13435,
"codeName" : "NotMasterNoSlaveOk"
}
repset:SECONDARY> rs.slaveOk();
repset:SECONDARY> db.testdb.find()
{ "_id" : ObjectId("58db2b6572cf3b348b3cc0f5"), "name" : "shenzhen", "addr" : "nanshan" }
repset:SECONDARY> show tables;
testdb
repset:SECONDARY>


 


7. Test the replica set's failover

First stop mongod on the primary, 117. The logs on 118 and 75 show that after a round of voting, 75 is elected the new primary and 118 starts syncing data from 75.





2017-03-29T11:35:15.315+0800 I NETWORK  [conn8] received client metadata from 127.0.0.1:47480 conn8: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.2" }, os: { type: "Linux", name: "CentOS release 6.6 (Final)", architecture: "x86_64", version: "Kernel 2.6.32-504.el6.x86_64" } }
2017-03-29T11:38:58.446+0800 I - [conn7] end connection 10.202.11.117:16217 (3 connections now open)
2017-03-29T11:38:58.699+0800 I REPL [replication-1] Choosing new sync source because our current sync source, 10.202.11.118:27017, has an OpTime ({ ts: Timestamp 1490758728000|1, t: 1 }) which is not ahead of ours ({ ts: Timestamp 1490758728000|1, t: 1 }), it does not have a sync source, and it's not the primary (sync source does not know the primary)
2017-03-29T11:38:58.699+0800 I REPL [replication-1] Canceling oplog query because we have to choose a new sync source. Current source: 10.202.11.118:27017, OpTime { ts: Timestamp 0|0, t: -1 }, its sync source index:-1
2017-03-29T11:38:58.699+0800 W REPL [rsBackgroundSync] Fetcher stopped querying remote oplog with error: InvalidSyncSource: sync source 10.202.11.118:27017 (last visible optime: { ts: Timestamp 0|0, t: -1 }; config version: 1; sync source index: -1; primary index: -1) is no longer valid
2017-03-29T11:38:58.700+0800 I REPL [rsBackgroundSync] could not find member to sync from
2017-03-29T11:38:58.700+0800 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 10.202.11.117:27017
2017-03-29T11:38:58.700+0800 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2017-03-29T11:38:58.700+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 10.202.11.117:27017
2017-03-29T11:38:58.702+0800 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to 10.202.11.117:27017 - HostUnreachable: Connection refused
2017-03-29T11:38:58.702+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 10.202.11.117:27017; HostUnreachable: Connection refused


Checking the status of the whole cluster, 117 now shows as unreachable.





#check the status on 118
repset:SECONDARY> rs.status()
{
"set" : "repset",
"date" : ISODate("2017-03-29T03:43:17.178Z"),
"myState" : 2,
"term" : NumberLong(2),
"syncingTo" : "10.202.37.75:27017",
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
},
"appliedOpTime" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
},
"durableOpTime" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
}
},
"members" : [
{
"_id" : 0,
"name" : "10.202.11.117:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2017-03-29T03:43:16.754Z"),
"lastHeartbeatRecv" : ISODate("2017-03-29T03:38:58.418Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "Connection refused",
"configVersion" : -1
},
{
"_id" : 1,
"name" : "10.202.11.118:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 1184,
"optime" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2017-03-29T03:43:08Z"),
"syncingTo" : "10.202.37.75:27017",
"configVersion" : 1,
"self" : true
},
{
"_id" : 2,
"name" : "10.202.37.75:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 1087,
"optime" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
},
"optimeDurable" : {
"ts" : Timestamp(1490758988, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2017-03-29T03:43:08Z"),
"optimeDurableDate" : ISODate("2017-03-29T03:43:08Z"),
"lastHeartbeat" : ISODate("2017-03-29T03:43:16.624Z"),
"lastHeartbeatRecv" : ISODate("2017-03-29T03:43:16.271Z"),
"pingMs" : NumberLong(0),
"electionTime" : Timestamp(1490758748, 1),
"electionDate" : ISODate("2017-03-29T03:39:08Z"),
"configVersion" : 1
}
],
"ok" : 1
}
repset:SECONDARY>


Insert a record on the new PRIMARY node and see whether the SECONDARY nodes sync it over.



#log in with the mongo shell on 37.75
./mongo
repset:PRIMARY> use tong
switched to db tong
repset:PRIMARY> db.testdb.find()
{ "_id" : ObjectId("58db2b6572cf3b348b3cc0f5"), "name" : "shenzhen", "addr" : "nanshan" }
repset:PRIMARY> db.testdb.insert({"name":"37.75 primary","addr":"75"})
WriteResult({ "nInserted" : 1 })
repset:PRIMARY> db.testdb.find()
{ "_id" : ObjectId("58db2b6572cf3b348b3cc0f5"), "name" : "shenzhen", "addr" : "nanshan" }
{ "_id" : ObjectId("58db312b200c0b77a06fc328"), "name" : "37.75 primary", "addr" : "75" }
repset:PRIMARY>


Check the sync result on 10.202.11.118:



repset:SECONDARY> rs.slaveOk()
repset:SECONDARY> use tong
switched to db tong
repset:SECONDARY> db.testdb.find()
{ "_id" : ObjectId("58db2b6572cf3b348b3cc0f5"), "name" : "shenzhen", "addr" : "nanshan" }
{ "_id" : ObjectId("58db312b200c0b77a06fc328"), "name" : "37.75 primary", "addr" : "75" }
repset:SECONDARY>


Now restart the original primary, 117. It comes back as a SECONDARY, and 37.75 remains the PRIMARY.

8. Test connecting to the replica set from a Java program. Even with one of the three nodes down, the application client can still read and write against the replica set!

The jars to include are:

bson-3.4.2.jar,mongodb-driver-3.4.2.jar,mongodb-driver-core-3.4.2.jar
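If the project uses Maven, declaring the driver dependency pulls bson and mongodb-driver-core in transitively; a sketch of the coordinates for this version:

```xml
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver</artifactId>
    <version>3.4.2</version>
</dependency>
```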

[figure: project with the MongoDB driver jars on the classpath]

 





import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.ServerAddress;

public class TestMongoDBReplSet {

public static void main(String[] args) {

try {
List<ServerAddress> addresses = new ArrayList<ServerAddress>();
ServerAddress address1 = new ServerAddress("10.202.11.117", 27017);
ServerAddress address2 = new ServerAddress("10.202.37.75", 27017);
ServerAddress address3 = new ServerAddress("10.202.11.118", 27017);
addresses.add(address1);
addresses.add(address2);
addresses.add(address3);

MongoClient client = new MongoClient(addresses);
DB db = client.getDB("tong");
DBCollection coll = db.getCollection("testdb");

// insert a document
BasicDBObject object = new BasicDBObject();
object.append("test", "value");
coll.insert(object);

DBCursor dbCursor = coll.find();

while (dbCursor.hasNext()) {
DBObject dbObject = dbCursor.next();
System.out.println(dbObject.toString());
}

} catch (Exception e) {
e.printStackTrace();
}

}

}


Result:



三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Cluster created with settings {hosts=[10.202.11.117:27017, 10.202.37.75:27017, 10.202.11.118:27017], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Adding discovered server 10.202.11.117:27017 to client view of cluster
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Adding discovered server 10.202.37.75:27017 to client view of cluster
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Adding discovered server 10.202.11.118:27017 to client view of cluster
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: No server chosen by WritableServerSelector from cluster description ClusterDescription{type=UNKNOWN, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=10.202.11.118:27017, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=10.202.11.117:27017, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=10.202.37.75:27017, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Opened connection [connectionId{localValue:3, serverValue:17}] to 10.202.37.75:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Opened connection [connectionId{localValue:2, serverValue:8}] to 10.202.11.117:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Opened connection [connectionId{localValue:1, serverValue:15}] to 10.202.11.118:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Monitor thread successfully connected to server with description ServerDescription{address=10.202.11.117:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 4, 2]}, minWireVersion=0, maxWireVersion=5, maxDocumentSize=16777216, roundTripTimeNanos=2525464, setName='repset', canonicalAddress=10.202.11.117:27017, hosts=[10.202.37.75:27017, 10.202.11.117:27017, 10.202.11.118:27017], passives=[], arbiters=[], primary='10.202.37.75:27017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Wed Mar 29 14:13:09 CST 2017, lastUpdateTimeNanos=2775072789424179}
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Monitor thread successfully connected to server with description ServerDescription{address=10.202.37.75:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 4, 2]}, minWireVersion=0, maxWireVersion=5, maxDocumentSize=16777216, roundTripTimeNanos=2530154, setName='repset', canonicalAddress=10.202.37.75:27017, hosts=[10.202.37.75:27017, 10.202.11.117:27017, 10.202.11.118:27017], passives=[], arbiters=[], primary='10.202.37.75:27017', tagSet=TagSet{[]}, electionId=7fffffff0000000000000002, setVersion=1, lastWriteDate=Wed Mar 29 14:13:09 CST 2017, lastUpdateTimeNanos=2775072789271174}
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Discovered cluster type of REPLICA_SET
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Monitor thread successfully connected to server with description ServerDescription{address=10.202.11.118:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 4, 2]}, minWireVersion=0, maxWireVersion=5, maxDocumentSize=16777216, roundTripTimeNanos=3132501, setName='repset', canonicalAddress=10.202.11.118:27017, hosts=[10.202.37.75:27017, 10.202.11.117:27017, 10.202.11.118:27017], passives=[], arbiters=[], primary='10.202.37.75:27017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Wed Mar 29 14:13:09 CST 2017, lastUpdateTimeNanos=2775072789873522}
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Setting max election id to 7fffffff0000000000000002 from replica set primary 10.202.37.75:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Setting max set version to 1 from replica set primary 10.202.37.75:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Discovered replica set primary 10.202.37.75:27017
三月 29, 2017 2:13:09 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Opened connection [connectionId{localValue:4, serverValue:18}] to 10.202.37.75:27017
{ "_id" : { "$oid" : "58db2b6572cf3b348b3cc0f5"} , "name" : "shenzhen" , "addr" : "nanshan"}
{ "_id" : { "$oid" : "58db312b200c0b77a06fc328"} , "name" : "37.75 primary" , "addr" : "75"}
{ "_id" : { "$oid" : "58db5075524e6107003d9978"} , "test" : "value"}


Check the result with the mongo shell:



repset:SECONDARY> db.testdb.find()
{ "_id" : ObjectId("58db2b6572cf3b348b3cc0f5"), "name" : "shenzhen", "addr" : "nanshan" }
{ "_id" : ObjectId("58db312b200c0b77a06fc328"), "name" : "37.75 primary", "addr" : "75" }
{ "_id" : ObjectId("58db5075524e6107003d9978"), "test" : "value" }
repset:SECONDARY>


At this point failover looks fully supported. Is this architecture close to perfect, then? There is actually still plenty to optimize, for instance the second question from the beginning: how do we relieve the read/write pressure on the primary node? The common answer is read/write splitting. So how is read/write splitting done with a MongoDB replica set?

A picture explains it best:

[figure: read/write splitting across the replica set]

Writes are normally far less frequent than reads, so one primary node handles the writes while the two replica nodes handle the reads.

1. To enable read/write splitting, first run setSlaveOk on the SECONDARY replica nodes.

2. In the program, direct read operations to the replica nodes, as in the following code:





import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.ReadPreference;
import com.mongodb.ServerAddress;

public class TestMongoDBReplSetReadSplit {

public static void main(String[] args) {

try {
List<ServerAddress> addresses = new ArrayList<ServerAddress>();
ServerAddress address1 = new ServerAddress("10.202.11.117", 27017);
ServerAddress address2 = new ServerAddress("10.202.37.75", 27017);
ServerAddress address3 = new ServerAddress("10.202.11.118", 27017);
addresses.add(address1);
addresses.add(address2);
addresses.add(address3);

MongoClient client = new MongoClient(addresses);
DB db = client.getDB("tong");
DBCollection coll = db.getCollection("testdb");

BasicDBObject object = new BasicDBObject();
object.append("test", "value");

// read from a secondary node
ReadPreference preference = ReadPreference.secondary();
DBObject dbObject = coll.findOne(object, null, preference);

System.out.println(dbObject);

} catch (Exception e) {
e.printStackTrace();
}
}
}


Result:



信息: Discovered replica set primary 10.202.37.75:27017
三月 29, 2017 2:18:34 下午 com.mongodb.diagnostics.logging.JULLogger log
信息: Opened connection [connectionId{localValue:4, serverValue:17}] to 10.202.11.118:27017
{ "_id" : { "$oid" : "58db5075524e6107003d9978"} , "test" : "value"}


Including secondary, there are five read preference modes in total: primary, primaryPreferred, secondary, secondaryPreferred, nearest.

[figure: read preference modes]

primary: the default; all reads go to the primary node.

primaryPreferred: reads mostly from the primary, falling back to a secondary only when the primary is unavailable.

secondary: reads only from secondary nodes; the catch is that a secondary's data may be staler than the primary's.

secondaryPreferred: prefers reading from secondaries, and reads from the primary when no secondary is available.

nearest: reads from whichever node, primary or secondary, has the lowest network latency.
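These modes can also be tried interactively from the mongo shell, where setReadPref is the per-connection helper (just a quick sketch for experimenting, not something an application would rely on):

```
repset:SECONDARY> db.getMongo().setReadPref("secondaryPreferred")
repset:SECONDARY> db.testdb.find()
```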

Good. With read/write splitting in place we can spread the traffic and relieve the load, which settles "how do we relieve the read/write pressure on the primary node?". But as the number of replica nodes grows, the replication load on the primary grows too. Is there a way around that? MongoDB has long had a solution.

See the figure:

[figure: replica set with an arbiter node]

The arbiter node in the figure stores no data; it only takes part in the votes for failover, which removes the data-replication load from it. Quite thoughtful, right? The MongoDB developers clearly know their large-scale architectures. And it is not just primary, replica, and arbiter nodes: there are also Secondary-Only, Hidden, Delayed, and Non-Voting members.

Secondary-Only: can never become the primary; it only ever serves as a secondary replica, which keeps underpowered machines from being elected primary.

Hidden: invisible to clients connecting by IP and never eligible to become primary, but it can still vote; typically used for backups.

Delayed: syncs from the primary with a configurable time delay. Mainly used for backups: with real-time sync, an accidental delete propagates to the replicas immediately and can no longer be recovered.

Non-Voting: a secondary with no vote in elections; purely a backup data node.
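In the replica set config these roles map onto member options. A sketch in the mongo shell (the hosts below are made up for illustration):

```
# priority: 0                -> can never become primary (Secondary-Only)
# hidden: true + priority: 0 -> hidden from clients but still votes (Hidden)
# slaveDelay: 3600           -> syncs one hour behind the primary (Delayed; requires priority: 0)
# votes: 0                   -> no say in elections (Non-Voting)
rs.add({ _id: 3, host: "10.202.11.119:27017", priority: 0, hidden: true })
# an arbiter is added with its own helper:
rs.addArb("10.202.11.120:27017")
```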

At this point the MongoDB replica set has settled two of our problems:

  • When the primary node goes down, can connections switch over automatically? (It used to require a manual switch.)
  • How do we relieve the read/write pressure on the primary node?

These two remain for a later article:

  • Every replica node holds a full copy of the database. Will the replicas be overloaded?
  • When the data outgrows what the machines can handle, can the cluster scale out automatically?

And building the replica set surfaced some new questions:

  • During failover, how exactly is the primary elected? Can we manually step a primary down?
  • The official docs say a replica set should preferably have an odd number of members. Why?
  • How does a MongoDB replica set synchronize? What happens when sync lags? Can inconsistency arise?
  • Can failover trigger for no apparent reason? What conditions trigger it? Frequent failovers could add significant load to the system.
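On the odd-number question, a quick back-of-the-envelope sketch helps: an election needs a strict majority of the voting members, so adding a fourth member to a three-member set buys no extra failure tolerance, only extra replication load. A tiny standalone illustration of the arithmetic (this is just the majority rule, not MongoDB code):

```java
public class QuorumDemo {

    // strict majority of n voting members
    static int majority(int n) {
        return n / 2 + 1;
    }

    // how many members can fail while a majority can still be formed
    static int tolerated(int n) {
        return n - majority(n);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.println(n + " members: majority=" + majority(n)
                    + ", failures tolerated=" + tolerated(n));
        }
        // 3 and 4 members both tolerate only 1 failure,
        // so the 4th member adds load without adding availability
    }
}
```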