1. Overview

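  There are two ways to run MongoDB in a primary/secondary arrangement. The first is the legacy master-slave mode, where the master must be designated explicitly in the configuration (if the master goes down at runtime, a slave is not promoted automatically). This mode also cannot replicate from one slave to another, because slaves do not keep an oplog. The second is the MongoDB replica set, whose main advantage is that no server is fixed as the primary: if the current primary goes down, the remaining members automatically elect one of the secondaries as the new primary.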
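  A replica set can only elect a primary while a majority of its voting members can reach each other. The sketch below shows that arithmetic for a small set. It is an illustration only: the function names are made up for this example, and it assumes every member has exactly one vote.

```javascript
// Sketch only: majority arithmetic for replica set elections.
// Assumption: every member has exactly one vote.
// majority() and canElectPrimary() are hypothetical helper names.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

console.log(majority(3));           // 2
console.log(canElectPrimary(3, 2)); // true: one of three members is down
console.log(canElectPrimary(3, 1)); // false: two of three members are down
```

  With three members, the set tolerates the loss of one node, which is the scenario tested later in this article.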

 

2. Basics

  The rest of this article walks through MongoDB replica sets in detail. The environment is as follows:

  

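[Figure: environment table (image not preserved). As used later in this article, the environment is three servers, 172.20.10.79, 172.20.10.80 and 172.20.10.81, each running mongod on port 27017.]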

 

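   The architecture is shown below. The primary handles the requests coming from applications or service interfaces.

[Figure: replica set architecture diagram (image not preserved)]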

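3. Installing MongoDB

  MongoDB can be installed in several ways, for example from a tarball, with yum, or from rpm packages. Here we download the official precompiled tarball directly. Address: http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.4.9.tgz?_ga=2.225511514.1125230407.1527905895-199343347.1523847839  Installing from the tarball is very simple: download it, unpack it, and run it.

  All settings are written into a single configuration file. The steps are as follows: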

# Basic setup
# Quote the URL and set the output name so the query string does not end up in the file name
wget -O mongodb-linux-x86_64-rhel62-3.4.9.tgz "http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.4.9.tgz?_ga=2.225511514.1125230407.1527905895-199343347.1523847839"
mkdir -p /app/mongodb/data
mkdir -p /app/mongodb/logs
tar xf mongodb-linux-x86_64-rhel62-3.4.9.tgz
mv mongodb-linux-x86_64-rhel62-3.4.9 /usr/local/mongodb

 

  That completes the basic installation. Next comes the configuration file.

vim /usr/local/mongodb/bin/config.conf 
dbpath=/app/mongodb/data
logpath=/app/mongodb/logs/mongodb.log
port=27017
fork=true
nohttpinterface=true
replSet=raytest # The options above are standard and not covered here; this one enables replica set mode and sets the set name

 

  Start MongoDB

/usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf

 

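4. Configuring the MongoDB Replica Set

  After the steps above, MongoDB should be running. (Note: perform the steps on all three servers, 172.20.10.79, 172.20.10.80 and 172.20.10.81, and copy the configuration file to each of them.) Next, configure the replica set as follows:

  At this point, logging in to any of the servers and running show dbs fails. The error message states the cause: the node is not a master (primary). This is expected, because the replica set has not been initiated yet and no primary exists: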

/usr/local/mongodb/bin/mongo
    > show dbs
    2018-06-02T18:35:20.269+0800 E QUERY    [thread1] Error: listDatabases failed:{
        "ok" : 0,
        "errmsg" : "not master and slaveOk=false",
        "code" : 13435,
        "codeName" : "NotMasterNoSlaveOk"
    } :
    _getErrorWithCode@src/mongo/shell/utils.js:25:13
    Mongo.prototype.getDBs@src/mongo/shell/mongo.js:62:1
    shellHelper.show@src/mongo/shell/utils.js:769:19
    shellHelper@src/mongo/shell/utils.js:659:15
    @(shellhelp2):1:1
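
  When scripting against a replica set, it can help to tell this error apart from other failures. The sketch below does that using only the two error codes that appear in this article's shell output (13435 here, and 10107 in the write test later on). classifyReplError is a hypothetical helper for illustration, not part of MongoDB, and the code runs under Node.js.

```javascript
// Error codes copied from the shell output shown in this article.
var NOT_MASTER_NO_SLAVE_OK = 13435; // "not master and slaveOk=false"
var NOT_MASTER = 10107;             // "not master"

// Hypothetical helper: maps an error document to a short label.
function classifyReplError(err) {
  if (err.code === NOT_MASTER_NO_SLAVE_OK) return "read-on-non-primary";
  if (err.code === NOT_MASTER) return "write-on-non-primary";
  return "other";
}

// Error document in the shape printed by the shell above.
var listDatabasesError = {
  ok: 0,
  errmsg: "not master and slaveOk=false",
  code: 13435,
  codeName: "NotMasterNoSlaveOk"
};

console.log(classifyReplError(listDatabasesError)); // read-on-non-primary
```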

 

  Next, initiate the replica set. Run the commands below on only one of the MongoDB servers; here they are run on 172.20.10.79:

// Replica set configuration
config = { _id:"raytest", members:[
{_id:0,host:"172.20.10.79:27017"},
{_id:1,host:"172.20.10.80:27017"},
{_id:2,host:"172.20.10.81:27017"}]
}
// Initiate the replica set with the configuration in config
rs.initiate(config)

  The output is:

> config = { _id:"raytest", members:[
... {_id:0,host:"172.20.10.79:27017"},
... {_id:1,host:"172.20.10.80:27017"},
... {_id:2,host:"172.20.10.81:27017"}]
... }
{
    "_id" : "raytest",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.20.10.79:27017"
        },
        {
            "_id" : 1,
            "host" : "172.20.10.80:27017"
        },
        {
            "_id" : 2,
            "host" : "172.20.10.81:27017"
        }
    ]
}
> rs.initiate(config)
{ "ok" : 1 }
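
  For more than a handful of members, the configuration document can be generated instead of typed by hand. The sketch below builds the same document as above from a list of hosts. buildReplSetConfig is a hypothetical helper name, and the code assumes all members listen on the same port. It runs under Node.js; in the mongo shell, replace console.log with printjson.

```javascript
// Sketch only: build the document passed to rs.initiate().
// Assumption: every member uses the same port.
// buildReplSetConfig is a hypothetical helper, not a MongoDB function.
function buildReplSetConfig(setName, hosts, port) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      return { _id: index, host: host + ":" + port };
    })
  };
}

var config = buildReplSetConfig(
  "raytest",
  ["172.20.10.79", "172.20.10.80", "172.20.10.81"],
  27017
);

console.log(JSON.stringify(config, null, 2));
// In the mongo shell, the next step would be: rs.initiate(config)
```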

 

  Now check the replica set status. rs.status() can be run on any member of the set and reports the same member states.

raytest:OTHER> rs.status()
{
    "set" : "raytest",
    "date" : ISODate("2018-06-02T10:45:22.416Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1527936315, 1),
            "t" : NumberLong(1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1527936315, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1527936315, 1),
            "t" : NumberLong(1)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.20.10.79:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 652,
            "optime" : {
                "ts" : Timestamp(1527936315, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T10:45:15Z"),
            "electionTime" : Timestamp(1527936124, 1),
            "electionDate" : ISODate("2018-06-02T10:42:04Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "172.20.10.80:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 209,
            "optime" : {
                "ts" : Timestamp(1527936315, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527936315, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T10:45:15Z"),
            "optimeDurableDate" : ISODate("2018-06-02T10:45:15Z"),
            "lastHeartbeat" : ISODate("2018-06-02T10:45:20.511Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T10:45:21.300Z"),
            "pingMs" : NumberLong(2),
            "syncingTo" : "172.20.10.79:27017",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "172.20.10.81:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 209,
            "optime" : {
                "ts" : Timestamp(1527936315, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527936315, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T10:45:15Z"),
            "optimeDurableDate" : ISODate("2018-06-02T10:45:15Z"),
            "lastHeartbeat" : ISODate("2018-06-02T10:45:20.511Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T10:45:21.285Z"),
            "pingMs" : NumberLong(2),
            "syncingTo" : "172.20.10.79:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1
}
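
  The full rs.status() document is long, and usually only the state of each member matters. The sketch below reduces a status document to one line per member. summarizeMembers is a hypothetical helper, and sampleStatus is a trimmed copy of the output above that keeps only the fields the helper reads. The code runs under Node.js; in the mongo shell, one could call summarizeMembers(rs.status()) and use print instead of console.log.

```javascript
// Sketch only: one line per member from an rs.status()-shaped document.
// summarizeMembers is a hypothetical helper, not a MongoDB function.
function summarizeMembers(status) {
  return status.members.map(function (m) {
    return m.name + " " + m.stateStr + (m.self ? " (self)" : "");
  });
}

// Trimmed copy of the rs.status() output shown above.
var sampleStatus = {
  set: "raytest",
  members: [
    { _id: 0, name: "172.20.10.79:27017", health: 1, stateStr: "PRIMARY", self: true },
    { _id: 1, name: "172.20.10.80:27017", health: 1, stateStr: "SECONDARY" },
    { _id: 2, name: "172.20.10.81:27017", health: 1, stateStr: "SECONDARY" }
  ]
};

summarizeMembers(sampleStatus).forEach(function (line) {
  console.log(line);
});
// 172.20.10.79:27017 PRIMARY (self)
// 172.20.10.80:27017 SECONDARY
// 172.20.10.81:27017 SECONDARY
```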

 

5. Testing

  Insert one document on the primary and one on a secondary, and see whether each insert succeeds.

  Insert on the primary:

raytest:PRIMARY> use ray
switched to db ray
raytest:PRIMARY> db.raytables.insert({"user":"ray"})
WriteResult({ "nInserted" : 1 })
raytest:PRIMARY> show tables
raytables
raytest:PRIMARY> db.raytables.find()
{ "_id" : ObjectId("5b127649e8be407ef3f1d487"), "user" : "ray" }
raytest:PRIMARY>

  The insert on the primary succeeds. Now try an insert on a secondary:

raytest:SECONDARY> use ray
switched to db ray
raytest:SECONDARY> db.raytables.insert({"name":"jack"})
WriteResult({ "writeError" : { "code" : 10107, "errmsg" : "not master" } })
raytest:SECONDARY>

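  The insert fails, and the error is clear: "not master". A secondary does not accept writes. Now check whether the document inserted on the primary has been replicated to the secondary: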

raytest:SECONDARY> use ray
switched to db ray
raytest:SECONDARY> show tables
raytables
raytest:SECONDARY> db.raytables.find()
{ "_id" : ObjectId("5b127649e8be407ef3f1d487"), "user" : "ray" }
raytest:SECONDARY>
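
  Besides reading the data back, replication progress can be estimated from rs.status() by comparing each secondary's optimeDate with the primary's. The sketch below does that. replicationLagSeconds is a hypothetical helper, and the sample values are illustrative, in the shape of rs.status() members (ISODate values in the mongo shell are JavaScript Date objects). The code runs under Node.js.

```javascript
// Sketch only: seconds each secondary is behind the primary's last applied op.
// replicationLagSeconds is a hypothetical helper, not a MongoDB function.
function replicationLagSeconds(status) {
  var primary = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) return null; // no primary, lag is undefined

  var lag = {};
  status.members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  });
  return lag;
}

// Illustrative values in the shape of rs.status() members.
var lagSample = {
  members: [
    { name: "172.20.10.79:27017", stateStr: "PRIMARY", optimeDate: new Date("2018-06-02T10:58:05Z") },
    { name: "172.20.10.80:27017", stateStr: "SECONDARY", optimeDate: new Date("2018-06-02T10:57:55Z") }
  ]
};

console.log(replicationLagSeconds(lagSample)["172.20.10.80:27017"]); // 10
```

  A lag of a few seconds on an idle set is normal. A value that keeps growing suggests the secondary is not keeping up.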

 

6. Simulating Failures

  First simulate a secondary failure by killing the MongoDB process on 172.20.10.81:

#kill MongoDB
ps axu | grep mongo
root     29240  0.3  1.1 1548624 43920 ?       Sl   18:34   0:04 /usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf
root     29320  0.0  0.0 103252   828 pts/0    R+   18:57   0:00 grep mongo
kill -9 29240
ps axu | grep mongo
root     29322  0.0  0.0 103252   824 pts/0    R+   18:57   0:00 grep mongo

  Check the replica set status:

raytest:PRIMARY> rs.status()
{
    "set" : "raytest",
    "date" : ISODate("2018-06-02T10:58:07.311Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1527937085, 1),
            "t" : NumberLong(1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1527937085, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1527937085, 1),
            "t" : NumberLong(1)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.20.10.79:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1417,
            "optime" : {
                "ts" : Timestamp(1527937085, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T10:58:05Z"),
            "electionTime" : Timestamp(1527936124, 1),
            "electionDate" : ISODate("2018-06-02T10:42:04Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "172.20.10.80:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 974,
            "optime" : {
                "ts" : Timestamp(1527937075, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937075, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T10:57:55Z"),
            "optimeDurableDate" : ISODate("2018-06-02T10:57:55Z"),
            "lastHeartbeat" : ISODate("2018-06-02T10:58:05.943Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T10:58:06.504Z"),
            "pingMs" : NumberLong(2),
            "syncingTo" : "172.20.10.79:27017",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "172.20.10.81:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2018-06-02T10:58:05.943Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T10:57:56.456Z"),
            "pingMs" : NumberLong(1),
            "lastHeartbeatMessage" : "Connection refused",
            "configVersion" : -1
        }
    ],
    "ok" : 1
}

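  The state of 172.20.10.81 has changed to "not reachable/healthy". Now look at the logs of the other MongoDB nodes. (Note: these messages are logged continuously, so keep an eye on disk space.)

# The following lines are logged repeatedly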
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to 172.20.10.81:27017
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.20.10.81:27017 - HostUnreachable: Connection refused
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to 172.20.10.81:27017 due to failed operation on a connection
2018-06-02T18:58:15.955+0800 I REPL     [ReplicationExecutor] Error in heartbeat request to 172.20.10.81:27017; HostUnreachable: Connection refused
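
  Instead of reading the whole status document or tailing the log, a monitoring script could list only the members that are down. The sketch below filters an rs.status()-shaped document by the health field. unreachableMembers is a hypothetical helper, and downSample is a trimmed copy of the status output above. The code runs under Node.js.

```javascript
// Sketch only: members whose health is not 1, with the last heartbeat message.
// unreachableMembers is a hypothetical helper, not a MongoDB function.
function unreachableMembers(status) {
  return status.members
    .filter(function (m) { return m.health !== 1; })
    .map(function (m) {
      return { name: m.name, reason: m.lastHeartbeatMessage || "unknown" };
    });
}

// Trimmed copy of the rs.status() output taken after killing 172.20.10.81.
var downSample = {
  members: [
    { name: "172.20.10.79:27017", health: 1, stateStr: "PRIMARY" },
    { name: "172.20.10.80:27017", health: 1, stateStr: "SECONDARY" },
    { name: "172.20.10.81:27017", health: 0, stateStr: "(not reachable/healthy)",
      lastHeartbeatMessage: "Connection refused" }
  ]
};

console.log(JSON.stringify(unreachableMembers(downSample)));
// [{"name":"172.20.10.81:27017","reason":"Connection refused"}]
```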

 

  Once the server is back, the replica set returns to its normal state.

raytest:PRIMARY> rs.status()
{
    "set" : "raytest",
    "date" : ISODate("2018-06-02T11:03:17.039Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1527937395, 1),
            "t" : NumberLong(1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1527937395, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1527937395, 1),
            "t" : NumberLong(1)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.20.10.79:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1727,
            "optime" : {
                "ts" : Timestamp(1527937395, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T11:03:15Z"),
            "electionTime" : Timestamp(1527936124, 1),
            "electionDate" : ISODate("2018-06-02T10:42:04Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "172.20.10.80:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1284,
            "optime" : {
                "ts" : Timestamp(1527937395, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937395, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T11:03:15Z"),
            "optimeDurableDate" : ISODate("2018-06-02T11:03:15Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:03:16.478Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:03:17.023Z"),
            "pingMs" : NumberLong(1),
            "syncingTo" : "172.20.10.79:27017",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "172.20.10.81:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 6,
            "optime" : {
                "ts" : Timestamp(1527937395, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937395, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-06-02T11:03:15Z"),
            "optimeDurableDate" : ISODate("2018-06-02T11:03:15Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:03:16.299Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:03:14.297Z"),
            "pingMs" : NumberLong(2),
            "syncingTo" : "172.20.10.79:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

 

  Next, simulate a failure of the primary (IP: 172.20.10.79):

#kill MongoDB Master
[root@mongodb-001 src]# ps aux | grep mong
root     29393  0.3  1.1 1637000 44000 ?       Sl   18:34   0:06 /usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf
root     29527  0.0  0.0 103252   832 pts/0    S+   19:04   0:00 grep mong
[root@mongodb-001 src]# kill -9 29393
[root@mongodb-001 src]# ps aux | grep mong
root     29529  0.0  0.0 103252   828 pts/0    R+   19:05   0:00 grep mong

 

  Check the replica set status:

raytest:PRIMARY> rs.status()
{
    "set" : "raytest",
    "date" : ISODate("2018-06-02T11:06:01.866Z"),
    "myState" : 1,
    "term" : NumberLong(2),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1527937554, 1),
            "t" : NumberLong(2)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1527937554, 1),
            "t" : NumberLong(2)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1527937554, 1),
            "t" : NumberLong(2)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.20.10.79:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:06:01.620Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:04:44.624Z"),
            "pingMs" : NumberLong(1),
            "lastHeartbeatMessage" : "Connection refused",
            "configVersion" : -1
        },
        {
            "_id" : 1,
            "name" : "172.20.10.80:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1872,
            "optime" : {
                "ts" : Timestamp(1527937554, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2018-06-02T11:05:54Z"),
            "infoMessage" : "could not find member to sync from",
            "electionTime" : Timestamp(1527937493, 1),
            "electionDate" : ISODate("2018-06-02T11:04:53Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "172.20.10.81:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 170,
            "optime" : {
                "ts" : Timestamp(1527937554, 1),
                "t" : NumberLong(2)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937554, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2018-06-02T11:05:54Z"),
            "optimeDurableDate" : ISODate("2018-06-02T11:05:54Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:06:01.620Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:06:00.776Z"),
            "pingMs" : NumberLong(1),
            "syncingTo" : "172.20.10.80:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

  The primary has changed from 172.20.10.79 to 172.20.10.80. Now look at the logs of the other nodes:

2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to 172.20.10.79:27017
2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.20.10.79:27017 - HostUnreachable: Connection refused
2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to 172.20.10.79:27017 due to failed operation on a connection
2018-06-02T19:07:09.700+0800 I REPL     [ReplicationExecutor] Error in heartbeat request to 172.20.10.79:27017; HostUnreachable: Connection refused
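
  The failover can also be detected programmatically by comparing two status snapshots. The sketch below reports whether the primary changed and how the election term moved. primaryOf and describeFailover are hypothetical helpers, and the two samples are trimmed from the rs.status() outputs in this article, with term written as a plain number (the shell prints it as NumberLong). The code runs under Node.js.

```javascript
// Sketch only: compare two rs.status()-shaped snapshots.
// primaryOf and describeFailover are hypothetical helpers.
function primaryOf(status) {
  var primaries = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  });
  return primaries.length > 0 ? primaries[0].name : null;
}

function describeFailover(before, after) {
  var oldPrimary = primaryOf(before);
  var newPrimary = primaryOf(after);
  if (oldPrimary === newPrimary) {
    return "primary unchanged: " + oldPrimary;
  }
  return "primary changed: " + oldPrimary + " -> " + newPrimary +
    " (term " + before.term + " -> " + after.term + ")";
}

// Trimmed from the status outputs before and after killing the primary.
var beforeKill = {
  term: 1,
  members: [
    { name: "172.20.10.79:27017", stateStr: "PRIMARY" },
    { name: "172.20.10.80:27017", stateStr: "SECONDARY" },
    { name: "172.20.10.81:27017", stateStr: "SECONDARY" }
  ]
};
var afterKill = {
  term: 2,
  members: [
    { name: "172.20.10.79:27017", stateStr: "(not reachable/healthy)" },
    { name: "172.20.10.80:27017", stateStr: "PRIMARY" },
    { name: "172.20.10.81:27017", stateStr: "SECONDARY" }
  ]
};

console.log(describeFailover(beforeKill, afterKill));
// primary changed: 172.20.10.79:27017 -> 172.20.10.80:27017 (term 1 -> 2)
```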

 

  Finally, bring the original primary back and check the replica set status:

raytest:PRIMARY> rs.status()
{
    "set" : "raytest",
    "date" : ISODate("2018-06-02T11:08:06.269Z"),
    "myState" : 1,
    "term" : NumberLong(2),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1527937684, 1),
            "t" : NumberLong(2)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1527937684, 1),
            "t" : NumberLong(2)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1527937684, 1),
            "t" : NumberLong(2)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.20.10.79:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 6,
            "optime" : {
                "ts" : Timestamp(1527937684, 1),
                "t" : NumberLong(2)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937684, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2018-06-02T11:08:04Z"),
            "optimeDurableDate" : ISODate("2018-06-02T11:08:04Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:08:05.792Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:08:04.376Z"),
            "pingMs" : NumberLong(1),
            "syncingTo" : "172.20.10.80:27017",
            "configVersion" : 1
        },
        {
            "_id" : 1,
            "name" : "172.20.10.80:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1997,
            "optime" : {
                "ts" : Timestamp(1527937684, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2018-06-02T11:08:04Z"),
            "electionTime" : Timestamp(1527937493, 1),
            "electionDate" : ISODate("2018-06-02T11:04:53Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "172.20.10.81:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 295,
            "optime" : {
                "ts" : Timestamp(1527937684, 1),
                "t" : NumberLong(2)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1527937684, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2018-06-02T11:08:04Z"),
            "optimeDurableDate" : ISODate("2018-06-02T11:08:04Z"),
            "lastHeartbeat" : ISODate("2018-06-02T11:08:05.788Z"),
            "lastHeartbeatRecv" : ISODate("2018-06-02T11:08:04.972Z"),
            "pingMs" : NumberLong(1),
            "syncingTo" : "172.20.10.80:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

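  After it recovers, the original primary rejoins the replica set automatically, but as a secondary rather than as the primary. In other words, with the default equal member priorities used here, a recovered former primary does not take the primary role back.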
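
  If a particular server should be preferred as primary, one option is to give that member a higher priority in the replica set configuration. The sketch below only edits a configuration document. setMemberPriority is a hypothetical helper, the value 2 is an arbitrary example, and this step is not part of the original walkthrough. The code runs under Node.js; in the mongo shell, the edited document would be applied on the current primary with rs.reconfig(cfg), where cfg comes from rs.conf().

```javascript
// Sketch only: raise one member's priority in a replica set config document.
// setMemberPriority is a hypothetical helper, not a MongoDB function.
function setMemberPriority(cfg, host, priority) {
  var matched = false;
  cfg.members.forEach(function (m) {
    if (m.host === host) {
      m.priority = priority;
      matched = true;
    }
  });
  if (!matched) throw new Error("no member with host " + host);
  return cfg;
}

// Stand-in for the document returned by rs.conf() in the mongo shell.
var cfg = {
  _id: "raytest",
  members: [
    { _id: 0, host: "172.20.10.79:27017" },
    { _id: 1, host: "172.20.10.80:27017" },
    { _id: 2, host: "172.20.10.81:27017" }
  ]
};

setMemberPriority(cfg, "172.20.10.79:27017", 2); // 2 is an arbitrary example value
console.log(cfg.members[0].priority); // 2
// In the mongo shell, on the primary: rs.reconfig(cfg)
```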

  This completes the MongoDB replica set configuration. The notes below cover a few problems encountered in real operation.

 

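7. Additional Notes

  1. When initiating a replica set with several members as done above, the members should be empty. If they already contain data, initiation reports an error. In that case, back up the data first, empty MongoDB, and then build the replica set. (More precisely, it is the members other than the one running rs.initiate() that must be empty.)

  2. After the replica set has been built, commands such as show dbs fail on a secondary with the "not master and slaveOk=false" error shown earlier. The fix is to run rs.slaveOk() in that shell session, which allows reads from the secondary on the current connection.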