MongoDB分片详解 mongodb分片命令

转载

mob64ca1402d47a 2023-08-17 18:45:07

文章标签 MongoDB分片详解服务器数据库数据 文章分类 MongoDB 数据库

经过上篇的学习，我们搭建了自己的分片系统（通俗点就是MongoDB数据库集群系统），我们通过如下命令将两个mongod的服务作为“片”添加到系统中，并且让数据库“mydb”的分片功能打开，指定集合“users”的片键为“name”：

C:\Users\liuxj>mongo localhost:30000/admin
MongoDB shell version: 2.0.6
connecting to: localhost:30000/admin
mongos> db.runCommand({"addshard" : "localhost:10001", "allowLocal" : true });
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand({"addshard" : "localhost:10002", "allowLocal" : true });
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> db.runCommand({"enablesharding" : "mydb" });
{ "ok" : 1 }
mongos> db.runCommand({"shardcollection" : "mydb.users", "key" : {"name" : 1}});
{ "collectionsharded" : "mydb.users", "ok" : 1 }
mongos>

【管理分片--查询配置信息】

分片的相关信息都会保存在分片系统的配置服务器中，具体在config数据库下，我们可以连接到mongos，查看这个数据库的相关信息：

1》片信息

mongos> use config
switched to db config
mongos> db.shards.find();
{ "_id" : "shard0000", "host" : "localhost:10001" }
{ "_id" : "shard0001", "host" : "localhost:10002" }
mongos>

片信息保存在集合shards中。

2》数据库

mongos> use config;
switched to db config
mongos> db.databases.find();
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "mydb", "partitioned" : true, "primary" : "shard0000" }
{ "_id" : "testdb", "partitioned" : false, "primary" : "shard0001" }
mongos>

集合databases中含有已经在片系统上的数据库列表，我们看这个集合中的键：

--------1）“_id” ：数据库名称

--------2）“partitioned”：布尔，表明该库是否开启分片功能

--------3) “primary”：表明这个数据库的大本营在哪个片上，每个数据库刚创建时会随机定位到一个片上！分片的数据库会在后期分布到很多数据库服务其上，但会从这个片开始！

3》块

mongos> use config;
switched to db config
mongos> db.chunks.find();
{ "_id" : "mydb.users-name_MinKey", "lastmod" : { "t" : 1000, "i" : 0 }, "ns" : "mydb.users", "min" : { "name" : { $minKey : 1 } }, "max" : {
 { $maxKey : 1 } }, "shard" : "shard0000" }
mongos>

块信息保存在chunks集合中，从这里我们可以看出数据到底是怎么切分到集群的！

【管理分片--分片命令】

前面我们提到了一些基本命令，如添加块、启动分片等。我们再提供管理分片系统（集群）的常用命令：

1》printShardingStatus：给出整个分片系统的一些状态信息：

mongos> db.printShardingStatus();
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
        {  "_id" : "shard0000",  "host" : "localhost:10001" }
        {  "_id" : "shard0001",  "host" : "localhost:10002" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "mydb",  "partitioned" : true,  "primary" : "shard0000" }
                mydb.users chunks:
                                shard0000       1
                        { "name" : { $minKey : 1 } } -->> { "name" : { $maxKey : 1 } } on : shard0000 { "t" : 1000, "i" : 0 }
        {  "_id" : "testdb",  "partitioned" : false,  "primary" : "shard0001" }

mongos>

其集中上面配置集合的所有信息。

2》删除片，通过removeshard命令（admin数据库下执行），会从集群中删除片，removeshard会把给定片上的所有块都挪到其他片上！我们做个试验，我们通过mongos往mydb.user集合中插入两条数据：

mongos> use mydb;
switched to db mydb
mongos> db.users.insert({"name" : "tom"});
mongos> db.users.insert({"name" : "jimmy"});
mongos>

mydb的大本营是第一个片，即localhost:10001，我们通过shell连接到localhost:10001，查看该集合：

> use mydb;
switched to db mydb
> db.users.find();
{ "_id" : ObjectId("5045f5e5c2f66c42c6f6a6fb"), "name" : "tom" }
{ "_id" : ObjectId("5045f5ecc2f66c42c6f6a6fc"), "name" : "jimmy" }
>

数据在这里。我们再查看locahost:10002:

> use mydb;
switched to db mydb
> db.users.find();
>

没有任何数据。然后我们将片1从集群中删除：

mongos> use admin;
switched to db admin
mongos> db.runCommand({"removeshard" : "localhost:10001"});
{
        "msg" : "draining started successfully",
        "state" : "started",
        "shard" : "shard0000",
        "ok" : 1
}
mongos>

此时，我们再查询localhost:10002：

> db.users.find();
{ "_id" : ObjectId("5045f5e5c2f66c42c6f6a6fb"), "name" : "tom" }
{ "_id" : ObjectId("5045f5ecc2f66c42c6f6a6fc"), "name" : "jimmy" }
>

数据已经被转移到片2上了。片删除成功。

【生产配置】

MongoDB分片系统（集群系统）如果应用在生产环境中，就需要一些更加健壮的方案，我们现在就讨论一下如何让集合更稳定。通常，我们的做法为：

1》使用多个配置服务器

2》多个mongos路由服务器

3》每个片都是一个副本集

（一）使用多个配置服务器，上面，我们启动mongos路由服务只指定了一个配置服务器，我们现在指定3个：

C:\Users\liuxj>mongos --port 30000 --configdb localhost:20000,localhost:20001,localhost:20002
Wed Sep 05 20:02:38 mongos db version v2.0.6, pdfile version 4.5 starting (--help for usage)
Wed Sep 05 20:02:38 git version: e1c0cbc25863f6356aa4e31375add7bb49fb05bc
Wed Sep 05 20:02:38 build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSI
ON=1_42
Wed Sep 05 20:02:38 SyncClusterConnection connecting to [localhost:20000]
Wed Sep 05 20:02:38 SyncClusterConnection connecting to [localhost:20001]
Wed Sep 05 20:02:38 SyncClusterConnection connecting to [localhost:20002]

这里需要注意的是，这3个配置服务器初始内容得一致（刚开始要同为空），否则无法再这3个服务器上启动路由服务。多个配置服务器为维护集群状态的一致性，其数据同步没有采用前面所讲的异步复制机制，而是采用了更加严谨的两步提交机制（关系型数据库的分布式事务的机制），这就意味着，如果这3台配置服务器有一台宕掉的话，这个集群系统的配置信息将是只读的（写操作都会因为一台无法进行而回滚）；客户端仍然可以读写数据信息，但写入的数据不会进行片之间的均衡分配，只会保存在某一片上。

（二）多个mongos路由服务器，mongos的数量是不受限制的！但建议一个MongoDB集群系统只需一个mongos进程即可（这点，我还需考证，生产环境中没使用过，也请使用过的给点点评）。

（三）采用副本集做“片”！这是生产环境中必须的！一个片中一台机器出问题了，不能导致整个片的失效！我们将一个副本集加入到集群中的方法和将当个mongod服务加入集群的方法是一致的。使用addshard命令，这里我们要指定副本集的名称和活跃节点即可：db.runCommand("addshard" : "yy/test.example.com:27017")，我们副本集的名称为“yy”，目前的活跃节点为：“test.example.com:27017”。

生产环境中，我们必须谨记：不可把所有鸡蛋放在同一个篮子里！如上面我们提到的，不能把所有的配置服务器放到一个服务器上，不能把一个副本集放到一个服务器上。

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。