Chapter 10: Multikey Indexes

  • Creating a multikey index
  • Index bounds
  • Unique multikey indexes
  • Multikey index limitations

To index a field that holds an array value, MongoDB creates an index key for each element in the array. These multikey indexes support efficient queries against array fields. Multikey indexes can be constructed over arrays that hold both scalar values [1] (e.g. strings, numbers) and nested documents.

Creating a multikey index

To create a multikey index, use the db.collection.createIndex() method:

db.collection_name.createIndex({ <field>: < 1 or -1> })

MongoDB automatically creates a multikey index if any indexed field is an array; you do not need to explicitly specify the multikey type.

Unique multikey indexes

For unique indexes, the unique constraint applies across separate documents in the collection rather than within a single document.

Because the unique constraint applies to separate documents, for a unique multikey index, a document may have array elements that result in repeating index key values as long as the index key values for that document do not duplicate those of another document.

Suppose a fruit collection contains documents with name and prices fields, for example:

{
    "_id": ObjectId("6018bded893400009b0000f4"),
    "name": "apple",
    "prices": [ 5, 10]
}

Create a unique multikey index on prices:

db.fruit.createIndex({prices: 1}, {unique: true})

Now insert a few documents:

db.fruit.insert({name: "banana", prices: [6, 10]}) // fails: 10 is already an index key of another document

db.fruit.insert({name: "banana", prices: [6, 11]}) // succeeds
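
The behavior above can be sketched with a small in-memory model. This is only an illustration in plain JavaScript, not MongoDB's implementation; the class name UniqueMultikeyIndex and its insert method are hypothetical, and the _id values are made up for the example. It assumes top-level fields holding scalar values or arrays of scalars.

```javascript
// Simplified model of a unique multikey index (hypothetical helper,
// not MongoDB's actual implementation).
class UniqueMultikeyIndex {
  constructor(field) {
    this.field = field;
    this.owners = new Map(); // index key -> _id of the document that owns it
  }

  insert(doc) {
    const value = doc[this.field];
    // One key per distinct array element; repeats inside one document collapse.
    const keys = new Set(Array.isArray(value) ? value : [value]);
    // Check every key first so a rejected document leaves no keys behind.
    for (const key of keys) {
      if (this.owners.has(key)) {
        throw new Error(
          `duplicate key ${key}, already used by document ${this.owners.get(key)}`
        );
      }
    }
    for (const key of keys) this.owners.set(key, doc._id);
  }
}

const priceIndex = new UniqueMultikeyIndex("prices");
priceIndex.insert({ _id: 1, name: "apple", prices: [5, 10] });

let rejected = false;
try {
  // 10 is already a key of the apple document, so this insert is refused.
  priceIndex.insert({ _id: 2, name: "banana", prices: [6, 10] });
} catch (err) {
  rejected = true;
}

priceIndex.insert({ _id: 3, name: "banana", prices: [6, 11] }); // no shared price
priceIndex.insert({ _id: 4, name: "cherry", prices: [7, 7] });  // repeat within one document

console.log(rejected, [...priceIndex.owners.keys()]);
```

The last insert shows the point made earlier: a value repeated inside one document does not violate the constraint, only a value shared with another document does.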

Multikey index limitations

Compound multikey indexes

For a compound multikey index, each indexed document can have at most one indexed field whose value is an array. That is:

  • You cannot create a compound multikey index if more than one of the fields to be indexed is an array in the same document. For example, consider a collection that contains the following document:

{ _id: 1, a: [ 1, 2 ], b: [ 1, 2 ], category: "AB - both arrays" }

    You cannot create a compound multikey index { a: 1, b: 1 } on this collection, because both the a and b fields are arrays.

  • If a compound multikey index already exists, you cannot insert a document that would violate this restriction. Consider a collection that contains the following documents:

{ _id: 1, a: [1, 2], b: 1, category: "A array" }
{ _id: 2, a: 1, b: [1, 2], category: "B array" }

    A compound multikey index { a: 1, b: 1 } is permissible since for each document, only one field indexed by the compound multikey index is an array; i.e. no document contains array values for both a and b fields.
    However, after creating the compound multikey index, if you attempt to insert a document where both a and b fields are arrays, MongoDB will fail the insert.
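
The rule above amounts to a per-document check, which can be sketched as follows. The functions arrayFieldsInIndex and canIndex are hypothetical names for this illustration, not part of any MongoDB API, and the sketch only looks at top-level fields (it does not resolve dotted paths such as "a.x").

```javascript
// Hypothetical helper: lists the indexed fields whose value in `doc` is an array.
function arrayFieldsInIndex(doc, indexSpec) {
  return Object.keys(indexSpec).filter((field) => Array.isArray(doc[field]));
}

// A document satisfies the compound multikey rule when at most one
// indexed field holds an array.
function canIndex(doc, indexSpec) {
  return arrayFieldsInIndex(doc, indexSpec).length <= 1;
}

const compoundSpec = { a: 1, b: 1 };

console.log(canIndex({ _id: 1, a: [1, 2], b: 1 }, compoundSpec));      // only a is an array
console.log(canIndex({ _id: 2, a: 1, b: [1, 2] }, compoundSpec));      // only b is an array
console.log(canIndex({ _id: 3, a: [1, 2], b: [1, 2] }, compoundSpec)); // both are arrays
```

Note that the check is applied to each document separately: different documents may place the array in different fields, as documents 1 and 2 do.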

If a field is an array of documents, you can index the embedded fields to create a compound index. For example, consider a collection that contains the following documents:

{ _id: 1, a: [ { x: 5, z: [ 1, 2 ] }, { z: [ 1, 2 ] } ] }
{ _id: 2, a: [ { x: 5 }, { z: 4 } ] }

You can create a compound index on { "a.x": 1, "a.z": 1 }. The restriction where at most one indexed field can be an array also applies.

Sorting

As a result of changes to sorting behavior on array fields in MongoDB 3.6, when sorting on an array indexed with a multikey index, the query plan includes a blocking SORT stage. The new sorting behavior may negatively impact performance.

In a blocking SORT, all input must be consumed by the sort step before it can produce output. In a non-blocking, or indexed sort, the sort step scans the index to produce results in the requested order.

Shard keys

You cannot specify a multikey index as the shard key index.

However, if the shard key index is a prefix of a compound index, the compound index is allowed to become a compound multikey index if one of the other keys (i.e. keys that are not part of the shard key) indexes an array.

Compound multikey indexes can have an impact on performance.

Hashed indexes

A hashed index cannot be a multikey index.

Querying an array field as a whole

When a query filter specifies an exact match for an array as a whole, MongoDB can use the multikey index to look up the first element of the query array but cannot use the multikey index scan to find the whole array. Instead, after using the multikey index to look up the first element of the query array, MongoDB retrieves the associated documents and filters for documents whose array matches the array in the query.

For example, consider an inventory collection that contains the following documents:

{ _id: 5, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] }
{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9 ] }
{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 5, 8 ] }
{ _id: 8, type: "food", item: "ddd", ratings: [ 9, 5 ] }
{ _id: 9, type: "food", item: "eee", ratings: [ 5, 9, 5 ] }

The collection has a multikey index on the ratings field:

db.inventory.createIndex( { ratings: 1 } )

The following query looks for documents where the ratings field is the array [ 5, 9 ]:

db.inventory.find( { ratings: [ 5, 9 ] } )

MongoDB can use the multikey index to find documents that have 5 at any position in the ratings array. Then, MongoDB retrieves these documents and filters for documents whose ratings array equals the query array [ 5, 9 ].
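
The two steps described above (index lookup on the first element, then a filter on the whole array) can be modeled in plain JavaScript. This is a rough sketch under simplifying assumptions, not how the query engine is actually implemented; buildMultikeyIndex and findWholeArray are hypothetical names, and the index is just a Map from element value to document _ids.

```javascript
const inventory = [
  { _id: 5, type: "food", item: "aaa", ratings: [5, 8, 9] },
  { _id: 6, type: "food", item: "bbb", ratings: [5, 9] },
  { _id: 7, type: "food", item: "ccc", ratings: [9, 5, 8] },
  { _id: 8, type: "food", item: "ddd", ratings: [9, 5] },
  { _id: 9, type: "food", item: "eee", ratings: [5, 9, 5] },
];

// Hypothetical model of a multikey index: element value -> list of _ids.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const key of new Set(doc[field])) { // one entry per distinct element
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

function findWholeArray(docs, index, field, target) {
  // Step 1: use the index to find documents containing the first query element.
  const candidateIds = index.get(target[0]) || [];
  const candidates = docs.filter((doc) => candidateIds.includes(doc._id));
  // Step 2: fetch those documents and compare the whole array, order included.
  const matches = candidates.filter(
    (doc) =>
      doc[field].length === target.length &&
      doc[field].every((value, i) => value === target[i])
  );
  return {
    candidates: candidates.map((doc) => doc._id),
    matches: matches.map((doc) => doc._id),
  };
}

const ratingsIndex = buildMultikeyIndex(inventory, "ratings");
const outcome = findWholeArray(inventory, ratingsIndex, "ratings", [5, 9]);
console.log(outcome);
```

In this model all five documents are candidates, because each contains a 5 somewhere, but only the document with _id 6 survives the whole-array comparison.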

Examples

Consider a survey collection with the following document:

{ _id: 1, item: "ABC", ratings: [ 2, 5, 9 ] }

Create an index on the field ratings:

db.survey.createIndex( { ratings: 1 } )

This multikey index contains the following three index keys, each pointing to the same document:

  • 2
  • 5, and
  • 9
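
As a minimal sketch of how those three keys arise, the following plain JavaScript function derives one key per distinct array element. The name multikeyIndexKeys is hypothetical, and the numeric sort assumes the array holds only numbers; it is a model of the idea, not MongoDB's key generation code.

```javascript
// Hypothetical helper: one index key per distinct element of the indexed field.
// A non-array value yields a single key, like an ordinary index.
function multikeyIndexKeys(doc, field) {
  const value = doc[field];
  const elements = Array.isArray(value) ? value : [value];
  // Assumes numeric elements so the keys can be ordered numerically.
  return [...new Set(elements)].sort((x, y) => x - y);
}

const surveyDoc = { _id: 1, item: "ABC", ratings: [2, 5, 9] };
console.log(multikeyIndexKeys(surveyDoc, "ratings")); // [ 2, 5, 9 ]
```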