Using $graphLookup in MongoDB 6.0

雍州无名 2023-04-13 14:02:17 Blog category: mongodb4.4

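© Copyright belongs to the author. This is an original work by 雍州无名, published on the 51CTO blog. Contact the author for permission to repost; unauthorized reposting may result in legal action.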

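Note: the $graphLookup stage has been available since MongoDB 3.4. What MongoDB 5.1 added is support for a sharded from collection (see the considerations further down).

$graphLookup performs a recursive search on a collection, with options for restricting the search by recursion depth and by query filter. (The feature is somewhat reminiscent of a graph database such as Neo4j.)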

$graphLookup

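The search proceeds as follows:

1.Input documents flow into the $graphLookup stage of an aggregation operation.
2.$graphLookup targets the search at the collection named by the from parameter (see the full list of search parameters below).
3.For each input document, the search begins with the value given by startWith.
4.$graphLookup matches the startWith value against the field named by connectToField in the other documents of the from collection.
5.For each matching document, $graphLookup takes the value of connectFromField and checks every document in the from collection for a matching connectToField value.
  For each match, $graphLookup adds the matching document from the from collection to the array field named by the as parameter.
  This step repeats recursively until no more matches are found or the recursion depth given by maxDepth is reached. $graphLookup then appends the array field to the input document.
  Once the search has completed for all input documents, $graphLookup returns the results.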
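The steps above can be sketched as a small in-memory loop. This is only an illustration of the described search, not MongoDB's implementation: the function name graphLookup, its option handling, and the depth field name are assumptions made for this sketch, and it only handles scalar field values. The data is the employees collection used in the examples section.

```javascript
// In-memory sketch of the search loop described above (not MongoDB's code).
// Assumes connectFromField / connectToField hold scalar values.
function graphLookup(fromDocs, startValue, opts) {
  const { connectFromField, connectToField, maxDepth = Infinity, depthField } = opts;
  const visited = new Set();   // indexes of documents already collected
  const results = [];
  let frontier = [startValue]; // values to match at the current depth

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    fromDocs.forEach((doc, i) => {
      if (visited.has(i) || !frontier.includes(doc[connectToField])) return;
      visited.add(i);
      results.push(depthField ? { ...doc, [depthField]: depth } : { ...doc });
      // The matched document's connectFromField feeds the next round.
      if (doc[connectFromField] !== undefined) next.push(doc[connectFromField]);
    });
    frontier = next;
  }
  return results;
}

const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" },
];

// Start from Asya's reportsTo value, as startWith: "$reportsTo" would.
const hierarchy = graphLookup(employees, employees[4].reportsTo, {
  connectFromField: "reportsTo",
  connectToField: "name",
  depthField: "depth",
});
console.log(hierarchy.map((d) => `${d.depth}:${d.name}`).join(" "));
// prints "0:Ron 1:Eliot 2:Dev"
```

The visited set is what keeps the loop from revisiting documents when the data contains cycles.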

1.$graphLookup syntax

{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}

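2.$graphLookup fields

1.from
  The target collection for the $graphLookup operation, searched recursively by matching connectFromField against connectToField.
  The from collection must be in the same database as any other collection used in the operation.
  Starting in MongoDB 5.1, the collection named in from can be sharded.
2.startWith
  An expression that specifies the connectFromField value with which to start the recursive search. startWith may also evaluate to an array of values, in which case each value is followed individually during the traversal.
3.connectFromField
  The name of the field whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is
  followed individually during the traversal.
4.connectToField
  The name of the field in other documents against which the value of the field named by connectFromField is matched.
5.as
  The name of the array field added to each output document. It contains the documents traversed in the $graphLookup stage to reach that document.
  Note: documents returned in the as field are not guaranteed to be in any particular order.
6.maxDepth
  Optional. A non-negative integer specifying the maximum recursion depth.
7.depthField
  Optional. The name of a field to add to each traversed document in the search path. Its value is the recursion depth of the document, represented as a NumberLong.
  Recursion depth starts at zero, so the first lookup corresponds to depth zero.
8.restrictSearchWithMatch
  Optional. A document specifying additional conditions for the recursive search. The syntax is the same as query filter syntax.
  Note: you cannot use aggregation expressions in this filter. For example, the query document
       { lastName: { $ne: "$lastName" } }
       will not find documents whose lastName differs from the lastName of the input document, because "$lastName" is treated as a string literal rather than a field path.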
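To make the restrictSearchWithMatch caveat concrete, the following plain JavaScript sketch imitates how a $ne comparison behaves when its operand is a constant. The helper matchesNe and the sample documents are hypothetical and exist only for illustration; they are not part of any MongoDB API.

```javascript
// Simplified stand-in for a query filter's $ne: the operand is a constant,
// so "$lastName" is compared as a literal string, not resolved as a field path.
function matchesNe(doc, field, operand) {
  return doc[field] !== operand;
}

// Hypothetical candidate documents and input document.
const candidates = [
  { name: "A", lastName: "Soto" },
  { name: "B", lastName: "Hale" },
];
const inputDoc = { name: "C", lastName: "Soto" };

// What { lastName: { $ne: "$lastName" } } effectively does:
// nobody's lastName equals the string "$lastName", so everything passes.
const literalResult = candidates.filter((d) => matchesNe(d, "lastName", "$lastName"));

// What the author of such a filter probably intended:
const intendedResult = candidates.filter((d) => matchesNe(d, "lastName", inputDoc.lastName));

console.log(literalResult.length, intendedResult.length); // prints "2 1"
```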

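3.Considerations

(1).Sharded Collections

Starting in MongoDB 5.1, you can specify a sharded collection in the from parameter of the $graphLookup stage.

(2).Max Depth

Setting maxDepth to 0 is equivalent to a non-recursive $graphLookup search stage.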
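As a sketch of what "non-recursive" means here, the two pipelines below are written against the employees collection from the examples section. The first limits $graphLookup to depth 0; the second is a plain $lookup on the same pair of fields, included only for comparison. The output field name manager is an assumption, and the two stages are expected to return the same direct matches for simple data like this rather than being guaranteed identical in every edge case.

```javascript
// Pipeline A: $graphLookup restricted to the initial lookup (depth 0 only).
const graphLookupDepthZero = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      maxDepth: 0,
      as: "manager", // hypothetical output field name
    },
  },
];

// Pipeline B: a plain $lookup on the same fields, for comparison.
const plainLookup = [
  {
    $lookup: {
      from: "employees",
      localField: "reportsTo",
      foreignField: "name",
      as: "manager",
    },
  },
];

// In mongosh, either pipeline could be run as:
//   db.employees.aggregate(graphLookupDepthZero)
```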

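(3).Memory

The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. Any other stages in the same aggregate() operation are still affected by allowDiskUse: true.

(4).Views and Collation

If an aggregation involves multiple views, for example through $lookup or $graphLookup, the views must have the same collation.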

4.Examples

(1).Within a single collection

Create a collection named employees:

db.employees.insertOne({ "_id" : 1, "name" : "Dev" })
db.employees.insertOne({ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" })
db.employees.insertOne({ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" })
db.employees.insertOne({ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" })
db.employees.insertOne({ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" })
db.employees.insertOne({ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" })

The following $graphLookup operation recursively matches the reportsTo and name fields in the employees collection and returns the reporting hierarchy for each person:

db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )

The operation returns the following:

[Image: output of the aggregation]

The following table shows the traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

[Image: traversal path table]

The output yields the hierarchy Asya -> Ron -> Eliot -> Dev.
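Because documents in the as array are not guaranteed to come back in any particular order, a hierarchy such as Asya -> Ron -> Eliot -> Dev cannot be read directly off the array. One possible way to make the order explicit, sketched here under the assumption of MongoDB 5.2 or later (where $sortArray is available), is to record the depth with depthField and sort on it. The field name level is an arbitrary choice for this sketch.

```javascript
// Sketch: record each document's recursion depth, then sort the array by it.
const orderedHierarchy = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      depthField: "level", // hypothetical field name holding the depth
      as: "reportingHierarchy",
    },
  },
  {
    // $sortArray requires MongoDB 5.2+; level 0 is the direct manager.
    $set: {
      reportingHierarchy: {
        $sortArray: { input: "$reportingHierarchy", sortBy: { level: 1 } },
      },
    },
  },
];

// In mongosh: db.employees.aggregate(orderedHierarchy)
```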

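(2).Across multiple collections

Like $lookup, $graphLookup can access another collection in the same database.

For example, create a database with two collections, airports and travelers: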

db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )


db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )

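For each document in the travelers collection, the following aggregation looks up the nearestAirport value in the airports collection and recursively matches the connects field against the airport field. The operation specifies a maximum recursion depth of 2.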

db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )

The operation returns the following results:

{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2,
        "airport" : "ORD",
        "connects" : [ "JFK" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 3,
        "airport" : "PWM",
        "connects" : [ "BOS", "LHR" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 4,
        "airport" : "LHR",
        "connects" : [ "PWM" ],
        "numConnections" : NumberLong(2) },
      { "_id" : 0,
        "airport" : "JFK",
        "connects" : [ "BOS", "ORD" ],
        "numConnections" : NumberLong(1) },
      { "_id" : 1,
        "airport" : "BOS",
        "connects" : [ "JFK", "PWM" ],
        "numConnections" : NumberLong(0) }
   ]
}

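The following table shows the traversal path of the recursive search with a maximum depth of 2, starting from the airport JFK:

[Image: traversal path table for JFK]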

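(3).With a query filter

The following example uses a collection of documents that each contain a person's name, an array of friends, and an array of hobbies. The aggregation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.

The data for the people collection is as follows: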

db.people.insertOne({
  "_id" : 1,
  "name" : "Tanya Jordan",
  "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
  "hobbies" : [ "tennis", "unicycling", "golf" ]
})
db.people.insertOne({
  "_id" : 2,
  "name" : "Carole Hale",
  "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
  "hobbies" : [ "archery", "golf", "woodworking" ]
})
db.people.insertOne({
  "_id" : 3,
  "name" : "Terry Hawkins",
  "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
  "hobbies" : [ "knitting", "frisbee" ]
})
db.people.insertOne({
  "_id" : 4,
  "name" : "Joseph Dennis",
  "friends" : [ "Angelo Ward", "Carole Hale" ],
  "hobbies" : [ "tennis", "golf", "topiary" ]
})
db.people.insertOne({
  "_id" : 5,
  "name" : "Angelo Ward",
  "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
  "hobbies" : [ "travel", "ceramics", "golf" ]
})
db.people.insertOne({
   "_id" : 6,
   "name" : "Shirley Soto",
   "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
   "hobbies" : [ "frisbee", "set theory" ]
})

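The following aggregation uses three stages:

1.$match selects the documents whose name field equals the string "Tanya Jordan". It returns one output document.
2.$graphLookup connects the friends field of that document to the name field of other documents in the collection in order to traverse Tanya Jordan's network of connections. The stage uses the restrictSearchWithMatch parameter so that only documents whose hobbies array contains golf are found. It returns one output document.
3.$project shapes the output document. The names listed under "connections who play golf" are taken from the name field of the documents in the input document's golfers array.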

db.people.aggregate( [
  { $match: { "name": "Tanya Jordan" } },
  { $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "golfers",
      restrictSearchWithMatch: { "hobbies" : "golf" }
    }
  },
  { $project: {
      "name": 1,
      "friends": 1,
      "connections who play golf": "$golfers.name"
    }
  }
] )

The query result is as follows:

[Image: output of the aggregation]
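To show how the golf filter shapes the traversal, here is an in-memory sketch that handles array-valued fields and a restriction predicate. It is an illustration under stated assumptions rather than MongoDB's implementation: the function graphLookupFiltered and its restrict option are hypothetical names, and the restriction is modelled as a plain JavaScript predicate instead of a query filter document. The data is the people collection inserted above.

```javascript
// In-memory sketch of a filtered traversal (not MongoDB's implementation).
function graphLookupFiltered(fromDocs, startValues, opts) {
  const { connectFromField, connectToField, restrict = () => true } = opts;
  const asArray = (v) => (Array.isArray(v) ? v : v === undefined ? [] : [v]);
  const collected = new Map(); // _id -> document, so each document appears once
  let frontier = new Set(asArray(startValues));

  while (frontier.size > 0) {
    const next = new Set();
    for (const doc of fromDocs) {
      if (collected.has(doc._id)) continue;
      // A document that fails the restriction is neither returned nor followed.
      if (!restrict(doc) || !frontier.has(doc[connectToField])) continue;
      collected.set(doc._id, doc);
      asArray(doc[connectFromField]).forEach((v) => next.add(v));
    }
    frontier = next;
  }
  return [...collected.values()];
}

const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

const tanya = people.find((p) => p.name === "Tanya Jordan");
const golfers = graphLookupFiltered(people, tanya.friends, {
  connectFromField: "friends",
  connectToField: "name",
  restrict: (doc) => doc.hobbies.includes("golf"),
});

// Sorted only for stable display; the traversal order carries no meaning.
console.log(golfers.map((g) => g.name).sort());
```

In this sketch Tanya Jordan appears in her own result, because Carole Hale lists her as a friend and she plays golf herself. Angelo Ward is reached only through Joseph Dennis, since Terry Hawkins and Shirley Soto fail the filter and are therefore never followed.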