elasticsearch search-after

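Original

汪小哥_ 2022-01-27 11:30:15 Category: elasticsearch

© Copyright belongs to the author. This is an original work by 51CTO blog author 汪小哥_. Contact the author for permission to repost; unauthorized reposting may be pursued legally.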

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-search-after

1. When to use search_after

Results can be paginated with from and size, but the cost becomes prohibitive once pagination gets deep. The index.max_result_window setting, which defaults to 10,000, is a safeguard: a search request takes heap memory and time proportional to from + size. The Scroll API is recommended for efficient deep scrolling, but scroll contexts are costly and are not recommended for real-time user requests. The search_after parameter circumvents this problem by providing a live cursor: the idea is to use the results from the previous page to help retrieve the next page.
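
To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which every shard hands its top from + size candidates to the coordinating node; the function name `entries_merged` is made up for this illustration and is not part of any Elasticsearch client.

```python
def entries_merged(num_shards: int, size: int, from_: int = 0) -> int:
    """Entries the coordinating node has to merge for one page,
    assuming each shard contributes its own top (from + size) candidates."""
    return num_shards * (from_ + size)


# from/size: the last page of 5 hits allowed by the default 10,000 window, on 5 shards
deep_page = entries_merged(num_shards=5, size=5, from_=9995)

# search_after: from stays at 0, so each shard only contributes `size` entries
cursor_page = entries_merged(num_shards=5, size=5)

print(deep_page, cursor_page)
```

Under this model the work for a from/size page grows with the page's depth, while a search_after page costs the same no matter how far into the result set the cursor is.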

2. Usage

Suppose the query that retrieves the first page looks like this:

2.1 Request

GET /bank/account/_search
{
    "size": 5,
    "query": {
        "match_all": {}
    },
    "sort": [
        { "_id": "desc" },
        { "account_number": "asc" }
    ]
}

Result

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "994",
        "_score" : null,
        "_source" : {
          "account_number" : 994,
          "balance" : 33298,
          "firstname" : "Madge",
          "lastname" : "Holcomb",
          "age" : 31,
          "gender" : "M",
          "address" : "612 Hawthorne Street",
          "employer" : "Escenta",
          "email" : "madgeholcomb@escenta.com",
          "city" : "Alafaya",
          "state" : "OR"
        },
        "sort" : [
          "994",
          994
        ]
      },
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "993",
        "_score" : null,
        "_source" : {
          "account_number" : 993,
          "balance" : 26487,
          "firstname" : "Campos",
          "lastname" : "Olsen",
          "age" : 37,
          "gender" : "M",
          "address" : "873 Covert Street",
          "employer" : "Isbol",
          "email" : "camposolsen@isbol.com",
          "city" : "Glendale",
          "state" : "AK"
        },
        "sort" : [
          "993",
          993
        ]
      }
    ]
  }
}

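2.2 Follow-up request

The response to the request above includes an array of sort values for each document. These sort values can be used with the search_after parameter to start returning results after any document in the result list. For example, we can take the sort values of the last document on the previous page and pass them to search_after to retrieve the next page of results. The request below passes ["995", 995] as the cursor: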

GET /bank/account/_search
{
    "size": 2,
    "query": {
        "match_all": {}
    },
    "search_after": ["995", 995],
    "sort": [
        { "_id": "desc" },
        { "account_number": "asc" }
    ]
}

Result

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "994",
        "_score" : null,
        "_source" : {
          "account_number" : 994,
          "balance" : 33298,
          "firstname" : "Madge",
          "lastname" : "Holcomb",
          "age" : 31,
          "gender" : "M",
          "address" : "612 Hawthorne Street",
          "employer" : "Escenta",
          "email" : "madgeholcomb@escenta.com",
          "city" : "Alafaya",
          "state" : "OR"
        },
        "sort" : [
          "994",
          994
        ]
      },
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "993",
        "_score" : null,
        "_source" : {
          "account_number" : 993,
          "balance" : 26487,
          "firstname" : "Campos",
          "lastname" : "Olsen",
          "age" : 37,
          "gender" : "M",
          "address" : "873 Covert Street",
          "employer" : "Isbol",
          "email" : "camposolsen@isbol.com",
          "city" : "Glendale",
          "state" : "AK"
        },
        "sort" : [
          "993",
          993
        ]
      }
    ]
  }
}
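
The request/response cycle above can be wrapped in a small loop. The following is a minimal sketch rather than a definitive implementation: `iter_hits` and `fake_search` are hypothetical names, and `search` stands for any callable that takes a request body and returns a response shaped like the ones shown above (for instance a thin wrapper around whatever client you use). The in-memory `fake_search` only exists so the example runs without a cluster, and it sorts on a single ascending field for simplicity.

```python
from typing import Any, Callable, Dict, Iterator

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits(search: SearchFn, body: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every hit, feeding the last hit's sort values back as search_after."""
    body = dict(body)  # shallow copy so the caller's dict is left untouched
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # an empty page means the walk is finished
        yield from hits
        body["search_after"] = hits[-1]["sort"]


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real search call: 7 documents sorted ascending by one value."""
    docs = [{"_id": str(n), "sort": [n]} for n in range(7)]
    after = body.get("search_after")
    if after is not None:
        docs = [d for d in docs if d["sort"] > after]
    return {"hits": {"hits": docs[: body["size"]]}}


request = {"size": 3, "sort": [{"account_number": "asc"}]}
ids = [hit["_id"] for hit in iter_hits(fake_search, request)]
print(ids)
```

Passing the search call in as a parameter keeps the loop independent of any particular client library, and it never needs from: the cursor alone decides where the next page starts.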

2.3 Caveats

The parameter from must be set to 0 (or -1) when search_after is used.
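
One way to keep this rule from being violated by accident is to build the next-page body through a helper. This is a sketch under assumptions: `next_page_body` is a hypothetical function written for this article, not something provided by Elasticsearch or its clients.

```python
from typing import Any, Dict, List


def next_page_body(body: Dict[str, Any], last_sort_values: List[Any]) -> Dict[str, Any]:
    """Copy `body`, drop `from`, and set search_after to the given sort values."""
    if body.get("from", 0) not in (0, -1):
        raise ValueError("from must be 0 (or -1) when search_after is used")
    nxt = {key: value for key, value in body.items() if key != "from"}
    nxt["search_after"] = list(last_sort_values)
    return nxt


first_page = {
    "size": 2,
    "from": 0,
    "query": {"match_all": {}},
    "sort": [{"_id": "desc"}, {"account_number": "asc"}],
}
second_page = next_page_body(first_page, ["995", 995])
print(second_page["search_after"])
```

A body that still carries a non-zero from, such as one left over from from/size paging, is rejected before it is ever sent.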

2.4 Differences from Scroll

search_after is not a solution for jumping freely to an arbitrary page; it is meant for scrolling many queries in parallel. It is very similar to the Scroll API, but unlike scroll, the search_after parameter is stateless: it is always resolved against the latest version of the searcher. For this reason, the sort order may change during a walk, depending on the updates and deletes applied to your index.
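
The effect of statelessness can be seen in a toy simulation. This is only an in-memory sketch: the set `index` and the function `page_after` are made up for illustration and stand in for an index sorted ascending by a single numeric field.

```python
from typing import List, Optional, Set


def page_after(index: Set[int], cursor: Optional[int], size: int) -> List[int]:
    """Return up to `size` values greater than `cursor`, read from the index as it is right now."""
    ordered = sorted(index)
    if cursor is not None:
        ordered = [n for n in ordered if n > cursor]
    return ordered[:size]


index = {1, 2, 3, 4, 5, 6}
first = page_after(index, None, 3)

# The index changes between the two requests.
index.discard(4)  # deleted: it will not show up on the next page
index.add(0)      # sorts before the cursor: this walk never sees it

second = page_after(index, first[-1], 3)
print(first, second)
```

Each page is computed against the data as it exists at request time, so the walk as a whole is not a consistent snapshot. A scroll context, by contrast, keeps the view from the initial request, which is what makes it costly.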
