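Elasticsearch 7.0 Relational Queries: Parent-Child Documents
ES7 drops the type level (the counterpart of a table in a relational database or a collection in MongoDB), so all documents sit flat in a single index. For one-to-many relationships, ES7 offers two approaches:
- Parent-child documents: all documents are stored at the same level, and a special field type, join, expresses the hierarchy.
- Nested documents: similar to a nested array in JSON; the field type must be declared as nested.
This article covers parent-child documents.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html
Project repository: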
https://gitee.com/xiiiao/es-learning.git
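Creating the index
PUT my-index-join_family
{
  "mappings": {
    "properties": {
      "my_id": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      },
      "level": {
        "type": "keyword"
      },
      "join_filed": {        // name of the join field; any name works (spelled join_filed throughout this article)
        "type": "join",      // the type must be join
        "relations": {       // hierarchy: grand_parent -> parent -> child
          "grand_parent": "parent",
          "parent": "child"
        }
      }
    }
  }
}
The mapping above defines a grandparent -> parent -> child relationship. A parent can have more than one child relation, in which case the children are declared as an array, for example "parent": ["child_a", "child_b"] (hypothetical names).
Inserting the top-level (grandparent) node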
Rest API
PUT my-index-join_family/_doc/1?refresh
{
  "my_id": "1",
  "name": "grandPa",
  "join_filed": {    // marks this document as belonging to the grand_parent level
    "name": "grand_parent"
  }
}
RestHighLevelClient implementation
public void addGrandPa(String name) {
    String id = UUID.randomUUID().toString();
    JoinFamily member = new JoinFamily();
    member.setName(name);
    member.setLevel("1");
    member.setMy_id(id);
    // Top-level document: the join field only carries the relation name
    JoinField joinField = new JoinField();
    joinField.setName("grand_parent");
    member.setJoin_filed(joinField);
    String source = JSON.toJSONString(member);
    log.info("source: " + source);
    IndexRequest indexRequest = new IndexRequest("my-index-join_family").id(id).source(source, XContentType.JSON)
            .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
    try {
        IndexResponse out = client.index(indexRequest, RequestOptions.DEFAULT);
        log.info(out.getId());
    } catch (IOException e) {
        log.error("", e);
    }
}
Inserting a parent node
Rest API
PUT my-index-join_family/_doc/2?routing=1&refresh
{
  "my_id": "2",
  "name": "parent",
  "join_filed": {      // marks this document as belonging to the parent level
    "name": "parent",
    "parent": "1"      // ID of the parent document; the key name "parent" is fixed
  }
}
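Because a child document must live on the same shard as its parent, the routing parameter is required; here it is set to the ID of the parent document.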
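To see why the routing value matters, here is a minimal sketch of how a routing key is mapped to a shard. It is a simplification under stated assumptions: Elasticsearch hashes the routing value with Murmur3, while this sketch uses String.hashCode() as a stand-in, and the names RoutingSketch and shardFor are hypothetical.

```java
// Simplified illustration only: Elasticsearch itself hashes the routing value
// with Murmur3; String.hashCode() is used here as a stand-in.
final class RoutingSketch {

    // Maps a routing value to a shard number in [0, numberOfShards).
    static int shardFor(String routing, int numberOfShards) {
        return Math.floorMod(routing.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        // Without ?routing, a document is routed by its own _id.
        int grandParentShard = shardFor("1", shards);
        int parentByOwnId = shardFor("2", shards);
        // With ?routing=1, the parent document is routed by the grandparent's id.
        int parentByRouting = shardFor("1", shards);

        System.out.println("grand_parent (id=1)       -> shard " + grandParentShard);
        System.out.println("parent (id=2, no routing) -> shard " + parentByOwnId);
        System.out.println("parent (id=2, routing=1)  -> shard " + parentByRouting);
    }
}
```

With this stand-in hash, the document with ID 2 would land on a different shard than ID 1 unless it is routed with the value 1, which is the situation the routing parameter avoids.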
RestHighLevelClient implementation
public void addParent(String parentId, String name) {
    String id = UUID.randomUUID().toString();
    JoinFamily member = new JoinFamily();
    member.setName(name);
    member.setLevel("1");
    member.setMy_id(id);
    // The join field carries both the relation name and the parent's ID
    JoinField joinField = new JoinField();
    joinField.setName("parent");
    joinField.setParent(parentId);
    member.setJoin_filed(joinField);
    String source = JSON.toJSONString(member);
    log.info("source: " + source);
    // Route by the parent's ID so that both documents end up on the same shard
    IndexRequest indexRequest = new IndexRequest("my-index-join_family").id(id).source(source, XContentType.JSON)
            .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
            .routing(parentId);
    try {
        IndexResponse out = client.index(indexRequest, RequestOptions.DEFAULT);
        log.info(out.getId());
    } catch (IOException e) {
        log.error("", e);
    }
}
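Child nodes are inserted the same way as the parent level, so the details are omitted here. One thing to watch: the routing value should still be the ID of the top-level ancestor (the grand_parent document), so that the whole family stays on one shard.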
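As a rough sketch of what a child insert could look like, the following builds the REST request line and body for a child document. The IDs ("3" for the child, "2" for its parent, "1" for the grandparent) and the names ChildRequestSketch, requestLine and body are assumptions made for illustration; values are assumed not to need JSON escaping.

```java
// Hypothetical helper: builds the REST request for a child document.
// Assumes ids and names contain no characters that need JSON or URL escaping.
final class ChildRequestSketch {

    // The routing value is the top-level ancestor's id, not the direct parent's id.
    static String requestLine(String index, String id, String rootId) {
        return "PUT " + index + "/_doc/" + id + "?routing=" + rootId + "&refresh";
    }

    // "join_filed" is the join field name used in this article's mapping.
    static String body(String id, String name, String relation, String parentId) {
        return "{"
                + "\"my_id\":\"" + id + "\","
                + "\"name\":\"" + name + "\","
                + "\"join_filed\":{\"name\":\"" + relation + "\",\"parent\":\"" + parentId + "\"}"
                + "}";
    }

    public static void main(String[] args) {
        // Child "3" points at parent "2" but is routed by grandparent "1".
        System.out.println(requestLine("my-index-join_family", "3", "1"));
        System.out.println(body("3", "child", "child", "2"));
    }
}
```

The join field's parent value names the direct parent, while routing names the root of the family; the two differ as soon as the hierarchy has more than two levels.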
Query APIs
Parent-Id-Query
As the name suggests, this query looks up documents by the ID of their parent.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-parent-id-query.html#query-dsl-parent-id-query
Rest API
GET /my-index-join_family/_search
{
  "query": {
    "parent_id": {
      "type": "parent",   // relation name of the child documents to return
      "id": "1"           // ID of the parent document
    }
  }
}
RestHighLevelClient implementation
@Test
public void testParentId() throws IOException {
    SearchRequest search = new SearchRequest("my-index-join_family");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // First argument: child relation name; second argument: parent document ID
    ParentIdQueryBuilder build = JoinQueryBuilders.parentId("parent", "1");
    searchSourceBuilder.query(build);
    search.source(searchSourceBuilder);
    SearchResponse response = client.search(search, RequestOptions.DEFAULT);
    response.getHits().forEach(hi -> {
        System.out.println(hi.getSourceAsString());
    });
}
Has-Parent
Returns the child documents whose parent matches the given condition.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-has-parent-query.html
Rest API
GET /my-index-join_family/_search
{
  "query": {
    "has_parent": {
      "parent_type": "grand_parent",
      "query": {
        "match": {
          "name": "grandPa"
        }
      }
    }
  }
}
RestHighLevelClient implementation
@Test
public void testHasParent() throws IOException {
    SearchRequest search = new SearchRequest("my-index-join_family");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Match grand_parent documents named "grandPa" (as in the REST example) and return their children;
    // the last argument (false) means the parent's score is not used
    HasParentQueryBuilder build = JoinQueryBuilders.hasParentQuery("grand_parent", QueryBuilders.matchQuery("name", "grandPa"), false);
    searchSourceBuilder.query(build);
    search.source(searchSourceBuilder);
    SearchResponse response = client.search(search, RequestOptions.DEFAULT);
    response.getHits().forEach(hi -> {
        System.out.println(hi.getSourceAsString());
    });
}
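As this shows, a parent_id query is effectively a special case of has_parent: it selects children by the parent's ID rather than by an arbitrary condition on the parent.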
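The relationship between the two queries can be sketched in memory. The following is not Elasticsearch code: Doc, sampleDocs, hasParent and parentId are hypothetical names, and the sample documents are invented for illustration. It only shows that filtering by parent ID gives the same children as a has_parent-style filter whose condition is "the parent's ID equals 1".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// In-memory illustration only; none of this is Elasticsearch API.
final class ParentQuerySketch {

    static final class Doc {
        final String id;
        final String name;
        final String relation; // "grand_parent", "parent" or "child"
        final String parentId; // null for top-level documents

        Doc(String id, String name, String relation, String parentId) {
            this.id = id;
            this.name = name;
            this.relation = relation;
            this.parentId = parentId;
        }
    }

    // Invented sample data: two families.
    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc("1", "grandPa", "grand_parent", null),
                new Doc("2", "parent", "parent", "1"),
                new Doc("3", "parent2", "parent", "1"),
                new Doc("4", "grandPa2", "grand_parent", null),
                new Doc("5", "parent3", "parent", "4"),
                new Doc("6", "child", "child", "2"));
    }

    // has_parent: children whose parent has the given relation and satisfies the condition.
    static List<Doc> hasParent(List<Doc> docs, String parentType, Predicate<Doc> condition) {
        List<Doc> result = new ArrayList<>();
        for (Doc child : docs) {
            for (Doc parent : docs) {
                if (parent.id.equals(child.parentId)
                        && parent.relation.equals(parentType)
                        && condition.test(parent)) {
                    result.add(child);
                }
            }
        }
        return result;
    }

    // parent_id: documents of the given child relation whose parent id equals the given id.
    static List<Doc> parentId(List<Doc> docs, String childType, String id) {
        List<Doc> result = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.relation.equals(childType) && id.equals(doc.parentId)) {
                result.add(doc);
            }
        }
        return result;
    }

    static List<String> ids(List<Doc> docs) {
        List<String> out = new ArrayList<>();
        for (Doc doc : docs) {
            out.add(doc.id);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        System.out.println(ids(parentId(docs, "parent", "1")));
        System.out.println(ids(hasParent(docs, "grand_parent", p -> p.id.equals("1"))));
    }
}
```

One difference the sketch glosses over: the real parent_id query takes the child relation name, while has_parent takes the parent relation name.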
Has-Child
Returns the parent documents whose children match the given condition.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-has-child-query.html
Rest API
GET /my-index-join_family/_search
{
  "query": {
    "has_child": {
      "type": "parent",
      "query": {
        "match_all": {}
      }
    }
  }
}
RestHighLevelClient implementation
@Test
public void testHasChild() throws IOException {
    SearchRequest search = new SearchRequest("my-index-join_family");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Match child documents of relation "parent" by name and return their parents.
    // The mapping has no "parent" field, so the condition is placed on "name";
    // "parent5" is sample data, and the REST example above uses match_all instead.
    HasChildQueryBuilder build = JoinQueryBuilders.hasChildQuery("parent", QueryBuilders.matchQuery("name", "parent5"), ScoreMode.None);
    searchSourceBuilder.query(build);
    search.source(searchSourceBuilder);
    SearchResponse response = client.search(search, RequestOptions.DEFAULT);
    response.getHits().forEach(hi -> {
        System.out.println(hi.getSourceAsString());
    });
}
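Querying a specific level
Besides the queries above, all documents at one particular level can be fetched like this:
GET my-index-join_family/_search
{
  "query": {
    "match": {
      "join_filed": "parent"   // "parent" here is the relation name defined in relations
    }
  }
}
Aggregating documents
For related documents, aggregating along the relationship is a very common requirement. For parent-child documents this requires the dedicated children aggregation type.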
Rest API
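GET my-index-join_family/_search
{
  "size": 0,
  "query": {
    "match": {
      "join_filed": "parent"
    }
  },
  "aggs": {
    "parent_agg": {          // (1)
      "terms": {
        "field": "my_id",
        "size": 10
      },
      "aggs": {
        "child_agg": {
          "children": {      // (2)
            "type": "child"
          }
        }
      }
    }
  }
}
The first level (1) buckets the parent documents by my_id, and the second level (2) uses the children aggregation to collect the child documents of each bucket. The result looks like this: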
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "parent_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "0088519a-797f-4155-868c-cabb5e9a8a9e",
          "doc_count" : 1,
          "child_agg" : {
            "doc_count" : 0
          }
        },
        {
          "key" : "0893e4fd-daf6-48ca-8f96-ee5242aaea42",
          "doc_count" : 1,
          "child_agg" : {
            "doc_count" : 2
          }
        },
        {
          "key" : "136fa236-d3f2-4a19-98d8-a8460de531c3",
          "doc_count" : 1,
          "child_agg" : {
            "doc_count" : 1
          }
        }
      ]
    }
  }
}
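To make the shape of this result easier to read, here is a small in-memory sketch of what the two aggregation levels compute: one bucket per parent, then a count of that parent's children. It does not call Elasticsearch; ChildrenAggSketch, sampleDocs and childCountPerParent are hypothetical names, and the short IDs stand in for the UUID keys above.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of "terms on parents, then children count"; not Elasticsearch API.
final class ChildrenAggSketch {

    // Each row is {id, relation, parentId}; parentId is null for top-level documents.
    static String[][] sampleDocs() {
        return new String[][] {
            {"g1", "grand_parent", null},
            {"p1", "parent", "g1"},
            {"p2", "parent", "g1"},
            {"p3", "parent", "g1"},
            {"c1", "child", "p2"},
            {"c2", "child", "p2"},
            {"c3", "child", "p3"},
        };
    }

    static Map<String, Integer> childCountPerParent(String[][] docs) {
        // Level 1: one bucket per parent document, keyed by its id.
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            if ("parent".equals(doc[1])) {
                counts.put(doc[0], 0);
            }
        }
        // Level 2: for each bucket, count the documents of relation "child" pointing at it.
        for (String[] doc : docs) {
            if ("child".equals(doc[1]) && doc[2] != null && counts.containsKey(doc[2])) {
                counts.merge(doc[2], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        childCountPerParent(sampleDocs())
                .forEach((parent, children) -> System.out.println(parent + " childCount " + children));
    }
}
```

The three sample parents end up with 0, 2 and 1 children, mirroring the child_agg doc_count values in the response above.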
To process the child documents further, keep nesting sub-aggregations, for example a scripted_metric that concatenates the child names:
GET my-index-join_family/_search
{
  "size": 0,
  "query": {
    "match": {
      "join_filed": "parent"
    }
  },
  "aggs": {
    "parent_agg": {
      "terms": {
        "field": "my_id",
        "size": 10
      },
      "aggs": {
        "child_agg": {
          "children": {
            "type": "child"
          },
          "aggs": {
            "child_name": {
              "scripted_metric": {
                "init_script": "state.transactions = ''",
                "map_script": "state.transactions=state.transactions+' '+doc.name",
                "combine_script": " return state.transactions",
                "reduce_script": "String profit = ''; for (a in states) { profit += a } return profit"
              }
            }
          }
        }
      }
    }
  }
}
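The four scripts run in distinct phases: init, map and combine run on each shard, and reduce merges the per-shard results. A minimal plain-Java sketch of that flow is shown below. It is an illustration under assumptions, not Painless: ScriptedMetricSketch, combineShard and reduce are hypothetical names, and plain strings stand in for doc.name, which in a real script is a doc-values object rather than a bare string.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the scripted_metric phases; not Elasticsearch API.
final class ScriptedMetricSketch {

    // Runs init, map and combine for the documents held by a single shard.
    static String combineShard(List<String> namesOnShard) {
        String transactions = "";                       // init_script
        for (String name : namesOnShard) {              // map_script, once per document
            transactions = transactions + " " + name;
        }
        return transactions;                            // combine_script
    }

    // reduce_script: concatenates the per-shard states into the final value.
    static String reduce(List<String> states) {
        StringBuilder profit = new StringBuilder();
        for (String state : states) {
            profit.append(state);
        }
        return profit.toString();
    }

    public static void main(String[] args) {
        String shard0 = combineShard(Arrays.asList("childA", "childB"));
        String shard1 = combineShard(Arrays.asList("childC"));
        System.out.println(reduce(Arrays.asList(shard0, shard1)));
    }
}
```

Because parent and child documents share a shard through routing, each family's names are combined on a single shard in practice; the reduce step still runs to merge whatever states the shards return.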
RestHighLevelClient implementation
@Test
public void testAggChild() throws IOException {
    SearchRequest search = new SearchRequest("my-index-join_family");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchQuery("join_filed", "parent"));
    // Bucket the parent documents, then collect each bucket's children
    AggregationBuilder build = AggregationBuilders.terms("parent_agg").field("name")
            .subAggregation(JoinAggregationBuilders.children("child_agg", "child"));
    searchSourceBuilder.aggregation(build);
    search.source(searchSourceBuilder);
    SearchResponse response = client.search(search, RequestOptions.DEFAULT);
    Map<String, Aggregation> map = response.getAggregations().getAsMap();
    Terms terms = (Terms) map.get("parent_agg");
    terms.getBuckets().forEach(bucket -> {
        System.out.println(bucket.getKeyAsString() + " " + bucket.getDocCount());
        Map<String, Aggregation> subMap = bucket.getAggregations().getAsMap();
        Children children = (Children) subMap.get("child_agg");
        System.out.println("childCount " + children.getDocCount());
    });
}
@Test
public void testScriptedMetric() throws IOException {
    SearchRequest search = new SearchRequest("my-index-join_family");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchQuery("join_filed", "parent"));
    // terms on parents -> children -> scripted_metric that concatenates the child names
    AggregationBuilder build = AggregationBuilders.terms("parent_agg").field("name")
            .subAggregation(JoinAggregationBuilders.children("child_agg", "child")
                    .subAggregation(AggregationBuilders.scriptedMetric("child_name")
                            .initScript(new Script("state.transactions = []"))
                            .mapScript(new Script("state.transactions.add(doc.name)"))
                            .combineScript(new Script("String profit =''; for (t in state.transactions) { profit += t } return profit"))
                            .reduceScript(new Script("String profit = ''; for (a in states) { profit += a } return profit")))
            );
    searchSourceBuilder.aggregation(build);
    search.source(searchSourceBuilder);
    SearchResponse response = client.search(search, RequestOptions.DEFAULT);
    Map<String, Aggregation> map = response.getAggregations().getAsMap();
    Terms terms = (Terms) map.get("parent_agg");
    terms.getBuckets().forEach(bucket -> {
        System.out.println(bucket.getKeyAsString() + " " + bucket.getDocCount());
        Map<String, Aggregation> subMap = bucket.getAggregations().getAsMap();
        Children children = (Children) subMap.get("child_agg");
        System.out.println("childCount " + children.getDocCount());
        Map<String, Aggregation> subSubMap = children.getAggregations().getAsMap();
        ScriptedMetric metric = (ScriptedMetric) subSubMap.get("child_name");
        System.out.println("childName " + metric.aggregation());
    });
}