Using Query DSL with Spring Boot

  • Query DSL
  • Data Preparation
  • match_all
  • Term-Level Queries
  • Term Query
  • Terms Query
  • Exists Query
  • Ids Query
  • Range Query
  • Prefix Query
  • Wildcard Query
  • Fuzzy Query


Query DSL

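Elasticsearch provides a complete JSON-based Query DSL (Domain Specific Language) for defining queries.
Because Query DSL talks to ES by sending a JSON request body over the REST API, it is easy to integrate from Spring Boot. This article shows how to implement each kind of DSL query with Spring Boot.
Elasticsearch official website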

数据准备

Create an index named dsl_index and insert some sample data. Throughout this article, Spring Boot operates on Elasticsearch data through RestHighLevelClient.

Mapping of the dsl_index index:

GET dsl_index/_mappings

{
  "dsl_index" : {
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "long"
        },
        "description" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "analyzer" : "ik_max_word",
          "search_analyzer" : "ik_smart"
        }
      }
    }
  }
}

Data in the dsl_index index:

POST /dsl_index/_bulk
{"index":{"_id":1}}
{"name":"张三","age":11,"description":"南京市 羽毛球爱好者"}
{"index":{"_id":2}}
{"name":"王五","age":15,"description":"北京市 篮球两年半"}
{"index":{"_id":3}}
{"name":"李四","age":18,"description":"山东省 游泳健身"}
{"index":{"_id":4}}
{"name":"富贵","age":22,"description":"天津市 游泳打球"}
{"index":{"_id":5}}
{"name":"来福","age":8,"description":"安徽合肥 职业代练"}
{"index":{"_id":6}}
{"name":"憨憨","age":27,"description":"北京市 健身打球"}
{"index":{"_id":7}}
{"name":"小七","age":31,"description":"北京市 游泳"}

match_all

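match_all matches every document in the specified index, but only 10 documents are returned by default. The reason is that _search is paginated by default, with from=0 and size=10. To return more documents, set size explicitly.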
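
As a minimal sketch of the paging parameters described above, the helper below builds a match_all request body with explicit from and size values. The class and method names (MatchAllPaging, body) are illustrative assumptions and not part of this article's project; only the JSON keys from, size, query and match_all come from the _search API.

```java
// Hypothetical helper (not from the article): builds a paged match_all body.
final class MatchAllPaging {

    /** Returns a _search body that skips `from` hits and returns up to `size` hits. */
    static String body(int from, int size) {
        if (from < 0 || size < 0) {
            throw new IllegalArgumentException("from and size must be non-negative");
        }
        return "{\"from\":" + from + ",\"size\":" + size
                + ",\"query\":{\"match_all\":{}}}";
    }

    public static void main(String[] args) {
        // Ask for up to 100 documents, starting at the first hit.
        System.out.println(body(0, 100));
    }
}
```

With RestHighLevelClient, the same paging should be achievable through the from(int) and size(int) setters of SearchSourceBuilder.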

DSL: query all documents in the index (first 10 by default)

GET dsl_index/_search
{
  "query": {
    "match_all": {}
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 7,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "南京市 羽毛球爱好者"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "王五",
          "age" : 15,
          "description" : "北京市 篮球两年半"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "李四",
          "age" : 18,
          "description" : "山东省 游泳健身"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "富贵",
          "age" : 22,
          "description" : "天津市 游泳打球"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "来福",
          "age" : 8,
          "description" : "安徽合肥 职业代练"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "name" : "憨憨",
          "age" : 27,
          "description" : "北京市 健身打球"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "name" : "小七",
          "age" : 31,
          "description" : "北京市 游泳"
        }
      }
    ]
  }
}

Spring Boot implementation:

Code:
    // Index name used by all examples below
    private static final String INDEX_NAME = "dsl_index";

    // Client used by all examples below
    @Resource
    private RestHighLevelClient client;

    @RequestMapping(value = "/matchAll", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - match_all")
    public void matchAll() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Match all documents
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
		{name=张三, description=南京市 羽毛球爱好者, age=11}
		{name=王五, description=北京市 篮球两年半, age=15}
		{name=李四, description=山东省 游泳健身, age=18}
		{name=富贵, description=天津市 游泳打球, age=22}
		{name=来福, description=安徽合肥 职业代练, age=8}
		{name=憨憨, description=北京市 健身打球, age=27}
		{name=小七, description=北京市 游泳, age=31}

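Term-Level Queries

Official documentation

A term-level query matches the search value against indexed terms directly, without running it through text analysis. This is similar to a SQL lookup in a database, and the target is usually a non-text field of the index.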

Term Query

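A term query returns the documents that contain the exact search value. It is commonly used on keyword fields and works like "=" in SQL. Avoid running term queries against text fields: a text field is split into tokens at index time, so the query is pointless and will most likely return nothing.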
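
To illustrate why a term query on a text field tends to find nothing, here is a minimal in-memory sketch. It assumes a rough approximation of the default analyzer on Chinese text (whitespace dropped, every remaining character becomes its own token); the class and method names are hypothetical, and this is not how Elasticsearch is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: term lookup against analyzed vs. keyword values.
final class TermVsText {

    /** Rough stand-in for analyzing CJK text: one token per non-whitespace character. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        return tokens;
    }

    /** A term query compares the search value verbatim with the indexed terms. */
    static boolean termMatches(List<String> indexedTerms, String value) {
        return indexedTerms.contains(value);
    }

    public static void main(String[] args) {
        String description = "北京市 游泳";
        List<String> textTerms = analyze(description);           // analyzed text field
        List<String> keywordTerms = Arrays.asList(description);  // keyword field keeps the whole value

        System.out.println(termMatches(textTerms, description));    // false
        System.out.println(termMatches(keywordTerms, description)); // true
    }
}
```

The whole value only survives as a single term in a keyword field, which is why the article's mapping exposes description.keyword next to the analyzed description field.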

DSL: query documents in the index where age = 31

GET dsl_index/_search
{
  "query": {
    "term": {
      "age": {
        "value": "31"
      }
    }
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "name" : "小七",
          "age" : 31,
          "description" : "北京市 游泳"
        }
      }
    ]
  }
}

Spring Boot implementation:

Code:
    @RequestMapping(value = "/term", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - term")
    public void term() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.termQuery("age",31)));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=小七, description=北京市 游泳, age=31}

Terms Query

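A terms query matches multiple terms on a given field. A document matches if the field contains any one of the given terms exactly.

DSL: query documents in the index where age is 31 or 15, similar to age in ('15','31') in MySQL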

GET dsl_index/_search
{
  "query": {
    "terms": {
      "age": ["31","15"]
    }
  }
}

The response is:
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "王五",
          "age" : 15,
          "description" : "北京市 篮球两年半"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "name" : "小七",
          "age" : 31,
          "description" : "北京市 游泳"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/terms", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - terms")
    public void terms() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.termsQuery("age", new String[]{"15", "31"})));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }
    
The output is:
{name=王五, description=北京市 篮球两年半, age=15}
{name=小七, description=北京市 游泳, age=31}

Exists Query

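In Elasticsearch, an exists query returns the documents that contain an indexed value for the given field, so it can be used to check whether documents have that field.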
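
The matching rule can be sketched in plain Java over a document's source map. This is a simplified model under stated assumptions: it only covers the null and empty-array cases (a field counts as missing when it is absent, null, or an array with no non-null element) and ignores mapping settings that can also prevent a value from being indexed. The class and method names are hypothetical.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the exists check on a document source.
final class ExistsSketch {

    static boolean exists(Map<String, Object> source, String field) {
        Object value = source.get(field);
        if (value == null) {
            return false; // field absent or explicitly null
        }
        if (value instanceof Collection) {
            // [] and [null] carry no value; [null, "x"] does.
            for (Object element : (Collection<?>) value) {
                if (element != null) {
                    return true;
                }
            }
            return false;
        }
        return true; // any other value counts, including the empty string
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "张三");
        doc.put("age", 11);
        System.out.println(exists(doc, "age")); // true
        System.out.println(exists(doc, "sex")); // false
    }
}
```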

DSL: check whether any document in the index has a sex field

GET dsl_index/_search
{
  "query": {
    "exists": {
      "field": "sex"
    }
  }
}

The response is shown below. hits is empty, which means no document in the index has a sex field.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/exists", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - exists")
    public void exists() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.existsQuery("sex")));
        // Print the number of returned hits
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        SearchHits hits = searchResponse.getHits();
        System.out.println("Length of returned hits array: " + hits.getHits().length);
    }

The output is:
Length of returned hits array: 0

Ids Query

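ids keyword: its values are matched against each document's default primary key (_id), so it fetches the documents that correspond to a set of IDs.

DSL: query documents whose _id is 1 or 2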

GET dsl_index/_search
{
  "query": {
    "ids": {
      "values": [1,2]
    }
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "南京市 羽毛球爱好者"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "王五",
          "age" : 15,
          "description" : "北京市 篮球两年半"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/ids", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - ids")
    public void ids() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Option 1: terms query on _id
        // searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.termsQuery("_id","1","2")));
        // Option 2: ids query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.idsQuery().addIds(new String[]{"1","2"})));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=张三, description=南京市 羽毛球爱好者, age=11}
{name=王五, description=北京市 篮球两年半, age=15}

Range Query

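Range query:

  • range: the range keyword
  • gte: greater than or equal to
  • lte: less than or equal to
  • gt: greater than
  • lt: less than
  • now: the current time (for date fields)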
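
The inclusive and exclusive bounds listed above can be sketched as a plain Java filter over the ages in the sample data. This is only an in-memory illustration of the comparison semantics; the class and method names are hypothetical, and a null bound stands for "not set".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of gte / gt / lte / lt semantics.
final class RangeSketch {

    /** Returns true when value satisfies every bound that is set (non-null). */
    static boolean inRange(long value, Long gte, Long gt, Long lte, Long lt) {
        if (gte != null && value < gte) return false;  // gte: value >= bound
        if (gt != null && value <= gt) return false;   // gt:  value >  bound
        if (lte != null && value > lte) return false;  // lte: value <= bound
        if (lt != null && value >= lt) return false;   // lt:  value <  bound
        return true;
    }

    static List<Long> filter(List<Long> values, Long gte, Long gt, Long lte, Long lt) {
        List<Long> result = new ArrayList<>();
        for (Long value : values) {
            if (inRange(value, gte, gt, lte, lt)) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Ages of the seven sample documents, in _id order.
        List<Long> ages = Arrays.asList(11L, 15L, 18L, 22L, 8L, 27L, 31L);
        // Same bounds as "gte": 20, "lte": 30.
        System.out.println(filter(ages, 20L, null, 30L, null)); // [22, 27]
    }
}
```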

DSL: query documents in the index where age >= 20 and age <= 30

GET dsl_index/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "富贵",
          "age" : 22,
          "description" : "天津市 游泳打球"
        }
      },
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "name" : "憨憨",
          "age" : 27,
          "description" : "北京市 健身打球"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/range", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - range")
    public void range() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.rangeQuery("age").gte(20).lte(30)));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=富贵, description=天津市 游泳打球, age=22}
{name=憨憨, description=北京市 健身打球, age=27}

Prefix Query

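Prefix query:

  • The search string is not analyzed; the prefix passed in is exactly the prefix that is looked up.
  • How prefix works: it walks the terms in the inverted index and checks whether each term starts with the given prefix.
  • By default, a prefix query does not compute relevance scores. It simply returns every matching document with a score of 1, so it behaves more like a filter than a query. In older Elasticsearch versions, which had a separate prefix filter, the practical difference was that the filter could be cached while the prefix query could not.

The index currently contains the following data:
{name=张三, description=南京市 羽毛球爱好者, age=11}
{name=王五, description=北京市 篮球两年半, age=15}
{name=李四, description=山东省 游泳健身, age=18}
{name=富贵, description=天津市 游泳打球, age=22}
{name=来福, description=安徽合肥 职业代练, age=8}
{name=憨憨, description=北京市 健身打球, age=27}
{name=小七, description=北京市 游泳, age=31}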

DSL: query documents whose description starts with "南"

GET  dsl_index/_search
{
  "query": {
    "prefix": {
      "description": {
        "value": "南"
      }
    }
  }
}
The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "南京市 羽毛球爱好者"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/prefix", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - prefix")
    public void prefix() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.prefixQuery("description","南")));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=张三, description=南京市 羽毛球爱好者, age=11}

Question: a prefix query on description for "南" returns data, but one for "南京" returns nothing.

DSL:

GET  dsl_index/_search
{
  "query": {
    "prefix": {
      "description": {
        "value": "南京"
      }
    }
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

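Reason: as described above, a prefix query does not analyze the search value; it compares the value directly against the terms in the inverted index.
When the index was created, the description field was left on the default analyzer, "standard", which splits Chinese text into individual characters. For example, "南京市" becomes "南", "京" and "市". The prefix "南" therefore matches a term and returns data, while no term starts with "南京", so nothing is returned.

How the "standard" analyzer tokenizes "南京市":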

POST _analyze
{
  "analyzer": "standard",
  "text": ["南京市"]
}

The result is:
{
  "tokens" : [
    {
      "token" : "南",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "京",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "市",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    }
  ]
}

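So how can we find the data by searching for "南京"? When creating the index, set the field's analyzer to "ik_max_word", which splits text at a fine granularity. Depending on the scenario, another analyzer or a custom one works too. Here is how "ik_max_word" tokenizes "南京市":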

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": ["南京市"]
}

The result is:
{
  "tokens" : [
    {
      "token" : "南京市",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "南京",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "市",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 2
    }
  ]
}
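
The difference between the two analyzers can be replayed with a small in-memory sketch. It takes the token lists exactly as returned by the two _analyze calls above and applies the prefix rule (some indexed term must start with the given prefix). The class and method names are hypothetical; this only models the matching rule, not Elasticsearch itself.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: prefix matching against indexed terms.
final class PrefixSketch {

    /** True when at least one indexed term starts with the (unanalyzed) prefix. */
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Tokens for "南京市" as shown in the two _analyze responses.
        List<String> standardTerms = Arrays.asList("南", "京", "市");
        List<String> ikMaxWordTerms = Arrays.asList("南京市", "南京", "市");

        System.out.println(prefixMatches(standardTerms, "南"));    // true
        System.out.println(prefixMatches(standardTerms, "南京"));  // false
        System.out.println(prefixMatches(ikMaxWordTerms, "南京")); // true
    }
}
```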

Wildcard Query

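Wildcard query: it works the same way as prefix, except that it is not limited to comparing the start of a term and supports more complex matching patterns.
Note: both prefix and wildcard queries operate on the terms in the inverted index.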
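
As a rough sketch of wildcard matching against indexed terms, the code below translates a pattern into a regular expression, with * standing for any sequence of characters and ? for exactly one character, and requires the whole term to match. The class and method names are hypothetical, backslash escaping is not handled, and the single-character term list assumes the standard analyzer behaviour described for description in this article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration: wildcard patterns matched against whole terms.
final class WildcardSketch {

    /** Converts a wildcard pattern ('*' and '?') into an equivalent regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** True when the pattern matches at least one indexed term in full. */
    static boolean matchesAny(List<String> indexedTerms, String wildcard) {
        Pattern pattern = toRegex(wildcard);
        for (String term : indexedTerms) {
            if (pattern.matcher(term).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "北京市 篮球两年半" split into single-character terms.
        List<String> textTerms = Arrays.asList("北", "京", "市", "篮", "球", "两", "年", "半");
        List<String> keywordTerms = Arrays.asList("北京市 篮球两年半");

        System.out.println(matchesAny(textTerms, "篮"));          // true
        System.out.println(matchesAny(textTerms, "篮球*"));       // false
        System.out.println(matchesAny(keywordTerms, "*篮球*"));   // true
    }
}
```

A pattern without any wildcard character, such as "篮", behaves like an exact term lookup, which is why it only finds data when "篮" exists as a term of its own.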

DSL: query documents in the index whose description field contains "篮"

GET  dsl_index/_search
{
  "query": {
    "wildcard": {
      "description": {
        "value": "篮"
      }
    }
  }
}

The response is:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "王五",
          "age" : 15,
          "description" : "北京市 篮球两年半"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/wildcard", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - wildcard")
    public void wildcard() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(QueryBuilders.wildcardQuery("description","篮")));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=王五, description=北京市 篮球两年半, age=15}

Fuzzy Query

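Fuzzy query:

In real searches we sometimes mistype a word and get no results. In Elasticsearch, the fuzziness parameter enables fuzzy matching, so a search can still succeed when the input contains typos.

A fuzzy query relies on two important parameters, fuzziness and prefix_length.

  • fuzziness: the number of edit operations allowed to turn the input keyword into the term stored in the corresponding ES field
    • An operation is inserting one character, deleting one character, or replacing one character; each operation counts as an edit distance of 1.
    • For example, the edit distance from 中文集团 to 中威集团 is 1, because only one character has to be replaced. If fuzziness is set to 2 here, 东东集团, which is at edit distance 2, is returned as well.
    • A value of 0 disables fuzzy matching. If the parameter is omitted, the fuzzy query uses AUTO, which derives the allowed distance from the term length. The maximum edit distance must be between 0 and 2.
  • prefix_length: the number of leading characters of the input keyword that must match the field's term exactly, with no typos allowed
    • For example, with a value of 1 the first character must match; otherwise the document is not returned.
    • The default value is 0.
    • Increasing prefix_length can improve both efficiency and precision.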
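
The interaction of fuzziness and prefix_length can be sketched with a plain Levenshtein distance, as below. This is a simplified model under stated assumptions: it counts only insertions, deletions and replacements, whereas Elasticsearch by default also treats swapping two adjacent characters as a single edit. The class and method names are hypothetical.

```java
// Hypothetical illustration of fuzziness and prefix_length.
final class FuzzySketch {

    /** Classic Levenshtein distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    /** True when term is within `fuzziness` edits of input and shares the required prefix. */
    static boolean fuzzyMatches(String term, String input, int fuzziness, int prefixLength) {
        String requiredPrefix = input.substring(0, Math.min(prefixLength, input.length()));
        if (!term.startsWith(requiredPrefix)) {
            return false; // the first prefixLength characters must match exactly
        }
        return editDistance(term, input) <= fuzziness;
    }

    public static void main(String[] args) {
        System.out.println(editDistance("中文集团", "中威集团")); // 1
        System.out.println(editDistance("中文集团", "东东集团")); // 2
        // The first character differs, so prefix_length = 1 rejects the term.
        System.out.println(fuzzyMatches("东文集团", "中文集团", 1, 1)); // false
    }
}
```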

DSL: the index currently contains this document: {name=王五, description=北京市 篮球两年半, age=15}
We search the description field for "北京市 足球两年半", deliberately getting one character wrong.

1. With fuzziness=0:
GET  dsl_index/_search
{
  "query": {
    "fuzzy": {
      "description.keyword": {
        "value": "北京市 足球两年半",
        "fuzziness": 0
      }
    }
  }
}
No data is returned:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

2. With fuzziness=1:
GET  dsl_index/_search
{
  "query": {
    "fuzzy": {
      "description.keyword": {
        "value": "北京市 足球两年半",
        "fuzziness": 1
      }
    }
  }
}

The response is:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.4879789,
    "hits" : [
      {
        "_index" : "dsl_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.4879789,
        "_source" : {
          "name" : "王五",
          "age" : 15,
          "description" : "北京市 篮球两年半"
        }
      }
    ]
  }
}

Spring Boot implementation:

Search the description field for "北京市 足球两年半", allowing one wrong character:
    @RequestMapping(value = "/fuzzy", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - fuzzy")
    public void fuzzy() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Query
        searchRequest.source(new SearchSourceBuilder().query(
                QueryBuilders.fuzzyQuery("description.keyword","北京市 足球两年半").fuzziness(Fuzziness.ONE)));
        // Print the returned data
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is:
{name=王五, description=北京市 篮球两年半, age=15}

That covers the term-level queries of Query DSL and how to use them with Spring Boot. A follow-up article will cover full-text queries with Spring Boot.