文章目录

  • 搜索和查询
  • 查询的上下文
  • Query DSL(Domain Specific Language)
  • 1 查询上下文
  • 2 相关度评分:_score
  • 3 元数据:_source
  • 4 Query String
  • 查询所有:
  • 带参数:
  • 分页:
  • 精准匹配 exact value
  • _all搜索 相当于在所有有索引的字段中检索
  • 5 全文检索-Fulltext query
  • match:匹配包含某个term的子句
  • match_all:匹配所有结果的子句
  • multi_match:多字段条件
  • match_phrase:短语查询,
  • 6 精准查询-Term query
  • term:匹配和搜索词项完全相等的结果
  • terms:匹配和搜索词项列表中任意项匹配的结果
  • range:范围查找
  • 7 过滤器-Filter
  • 8 组合查询-Bool query
  • 数据代码


搜索和查询

查询的上下文

{
	  #请求消耗的时间 ms
	  "took" : 722,
	  #是否超时
	  "timed_out" : false,
	  #当前请求的分片
	  "_shards" : {
	  	#分片的总数
	    "total" : 1,
	    #成功的个数
	    "successful" : 1,
	    #跳过的个数
	    "skipped" : 0,
	    #失败的个数
	    "failed" : 0
	  },
	  #真正返回的结果
	  "hits" : {
	  	#结果的总数
	    "total" : {
	      #查询到的结果个数,不是返回显示的个数
	      "value" : 2,
	      #查询关系
	      "relation" : "eq"
	    },
	    #当前最大的评分
	    "max_score" : 1.0,
	    #具体的结果
	    "hits" : [
	      {
	      	#索引
	        "_index" : "product",
	        #类型
	        "_type" : "_doc",
	        #id
	        "_id" : "1",
	        #相关度评分
	        "_score" : 1.0,
	        #具体的结果详情,元数据
	        "_source" : {
	          "name" : "xiami phone",
	          "desc" : "shouji zhong de jianjiji",
	          "price" : 4999,
	          "tags" : [
	            "xingjiabi",
	            "fashao",
	            "buka"
	          ]
	        }
	      },
	      {
	        "_index" : "product",
	        "_type" : "_doc",
	        "_id" : "2",
	        "_score" : 1.0,
	        "_source" : {
	          "name" : "xiami nfc phone",
	          "desc" : "zhichi quangongneng nfc,shouji zhong de jianjiji",
	          "price" : 49995,
	          "tags" : [
	            "xingjiabi",
	            "fashao",
	            "gongjiaoka"
	          ]
	        }
	      }
	    ]
	  }
	}

Query DSL(Domain Specific Language)

1 查询上下文

使用query关键字进行检索,倾向于相关度搜索,故需要计算评分。搜索是Elasticsearch最关键和重要的部分。

2 相关度评分:_score

概念:相关度评分用于对搜索结果排序,评分越高则认为其结果和搜索的预期值相关度越高,即越符合搜索预期值。在7.x之前相关度评分默认使用TF/IDF算法计算而来,7.x之后默认为BM25。在核心知识篇不必关心相关评分的具体原理,只需知晓其概念即可。

排序:相关度评分为搜索结果的排序依据,默认情况下评分越高,则结果越靠前。

3 元数据:_source

  1. 禁用_source:
  1. 好处:节省存储开销
  2. 坏处:
  • 不支持update、update_by_query和reindex API。
  • 不支持高亮。
  • 不支持reindex、更改mapping分析器和版本升级。
  • 通过查看索引时使用的原始文档来调试查询或聚合的功能。
  • 将来有可能自动修复索引损坏。
GET product2/_search
{
 "_source": ["owner.name","owner.sex"], 
 "query": {
   "match_all": {}
 }
}
**总结:如果只是为了节省磁盘,可以压缩索引比禁用_source更好。**
  1. 数据源过滤器:
    Including:结果中返回哪些field
    Excluding:结果中不要返回哪些field,不返回的field不代表不能通过该字段进行检索,因为元数据不存在不代表索引不存在
  1. 在mapping中定义过滤:支持通配符,但是这种方式不推荐,因为mapping不可变
PUT product
{
  "mappings": {
    "_source": {
      "includes": [
        "name",
        "price"
      ],
      "excludes": [
        "desc",
        "tags"
      ]
    }
  }
}
  1. 常用过滤规则
  • “_source”: “false”,
  • “_source”: “obj.*”,
  • “_source”: [ “obj1.*”, “obj2.*” ],
  • “_source”: {
    “includes”: [ “obj1.*”, “obj2.*” ],
    “excludes”: [ “*.description” ]
    }
GET product2/_search
{
  "_source": {
    "includes": [
      "owner.*",
      "name"
    ],
    "excludes": [
      "name", 
      "desc",
      "price"
    ]
  },
  "query": {
    "match_all": {}
  }

4 Query String

查询所有:

GET /product/_search

带参数:

GET /product/_search?q=name:xiaomi

分页:

GET /product/_search?from=0&size=2&sort=price:asc

精准匹配 exact value

GET /product/_search?q=date:2021-06-01

_all搜索 相当于在所有有索引的字段中检索

GET /product/_search?q=2021-06-01

DELETE product
# 验证_all搜索
PUT product
{
  "mappings": {
    "properties": {
      "desc": {
        "type": "text", 
        "index": false
      }
    }
  }
}
# 先初始化数据
POST /product/_update/5
{
  "doc": {
    "desc": "erji zhong de kendeji 2021-06-01"
  }
}

5 全文检索-Fulltext query

GET index/_search
{
  "query": {
    ***
  }
}
match:匹配包含某个term的子句
GET product/_search
		{
		  "query": {
		    "match": {
		      "name": "xiaomi nfc phone"
		    }
		  }
		}
match_all:匹配所有结果的子句
GET product/_search
{
  "query": {
    "match_all": {}
  }
}
multi_match:多字段条件
GET product/_search
{
  "query": {
    "multi_match": {
      "query": "phone huangmenji",
      "fields": ["name","desc"]
    }
  }
}
match_phrase:短语查询,
{
  "query": {
    "match_phrase": {
      "name": "xiaomi nfc"
    }
  }
}

6 精准查询-Term query

term:匹配和搜索词项完全相等的结果
  • term和match_phrase区别:
GET product/_search
{
  "query": {
    "match": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "term": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "match_phrase": {
      "name": "xiaomi phone"
    }
  }
}
match_phrase 会将检索关键词分词, match_phrase的分词结果必须在被检索字段的分词中都包含,而且顺序必须相同,而且默认必须都是连续的 

term搜索不会将搜索词分词
  • term和keyword区别
    term是对于搜索词不分词,
    keyword是字段类型,是对于source data中的字段值不分词
GET product/_search
{
  "query": {
    "term": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "term": {
      "name.keyword": "xiaomi phone"
    }
  }
}
terms:匹配和搜索词项列表中任意项匹配的结果
GET product/_search
{
  "query": {
    "terms": {
      "tags": [ "lowbee", "gongjiaoka" ],
      "boost": 1.0
    }
  }
}
range:范围查找
GET /_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}
GET product/_search
{
  "query": {
    "range": {
      "date": {
        "gte": "2021-04-15",
        "lt": "2021-04-16"
      }
    }
  }
}

7 过滤器-Filter

GET _search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}
  • filter:query和filter的主要区别在: filter是结果导向的而query是过程导向。query倾向于“当前文档和查询的语句的相关度”而filter倾向于“当前文档和查询的条件是不是相符”。即在查询过程中,query是要对查询的每个结果计算相关性得分的,而filter不会。另外filter有相应的缓存机制,可以提高查询效率。
GET product/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "name": "phone"
        }
      },
      "boost": 1.2
    }
  }
}
GET product/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "name": "phone"
        }
      }
    }
  }
}

8 组合查询-Bool query

bool:可以组合多个查询条件,bool查询也是采用more_matches_is_better的机制,因此满足must和should子句的文档将会合并起来计算分值

  • must:必须满足子句(查询)必须出现在匹配的文档中,并将有助于得分。
# bool query 组合查询

#must 计算相关度得分
#条件1:包含"xiaomi"或"phone"
#条件2:包含"shouji zhong"
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "xiaomi phone"
          }
        },
        {
          "match_phrase": {
            "desc": "shouji zhong"
          }
        }
      ]
    }
  }
}
  • filter:过滤器 不计算相关度分数,cache☆子句(查询)必须出现在匹配的文档中。但是不像 must查询的分数将被忽略。Filter子句在filter上下文中执行,这意味着计分被忽略,并且子句被考虑用于缓存。
#filter 不计算相关度得分
GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "name": "xiaomi phone"
          }
        },
        {
          "match_phrase": {
            "desc": "shouji zhong"
          }
        }
      ]
    }
  }
}
  • should:可能满足 or子句(查询)应出现在匹配的文档中。
#should
GET product/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "name": "xiaomi nfc"
          }
        },
        {
          "range": {
            "price": {
              "lte": "500"
            }
          }
        }
      ]
    }
  }
}
  • must_not:必须不满足 不计算相关度分数 not子句(查询)不得出现在匹配的文档中。子句在过滤器上下文中执行,这意味着计分被忽略,并且子句被视为用于缓存。由于忽略计分,0因此将返回所有文档的分数。
#must not 不计算相关度得分
#条件1: 排除包含xiaomi的和包含nfc的(不能包含xiaomi和nfc中的任意一个)
#条件2: 排除价格大于等于500的
GET product/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "name": "xiaomi nfc"
          }
        },
        {
          "range": {
            "price": {
              "gte": "500"
            }
          }
        }
      ]
    }
  }
}

minimum_should_match:参数指定should返回的文档必须匹配的子句的数量或百分比。如果bool查询包含至少一个should子句,而没有must或 filter子句,则默认值为1。否则,默认值为0

#filter和must组合
GET product/_search
{
  "_source": false, 
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "price": {
              "lte": "1000"
            }
          }
        }
      ],
      "must": [
        {
          "match": {
            "name": "xiaomi"
          }
        }
      ]
    }
  }
}
GET product/_search
{
  "_source": false, 
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "xiaomi"
          }
        },
        {
          "range": {
            "price": {
              "lte": "1000"
            }
          }
        }
      ]
    }
  }
}
#(must或者filter)和should组合
#条件1:价格小于10000
#条件2:name中包含"hongmi"或者"xiaomi nfc phone"
GET product/_search
{
  "_source": false,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "price": {
              "lte": "10000"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "name": "nfc phone"
          }
        },
        {
          "match": {
            "name": "erji"
          }
        },
        {
          "bool": {
            "must": [
              {
                "range": {
                  "price": {
                    "gte": 900,
                    "lte": 3000
                  }
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 2
    }
  }
}

数据代码

PUT /product/_doc/1
{
    "name" : "xiaomi phone",
    "desc" :  "shouji zhong de zhandouji",
    "date": "2021-06-01",
    "price" :  3999,
    "tags": [ "xingjiabi", "fashao", "buka" ]
}
PUT /product/_doc/2
{
    "name" : "xiaomi nfc phone",
    "desc" :  "zhichi quangongneng nfc,shouji zhong de jianjiji",
    "date": "2021-06-02",
    "price" :  4999,
    "tags": [ "xingjiabi", "fashao", "gongjiaoka" ]
}
PUT /product/_doc/3
{
    "name" : "nfc phone",
    "desc" :  "shouji zhong de hongzhaji",
    "date": "2021-06-03",
    "price" :  2999,
    "tags": [ "xingjiabi", "fashao", "menjinka" ]
}
PUT /product/_doc/4
{
    "name" : "xiaomi erji",
    "desc" :  "erji zhong de huangmenji",
    "date": "2021-04-15",
    "price" :  999,
    "tags": [ "low", "bufangshui", "yinzhicha" ]
}
PUT /product/_doc/5
{
    "name" : "hongmi erji",
    "desc" :  "erji zhong de kendeji 2021-06-01",
    "date": "2021-04-16",
    "price" :  399,
    "tags": [ "lowbee", "xuhangduan", "zhiliangx" ]
}
PUT /product2/_doc/1
{
  "owner":{
    "name":"zhangsan",
    "sex":"男",
    "age":18
  },
  "name": "hongmi erji",
  "desc": "erji zhong de kendeji",
  "price": 399,
  "tags": [
    "lowbee",
    "xuhangduan",
    "zhiliangx"
  ]
}
PUT product2
{
  "mappings": {
    "_source": {
      "includes": [
        "name",
        "price"
      ],
      "excludes": [
        "desc",
        "tags"
      ]
    }
  }
}