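1. Introduction to term-level queries

Full-text queries analyze the query string before executing, whereas term-level queries operate on the exact terms stored in the inverted index. They are typically used for structured data such as numbers, dates, and enums rather than full-text fields. They also let you craft low-level queries that skip the analysis step altogether.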
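To make the difference concrete, here is a minimal sketch in plain Python of a toy inverted index. It is not how Elasticsearch is implemented: the `analyze` function (lowercase plus whitespace split) and the helper names `term_query` and `match_query` are assumptions made up for illustration.

```python
# Toy inverted index: term -> set of document ids.
# The "analyzer" is a stand-in (lowercase + whitespace split),
# not Elasticsearch's real analysis chain.

def analyze(text):
    """Hypothetical analyzer: lowercase the text and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Index every analyzed term of every document."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def term_query(index, term):
    """Term-level lookup: the query term is used exactly as given."""
    return index.get(term, set())


def match_query(index, text):
    """Full-text lookup: the query string is analyzed first."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


docs = {1: "Quick Brown Fox", 2: "lazy dog"}
index = build_index(docs)

print(term_query(index, "Quick"))   # set(): the index stores "quick"
print(term_query(index, "quick"))   # {1}
print(match_query(index, "Quick"))  # {1}: analyzed to "quick" first
```

The term-level lookup misses "Quick" because the stored term is the lowercased "quick", while the full-text lookup finds it because the query string goes through the same analyzer as the document.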

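2. term query

The term query searches for a single exact term. It was covered in the previous chapter, so it is not repeated here.

3. terms query

The term query is useful for finding a single value, but often we want to search for several values at once. For that, use a single terms query (note the trailing s), which is simply the plural form of term.

The following query finds documents whose "title" contains any of the three terms "河北", "长生", or "碧桂园".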

GET telegraph/_search
{
  "query": {
    "terms": {
      "title": ["河北","长生","碧桂园"]
    }
  }
}

Query result

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "A5etp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "碧桂园集团副主席杨惠妍",
          "content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
          "author": "小财注",
          "pubdate": "2018-07-17T16:12:55"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "Apetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "长生生物再次跌停 三机构抛售近1000万元",
          "content": "长生生物再次一字跌停,报收19.89元,成交1432万元",
          "author": "长生生物",
          "pubdate": "2018-07-17T10:03:11"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "BJetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "河北聚焦十大行业推进国际产能合作",
          "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
          "author": "财联社",
          "pubdate": "2018-07-17T14:14:55"
        }
      }
    ]
  }
}

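4. terms_set query

Finds documents that match one or more of the specified terms. The terms are not analyzed, so they must match exactly. The number of terms that must match is decided per document, either by a minimum-should-match field (minimum_should_match_field) or by a minimum-should-match script (minimum_should_match_script).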
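The original gives no example for this query, so here is a minimal sketch of the matching rule in plain Python, followed by the request body shape as I understand the terms_set syntax. The field names `tags` and `required_matches`, the sample documents, and the helper `terms_set_match` are all hypothetical.

```python
def terms_set_match(doc, field, terms, minimum_should_match_field):
    """Return True if enough of `terms` occur in doc[field].

    The required count is read from the document itself, mirroring
    the idea of a per-document minimum-should-match field.
    """
    doc_terms = set(doc.get(field, []))
    matched = len(doc_terms & set(terms))
    required = doc[minimum_should_match_field]
    return matched >= required


# Hypothetical documents; "tags" and "required_matches" are made-up names.
docs = [
    {"id": 1, "tags": ["a", "b", "c"], "required_matches": 2},
    {"id": 2, "tags": ["a", "d"], "required_matches": 2},
    {"id": 3, "tags": ["b"], "required_matches": 1},
]

hits = [d["id"] for d in docs
        if terms_set_match(d, "tags", ["a", "b"], "required_matches")]
print(hits)  # [1, 3]

# Assumed shape of the equivalent request body (sent to <index>/_search).
query_body = {
    "query": {
        "terms_set": {
            "tags": {
                "terms": ["a", "b"],
                "minimum_should_match_field": "required_matches",
            }
        }
    }
}
```

Document 2 is rejected because only one of its tags matches while it requires two; document 3 is accepted with a single match because it only requires one.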

5. range query

The range query matches documents whose numeric, date, or string field falls within a given range.
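As a rough illustration of how the bounds behave, the sketch below mimics gte, gt, lte, and lt in plain Python. The helper `in_range` is hypothetical; the ages and publication dates are taken from this section's sample data.

```python
from datetime import datetime


def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Return True if value satisfies every bound that is given."""
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True


# Ages from the my_person sample data in this section.
ages = [20, 25, 30, 35]
print([a for a in ages if in_range(a, gte=20, lte=30)])  # [20, 25, 30]
print([a for a in ages if in_range(a, gt=20, lt=30)])    # [25]

# Dates compare the same way once parsed.
pubdates = [
    "2018-07-17T16:12:55",
    "2018-07-17T10:03:11",
    "2018-07-17T14:14:55",
    "2018-07-17T12:33:11",
]
lo = datetime.fromisoformat("2018-07-17T12:00:00")
hi = datetime.fromisoformat("2018-07-17T16:30:00")
hits = [p for p in pubdates
        if in_range(datetime.fromisoformat(p), gte=lo, lte=hi)]
print(len(hits))  # 3
```

The gte and lte bounds include the endpoints, which is why ages 20 and 30 both match; switching to gt and lt excludes them.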

Date range query

The following example finds documents whose "pubdate" falls between "2018-07-17T12:00:00" and "2018-07-17T16:30:00", with both bounds inclusive (gte and lte).

GET telegraph/_search
{
  "query": {
    "range": {
      "pubdate": {
        "gte": "2018-07-17T12:00:00",
        "lte": "2018-07-17T16:30:00"
      }
    }
  }
}

Query result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "AZetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "周五召开董事会会议 审议及批准更新后的一季报",
          "content": "以审议及批准更新后的2018年第一季度报告",
          "author": "中兴通讯",
          "pubdate": "2018-07-17T12:33:11"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "A5etp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "碧桂园集团副主席杨惠妍",
          "content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
          "author": "小财注",
          "pubdate": "2018-07-17T16:12:55"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "BJetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "河北聚焦十大行业推进国际产能合作",
          "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
          "author": "财联社",
          "pubdate": "2018-07-17T14:14:55"
        }
      }
    ]
  }
}

Numeric range query

Create a new index and add data

DELETE my_person

PUT my_person

PUT my_person/stu/1
{
  "name":"sean",
  "age":20
}

PUT my_person/stu/2
{
  "name":"sum",
  "age":25
}

PUT my_person/stu/3
{
  "name":"dean",
  "age":30
}

PUT my_person/stu/4
{
  "name":"kastel",
  "age":35
}

Query people whose "age" is between 20 and 30, inclusive

GET my_person/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}

Query result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "sum",
          "age": 25
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "age": 20
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "dean",
          "age": 30
        }
      }
    ]
  }
}

6. exists query

Finds documents in which the given field contains at least one non-null value.

Create the index and add data

DELETE my_person

PUT my_person

PUT my_person/stu/1
{
  "name":"sean",
  "hobby":"running"
}

PUT my_person/stu/2
{
  "name":"Jhon",
  "hobby":""
}

PUT my_person/stu/3
{
  "name":"sum",
  "hobby":["swimming",null]
}

PUT my_person/stu/4
{
  "name":"lily",
  "hobby":[null,null]
}

PUT my_person/stu/5
{
  "name":"lucy"
}

Query documents whose "hobby" field has a non-null value

GET my_person/_search
{
  "query": {
    "exists":{
      "field":"hobby"
    }
  }
}

Query result

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Jhon",
          "hobby": ""
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "hobby": "running"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}

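Match notes:

    • "hobby":"running"------the value is non-null (matches)
    • "hobby":""------the value is an empty string, which is not null (matches)
    • "hobby":["swimming",null]------the array contains a non-null value (matches)
    • "hobby":[null,null]------every value in the array is null (does not match)
    • "name":"lucy"------there is no hobby field (does not match)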
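The rules in the list above can be sketched in a few lines of plain Python. This only mirrors the rule as described here, not Elasticsearch internals; the helper `field_exists` is hypothetical, and the documents are the five indexed above.

```python
def field_exists(doc, field):
    """True if the field holds at least one non-null value."""
    if field not in doc:
        return False
    value = doc[field]
    # Treat a single value as a one-element list.
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)


docs = {
    1: {"name": "sean", "hobby": "running"},
    2: {"name": "Jhon", "hobby": ""},
    3: {"name": "sum", "hobby": ["swimming", None]},
    4: {"name": "lily", "hobby": [None, None]},
    5: {"name": "lucy"},
}

hits = sorted(i for i, d in docs.items() if field_exists(d, "hobby"))
print(hits)  # [1, 2, 3]
```

The empty string counts as a value, so document 2 matches, while a missing field or an array holding only nulls does not.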

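7. prefix query

Finds documents containing a term that begins with the given prefix. The following query finds documents whose "hobby" has a term starting with "sw".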

    GET my_person/_search
    {
      "query": {
        "prefix": {
          "hobby": {
            "value": "sw"
          }
        }
      }
    }

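Query result (document 6, "deak", is not part of the sample data shown above and has to be indexed first to reproduce this output)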

    {
      "took": 11,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "6",
            "_score": 1,
            "_source": {
              "name": "deak",
              "hobby": "swimming"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "3",
            "_score": 1,
            "_source": {
              "name": "sum",
              "hobby": [
                "swimming",
                null
              ]
            }
          }
        ]
      }
    }

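8. wildcard query

The wildcard query matches terms against a pattern in which * stands for any sequence of characters (including none) and ? stands for any single character. The following query finds documents whose "hobby" matches "*ing".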

    GET my_person/_search
    {
      "query": {
        "wildcard": {
          "hobby": {
            "value": "*ing"
          }
        }
      }
    }

Query result

    {
      "took": 27,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "6",
            "_score": 1,
            "_source": {
              "name": "deak",
              "hobby": "swimming"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "1",
            "_score": 1,
            "_source": {
              "name": "sean",
              "hobby": "running"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "3",
            "_score": 1,
            "_source": {
              "name": "sum",
              "hobby": [
                "swimming",
                null
              ]
            }
          }
        ]
      }
    }
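For intuition about how prefix and wildcard patterns are applied to a term, here is a minimal sketch that translates a wildcard pattern into a Python regular expression. It is a simplification and not the engine's implementation: the helper names are hypothetical, escape sequences are not handled, and the term list is made up.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? into regex syntax; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any sequence, including empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, term):
    """The pattern has to cover the whole term, not just part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


terms = ["running", "swimming", "sing", "ing", "ringo"]
print([t for t in terms if wildcard_match("*ing", t)])
# ['running', 'swimming', 'sing', 'ing']
print([t for t in terms if wildcard_match("sw*", t)])   # ['swimming']
print([t for t in terms if wildcard_match("?ing", t)])  # ['sing']
```

A prefix query behaves like the "sw*" case. Patterns that start with * or ? give the engine no literal prefix to narrow the search, so they tend to be the expensive ones.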

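9. regexp query

The performance of a regexp query depends heavily on the regular expression chosen. Patterns that match anything, such as .*, are very slow, as are lookaround-style expressions. Where possible, put a long literal prefix at the start of the pattern. Wildcard matchers such as .*?+ mostly reduce performance. Most regular expression engines let a pattern match any part of a string, so a pattern that must start at the beginning or finish at the end has to be anchored explicitly with ^ or $. Elasticsearch regexp patterns (Lucene syntax) differ: they are always anchored, so the pattern must match the entire term, and ^ and $ are not used as anchors.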
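The anchoring point is easy to see with Python's re module, used here only as a stand-in: re.search matches anywhere in the string, while re.fullmatch requires the whole string to match, which is how Lucene-style patterns in Elasticsearch behave. Python's regex syntax is not the same as Lucene's (operators such as ~, <>, & and @ do not exist in re), and the term list is made up.

```python
import re

terms = ["swimming", "sw", "answer"]
pattern = "sw.+"

# Unanchored: the pattern may match any part of the string.
unanchored = [t for t in terms if re.search(pattern, t)]

# Anchored: the pattern must cover the entire string.
anchored = [t for t in terms if re.fullmatch(pattern, t)]

print(unanchored)  # ['swimming', 'answer']
print(anchored)    # ['swimming']
```

"answer" contains "swer", so an unanchored search accepts it, but it does not start with "sw" and is rejected once the whole term has to match. "sw" is rejected in both cases because .+ needs at least one more character.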

Each entry below gives the metacharacter, its meaning, a description, and an example.

    • . (match any character): the period "." stands for any single character. Example: ab. matches "abc" and "ab1".
    • + (one or more): the plus sign "+" repeats the preceding shortest pattern one or more times. Example: a+b+ matches "aaabbb".
    • * (zero or more): the asterisk "*" matches the preceding shortest pattern zero or more times. Example: a*b* matches "aaabbb".
    • ? (zero or one): the question mark "?" makes the preceding shortest pattern optional, so it matches zero or one times. Example: aaa?bbbb? matches "aaabbb".
    • {m},{m,n} (min to max): curly brackets "{}" specify a minimum and, optionally, a maximum number of times the preceding shortest pattern can repeat. Example: a{3}b{3} and a{2,4}b{2,4} both match "aaabbb".
    • () (grouping): parentheses "()" form sub-patterns. Example: (ab)+ matches "ababab".
    • | (alternation): the pipe symbol "|" acts as an OR operator. Example: aabb|bbaa matches "aabb".
    • [] (character classes): a range of candidate characters is written as a character class enclosed in square brackets "[]". A leading ^ negates the class. Example: [abc] matches "a", "b", or "c".
    • ~ (complement): the shortest pattern that follows a tilde "~" is negated. "ab~cd" means: starts with a, followed by b, followed by a string of any length that is anything but c, and ends with d. Example: "abcdef" matches ab~df and a~(cb)def, but not ab~cdef or a~(bc)def.
    • <> (interval): the interval option enables numeric ranges enclosed in angle brackets "<>". Example: foo<1-100> matches "foo80".
    • & (intersection): the ampersand "&" joins two patterns so that both of them have to match. Example: aaa.+&.+bbb matches "aaabbb".
    • @ (any string): the at sign "@" matches any string in its entirety. Example: @&~(foo.+) matches any string except one that begins with "foo" followed by at least one more character.

Find documents whose "hobby" field matches the regular expression "sw.+"

    GET my_person/_search
    {
      "query": {
        "regexp":{
          "hobby":"sw.+"
        }
      }
    }

Query result

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "6",
            "_score": 1,
            "_source": {
              "name": "deak",
              "hobby": "swimming"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "3",
            "_score": 1,
            "_source": {
              "name": "sum",
              "hobby": [
                "swimming",
                null
              ]
            }
          }
        ]
      }
    }

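10. fuzzy query

The fuzzy query matches terms that are within a small edit distance (Levenshtein distance) of the search term, which tolerates minor typos. With the default fuzziness of AUTO, terms of one or two characters must match exactly, so the two-character term in the following query only matches an identical indexed term.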

    GET telegraph/_search
    {
      "query": {
        "fuzzy": {
          "title": "十大"
        }
      }
    }

Query result

    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.99277425,
        "hits": [
          {
            "_index": "telegraph",
            "_type": "msg",
            "_id": "BJetp2QBW8hrYY3zGJk7",
            "_score": 0.99277425,
            "_source": {
              "title": "河北聚焦十大行业推进国际产能合作",
              "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
              "author": "财联社",
              "pubdate": "2018-07-17T14:14:55"
            }
          }
        ]
      }
    }
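Fuzzy matching is based on edit distance, so a small worked example may help. The sketch below implements the classic Levenshtein distance in plain Python and uses it to filter a made-up term list. The helper names are hypothetical, and unlike Elasticsearch's default behavior (its transpositions option), this version does not count swapping two adjacent characters as a single edit.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(query, terms, max_edits):
    """Keep the terms within max_edits of the query term."""
    return [t for t in terms if edit_distance(query, t) <= max_edits]


terms = ["swimming", "swiming", "running", "swimmer"]
print(edit_distance("kitten", "sitting"))  # 3
print(fuzzy_match("swimming", terms, 1))   # ['swimming', 'swiming']
```

With one edit allowed, the misspelling "swiming" is accepted alongside the exact term, while "swimmer" and "running" are too far away.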

11. ids query

Finds documents by a given list of document IDs.

    GET my_person/_search
    {
      "query": {
        "ids": {
          "values": ["1","3","5"]
        }
      }
    }

Query result

    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "5",
            "_score": 1,
            "_source": {
              "name": "lucy"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "1",
            "_score": 1,
            "_source": {
              "name": "sean",
              "hobby": "running"
            }
          },
          {
            "_index": "my_person",
            "_type": "stu",
            "_id": "3",
            "_score": 1,
            "_source": {
              "name": "sum",
              "hobby": [
                "swimming",
                null
              ]
            }
          }
        ]
      }
    }