Elasticsearch: mappings parameters

Reposted

mb5fca0c87ea3a4 2020-06-27 15:33:00


ignore_above


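Strings longer than the ignore_above setting will not be indexed or stored. (My own view is that the value is still kept, since it remains in _source, but no index is built for that field, so the value cannot be found by searching on it.) For arrays of strings, ignore_above is applied to each element separately, and any element longer than ignore_above is not indexed or stored.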


PUT w1
{
  "mappings": {
    "doc":{
      "properties":{
        "t1":{
          "type":"keyword",
          "ignore_above": 5
        },
        "t2":{
          "type":"keyword",
          "ignore_above": 10   ①
        }
      }
    }
  }
}
PUT w1/doc/1
{
  "t1":"elk",          ②
  "t2":"elasticsearch"  ③
}
GET w1/doc/_search   ④
{
  "query":{
    "term": {
      "t1": "elk"
    }
  }
}

GET w1/doc/_search  ⑤
{
  "query": {
    "term": {
      "t2": "elasticsearch"
    }
  }
}

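① This field ignores any string longer than 10 characters.
② This value is indexed successfully ("elk" has 3 characters, within the limit of 5), so it can be queried and a result is returned.
③ This value is not indexed ("elasticsearch" has 13 characters, over the limit of 10), so a query that uses this field as its condition returns nothing.
④ Returns a result.
⑤ Returns no result, because the length of the t2 value exceeds the ignore_above setting.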
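The behaviour described above, including the per-element rule for arrays, can be approximated in a few lines of Python. This is only a sketch of the rule, not Elasticsearch code: the helper name `indexed_terms` is hypothetical, and it assumes that length is measured with `len()` (Unicode code points) as a stand-in for Elasticsearch's character count.

```python
from typing import List, Union


def indexed_terms(value: Union[str, List[str]], ignore_above: int) -> List[str]:
    """Return the strings a keyword field would index under ignore_above.

    Assumption: len() (code points) approximates Elasticsearch's character count.
    """
    values = [value] if isinstance(value, str) else list(value)
    # Each element is checked on its own; over-long elements are skipped.
    return [v for v in values if len(v) <= ignore_above]


# The w1 example: t1 has ignore_above 5, t2 has ignore_above 10.
doc = {"t1": "elk", "t2": "elasticsearch"}
limits = {"t1": 5, "t2": 10}

for field, value in doc.items():
    print(field, indexed_terms(value, limits[field]))

# An array value: only the element that is too long is dropped.
print(indexed_terms(["elk", "elasticsearch", "kibana"], 10))
```

An empty list for t2 corresponds to the term query ⑤ finding nothing, while the document itself is still indexed and t1 remains searchable.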

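This parameter is also useful for guarding against Lucene's term byte-length limit of 32766.
Note that the ignore_above setting can be updated on existing fields using the PUT mapping API.
The value of ignore_above is a character count, whereas Lucene counts bytes. If you use UTF-8 text with many non-ASCII characters, you may want to set the limit to 32766 / 4 = 8191, since a UTF-8 character can occupy at most 4 bytes.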
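The arithmetic behind the 8191 figure can be checked with a short Python sketch. The constant and helper names here are illustrative only, and `len()` counts Unicode code points, which is assumed to be close enough to the character count that ignore_above uses.

```python
LUCENE_MAX_TERM_BYTES = 32766  # Lucene's per-term byte limit quoted above


def utf8_len(text: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Worst case is 4 bytes per character, so divide the byte limit by 4.
safe_ignore_above = LUCENE_MAX_TERM_BYTES // 4

ascii_text = "a" * safe_ignore_above           # 1 byte per character
cjk_text = "\u4e2d" * safe_ignore_above        # 3 bytes per character
emoji_text = "\U0001F600" * safe_ignore_above  # 4 bytes per character

for name, text in [("ascii", ascii_text), ("cjk", cjk_text), ("emoji", emoji_text)]:
    size = utf8_len(text)
    print(name, len(text), size, size <= LUCENE_MAX_TERM_BYTES)
```

With one more 4-byte character (8192 in total) the encoded term would be 32768 bytes, which is over the limit; that is why 8191 is the conservative choice.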
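Looking at the example above, the fields are declared with type keyword when the mapping is created; in other words, the ignore_above parameter only takes effect on keyword fields.
So can ignore_above be used when the string field is of type text? It can, but only with a special setup, namely a keyword sub-field declared under fields: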


PUT w2
{
  "mappings": {
    "doc":{
      "properties":{
        "t1":{
          "type":"keyword",
          "ignore_above":5
        },
        "t2":{
          "type":"text",
          "fields":{
            "keyword":{
              "type":"keyword",
              "ignore_above": 10
            }
          }
        }
      }
    }
  }
}

PUT w2/doc/1
{
  "t1":"beautiful",
  "t2":"beautiful girl"
}

GET w2/doc/_search  ①
{
  "query": {
    "term": {
      "t1": {
        "value": "beautiful"
      }
    }
  }
}

GET w2/doc/_search  ②
{
  "query": {
    "term": {
      "t2": "beautiful"
    }
  }
}

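① No result is returned: "beautiful" has 9 characters, which exceeds the ignore_above of 5 on t1.
② A result is returned, because the field's type is text.
However, once the field type is text, the ignore_above limit no longer applies to that field itself; the ignore_above of 10 only affects the t2.keyword sub-field.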
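To make the difference between t2 and t2.keyword concrete, here is a rough Python simulation of the w2 example. It is a sketch under loud assumptions: `analyze_text` is a hypothetical stand-in for the text analyzer (just lowercasing and splitting on whitespace), and all function names are invented for illustration; none of this is Elasticsearch's actual implementation.

```python
from typing import Dict, List


def analyze_text(value: str) -> List[str]:
    """Rough stand-in for a text analyzer: lowercase, split on whitespace."""
    return value.lower().split()


def keyword_terms(value: str, ignore_above: int) -> List[str]:
    """A keyword field indexes the whole string, unless it is too long."""
    return [value] if len(value) <= ignore_above else []


def index_w2(doc: Dict[str, str]) -> Dict[str, List[str]]:
    """Mimic the w2 mapping: t1 keyword(5), t2 text, t2.keyword keyword(10)."""
    return {
        "t1": keyword_terms(doc["t1"], 5),
        "t2": analyze_text(doc["t2"]),  # ignore_above is not applied here
        "t2.keyword": keyword_terms(doc["t2"], 10),
    }


def term_query(index: Dict[str, List[str]], field: str, term: str) -> bool:
    """A term query matches only an exact indexed term."""
    return term in index[field]


index = index_w2({"t1": "beautiful", "t2": "beautiful girl"})
print(term_query(index, "t1", "beautiful"))                # query ①
print(term_query(index, "t2", "beautiful"))                # query ②
print(term_query(index, "t2.keyword", "beautiful girl"))   # sub-field, 14 > 10
```

In this model the text field always produces tokens, whatever the length of the input, while the keyword sub-field drops "beautiful girl" because its 14 characters exceed the limit of 10.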

Corrections are welcome. That's all. See also: [Official 7.0 docs: ignore_above](https://www.elastic.co/guide/en/elasticsearch/reference/7.0/ignore-above.html)
