ES term query returns no results; ES query results are inconsistent

Reposted

mob6454cc6ff2b9 2024-07-10 20:03:48

1 Data preparation

PUT student_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "birthday": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      },
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "age": {
        "type": "integer"
      },
      "desc": {
        "type": "text"
      },
      "nationality": {
        "type": "keyword"
      }
    }
  }
}

POST /student_index/_doc
{
  "name":"Kobe Bryant",
  "age":18,
  "desc":"superstar",
  "birthday":"1978-08-23",
  "nationality":"America"
  
}
POST /student_index/_doc
{
  "name":"LeBron James",
  "age":18,
  "desc":"superstar",
  "birthday":"1984-12-30",
  "nationality":"America"
  
}
POST /student_index/_doc
{
  "name":"Michael Jordan",
  "age":18,
  "desc":"superstar",
  "birthday":"1963-02-17",
  "nationality":"America"
  
}
POST /student_index/_doc
{
  "name":"James Harden",
  "age":18,
  "desc":"superstar",
  "birthday":"1989-08-26",
  "nationality":"America"
}
POST /student_index/_doc
{
  "name":"姚明",
  "age":18,
  "desc":"中国篮球杰出贡献奖",
  "birthday":"1980-09-12",
  "nationality":"China"
}
POST /student_index/_doc
{
  "name":"Giannis Antetokounmpo",
  "age":18,
  "desc":"superstar",
  "birthday":"1994-12-06",
  "nationality":"The Greek"
}

2 match

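match is a high-level query: it behaves differently depending on the type of the field being queried.

If the field is a date or a number, the query string is converted to a date or a number before matching.
If the field cannot be analyzed (keyword), match does not tokenize the query text.
If the field can be analyzed (text), match runs the query text through the field's analyzer, looks up the resulting tokens in the inverted index, and scores the matching documents.

# Analyzed query on a text field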
GET student_index/_search
{
  "query": {
    "match": {
      "name": "James"
    }
  }
}

# Query on a keyword field
GET student_index/_search
{
  "query": {
    "match": {
      "nationality": "The Greek"
    }
  }
}

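3 term (exact match)

A term query is an exact match: the search keyword is not analyzed before the search. For example, the keyword 手机 ("mobile phone") is not split into 手 and 机; it is looked up in the inverted index as is.

# Exact match on a keyword value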
GET student_index/_search
 {
   "query": {
     "term": {
       "nationality": {
         "value": "The Greek"
       }
     }
   }
 }

# Exact match on a text value
GET student_index/_search
 {
   "query": {
     "term": {
       "name": {
         "value": "Michael Jordan"
       }
     }
   }
 }

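The query above returns no hits. The name field is of type text, so "Michael Jordan" was indexed as the two tokens michael and jordan, while term looks up the unanalyzed string "Michael Jordan". The _analyze request below shows the tokens, and a term query for the single token michael does match.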

GET _analyze
 {
   "analyzer": "standard",
   "text":"Michael Jordan"
 }
 
 
 GET /student_index/_search
 {
   "query": {
     "term": {
       "name": {
         "value": "michael"
       }
     }
   }
 }
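
To make the difference between term and match on a text field concrete, here is a minimal Python sketch of an inverted index. It is not how Elasticsearch is implemented: standard_like_analyze is a hypothetical stand-in that only lowercases and splits on non-word characters, and the document ids are simply list positions. Only the English names from the sample data are used, because this stand-in does not tokenize Chinese text the way the standard analyzer does.

```python
import re

def standard_like_analyze(text):
    """Hypothetical stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

names = ["Kobe Bryant", "LeBron James", "Michael Jordan", "James Harden"]

# Inverted index for the text field: token -> set of document ids
index = {}
for doc_id, name in enumerate(names):
    for token in standard_like_analyze(name):
        index.setdefault(token, set()).add(doc_id)

def term_query(value):
    # term: look the value up exactly as given, without analysis
    return sorted(index.get(value, set()))

def match_query(text):
    # match: analyze the query text first, then OR the per-token lookups
    hits = set()
    for token in standard_like_analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)

print(term_query("Michael Jordan"))   # [] - no such token in the index
print(term_query("michael"))          # [2]
print(match_query("Michael Jordan"))  # [2]
print(match_query("James"))           # [1, 3]
```

The index only ever contains lowercase single-word tokens, so a term lookup for the original mixed-case, two-word string has nothing to hit. Querying name.keyword with term, or using match on name, avoids the problem.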

4 filter

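A filter selects documents by your query conditions without computing a relevance score, and Elasticsearch caches frequently used filters so that later lookups are faster.
Because a filter skips scoring, it is generally cheaper than an equivalent scoring query.

# Students born between 1985 and 2022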
GET /student_index/_search
 {
   "query": {
     "bool": {
        "filter": {
          "range": {
            "birthday": {
              "gte": "1985-01-01",
              "lte": "2022-01-01"
            }
          }
        }
     }
   }
 }

# Superstars born between 1985 and 2022
GET /student_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "desc": "superstar"
          }
        }
      ],
      "filter": {
        "range": {
          "birthday": {
            "gte": "1985-01-01",
            "lte": "2022-01-01"
          }
        }
      }
    }
  }
}

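5 bool query

A compound query that combines several query clauses with boolean logic:

must: returned documents must satisfy the must clauses, which contribute to the score.
filter: returned documents must satisfy the filter clauses, which do not contribute to the score.
should: the clauses are combined as OR. A matching should clause raises the score; if the query has no must or filter clause, at least one should clause has to match.
must_not: returned documents must not satisfy the must_not clauses.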
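
The four clause types can be sketched in a few lines of Python. This is only an illustration of the boolean logic, under loud assumptions: bool_query and its parameter names are hypothetical, each clause is a plain predicate function, and the "score" is a toy count of matching must and should clauses rather than a real relevance score.

```python
from datetime import date

docs = [
    {"name": "James Harden", "desc": "superstar",
     "birthday": date(1989, 8, 26), "nationality": "America"},
    {"name": "Giannis Antetokounmpo", "desc": "superstar",
     "birthday": date(1994, 12, 6), "nationality": "The Greek"},
    {"name": "姚明", "desc": "中国篮球杰出贡献奖",
     "birthday": date(1980, 9, 12), "nationality": "China"},
]

def bool_query(docs, must=(), filters=(), should=(), must_not=()):
    """Toy bool query: every clause is a predicate taking one document."""
    results = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # must: required, counted in the toy score below
        if not all(clause(doc) for clause in filters):
            continue  # filter: required, never counted in the score
        if any(clause(doc) for clause in must_not):
            continue  # must_not: any match excludes the document
        should_hits = sum(1 for clause in should if clause(doc))
        if should and not must and not filters and should_hits == 0:
            continue  # only should clauses present: at least one has to match
        results.append((doc["name"], len(must) + should_hits))
    return results

hits = bool_query(
    docs,
    must=[lambda d: "superstar" in d["desc"].lower().split()],
    must_not=[lambda d: d["nationality"] == "America"],
    filters=[lambda d: date(1985, 1, 1) <= d["birthday"] <= date(2022, 1, 1)],
)
print(hits)  # [('Giannis Antetokounmpo', 1)]
```

Moving the birthday range from filters to must would return the same documents in this sketch and only change the toy score, which mirrors why conditions that should not affect ranking belong in filter.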

# Superstars born between 1985 and 2022 whose nationality is not America
GET /student_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "desc": "superstar"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "nationality": {
              "value": "America"
            }
          }
        }
      ], 
      "filter": {
        "range": {
          "birthday": {
            "gte": "1985-01-01",
            "lte": "2022-01-01"
          }
        }
      }
    }
  }
}

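6 wildcard query

Concept: a wildcard operator is a placeholder that stands in for characters in the search pattern.

?: matches any single character
*: matches zero or more characters

Note: on a text field the pattern is matched against individual terms in the term dictionary; on the keyword subfield it is matched against the whole original value.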
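
The term-versus-whole-value distinction can be checked with a small Python sketch. It assumes the text field was indexed as the lowercase tokens lebron and james and that the keyword subfield stores the original string unchanged; wildcard_match is a hypothetical helper that translates the two wildcard operators into a regular expression, not an Elasticsearch API.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (zero or more characters) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, value):
    # The pattern has to cover the entire value, not just a substring of it
    return wildcard_to_regex(pattern).fullmatch(value) is not None

text_terms = ["lebron", "james"]   # assumed tokens of the text field "name"
keyword_value = "LeBron James"     # assumed stored value of "name.keyword"

print(any(wildcard_match("jam*s", t) for t in text_terms))  # True
print(wildcard_match("jam*s", keyword_value))               # False
print(wildcard_match("LeBron Jam*s", keyword_value))        # True
print(wildcard_match("姚*", "姚明"))                         # True
```

This is why jam*s finds documents through name but not through name.keyword in the requests that follow, while LeBron Jam*s only makes sense against name.keyword.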

GET student_index/_search
{
  
   "query": {
     "wildcard": {
       "name": {
         "value": "jam*s"
       }
     }
   }
}
GET student_index/_search
{
  
   "query": {
     "wildcard": {
       "name.keyword": {
         "value": "jam*s"
       }
     }
   }
}

GET student_index/_search
{
  
   "query": {
     "wildcard": {
       "name.keyword": {
         "value": "姚*"
       }
     }
   }
}
GET student_index/_search
{
  
   "query": {
     "wildcard": {
       "name.keyword": {
         "value": "LeBron Jam*s"
       }
     }
   }
}

7 fuzzy

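A fuzzy query takes an approximate keyword and returns terms that are close to it,
so it tolerates typos in the input. That makes fuzzy less precise and less stable than exact queries, and with too many typos it can still return nothing, so use it as a trade-off.

fuzziness: the maximum edit distance (0, 1 or 2). Bigger is not better: recall goes up but precision goes down. It can be set to AUTO, in which case ES picks the value from the length of the keyword. For the fuzzy query, omitting fuzziness is equivalent to AUTO.

match also supports fuzziness; the difference is that match analyzes the query text and fuzzy does not.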
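
As a rough illustration of edit distance and AUTO, the sketch below computes a plain Levenshtein distance and applies the default AUTO thresholds (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two). The function names are hypothetical. One simplification to keep in mind: by default Elasticsearch also counts swapping two adjacent characters as a single edit, whereas plain Levenshtein counts it as two.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    """Allowed edits under the default AUTO thresholds, by term length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_match(query_term, indexed_term):
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)

print(edit_distance("James", "james"))  # 1
print(edit_distance("jjmes", "james"))  # 1
print(auto_fuzziness("James"))          # 1
print(fuzzy_match("James", "james"))    # True
print(fuzzy_match("Jamse", "james"))    # False
```

The last line is False only because of the simplification above: the swap of s and e costs two edits in plain Levenshtein, on top of the J/j case difference.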

GET student_index/_search
{
  
   "query": {
     "fuzzy": {
       "name": {
         "value": "James",
         "fuzziness": 1
       }
     }
   }
}


GET student_index/_search
{
  
   "query": {
     "fuzzy": {
       "name": {
         "value": "James",
         "fuzziness": "AUTO"
       }
     }
   }
}

GET student_index/_search
{
  "query": {
    "match": {
      "name": {
        "query": "LeBron Jjmes",
        "fuzziness": 1
      }
    }
  }
}
