ES7: querying by ID and batch querying by IDs

Reposted

mob64ca13f8eecb 2024-03-08 15:26:33

Tags: es7 query by id, elasticsearch, learning, big data, highlighting. Categories: architecture, backend development

1. id and ids

id (fetch a single document by its ID)

GET es_user/_doc/1

ids (fetch several documents by their IDs)

# Batch lookup by ID
GET es_user/_search
{
"query": {
"ids": {"values": [1,2,3]}
}
}

2. match query

Match all documents

GET es_user/_search
{
"query": {
"match_all": {}
}
}

Inspecting the analyzer output

GET _analyze
{
"analyzer": "ik_max_word",
"text": "浙江省杭州市浙江大学"
}

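1. The query text 浙江省 is first analyzed into the terms 浙江, 浙江省, and 省.
2. Each resulting term is then matched against the indexed terms.
Key point: match analyzes the query text first, then looks the resulting terms up in the inverted index.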

GET es_user/_search
{
"query": {
"match": {
"address": "浙江省"
}
}
}
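
The analyze-then-match behaviour described above can be imitated in a few lines of plain Python. This is only a rough sketch and not how Elasticsearch is implemented: `toy_analyze` and its fixed vocabulary are hypothetical stand-ins for a real analyzer such as ik_max_word, and the sample documents are made up for illustration.

```python
from collections import defaultdict

def toy_analyze(text, vocabulary):
    """Hypothetical analyzer: emit every vocabulary word that occurs in text."""
    return [word for word in vocabulary if word in text]

def build_inverted_index(docs, vocabulary):
    """Map each term to the set of document IDs whose text produced it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in toy_analyze(text, vocabulary):
            index[term].add(doc_id)
    return index

def match(index, query, vocabulary):
    """Analyze the query, then collect documents containing any resulting term."""
    hits = set()
    for term in toy_analyze(query, vocabulary):
        hits |= index.get(term, set())
    return hits

# Made-up vocabulary and documents, for illustration only.
vocabulary = ["浙江", "浙江省", "省", "杭州", "河北"]
docs = {1: "浙江省杭州市", 2: "河北省石家庄", 3: "上海市"}

index = build_inverted_index(docs, vocabulary)
query_terms = toy_analyze("浙江省", vocabulary)
result = match(index, "浙江省", vocabulary)
print(query_terms, result)
```

In this toy setup document 2 is also returned, because the query term 省 appears in its text. That mirrors why a match query can return more documents than an exact lookup would.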

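match with operator and: the query value is still analyzed, but a document matches only if it contains every resulting term. If operator is omitted, the default is or.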

GET es_user/_search
{
"query": {
"match": {
"address": {
"query": "浙江我爱杭州",
"operator": "and"
}
}
}
}

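match with operator or: the query value is analyzed, and a document matches if it contains at least one of the resulting terms.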

GET es_user/_search
{
"query": {
"match": {
"address": {
"query": "浙江我爱杭州",
"operator": "or"
}
}
}
}
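
The difference between the two operators comes down to "all terms" versus "any term". The sketch below illustrates that rule only; the tokenization of 浙江我爱杭州 and the document term sets are assumptions made for the example, and `matches` is a hypothetical helper, not an Elasticsearch API.

```python
def matches(doc_terms, query_terms, operator="or"):
    """Return True if the document satisfies the query terms under the operator."""
    doc_terms = set(doc_terms)
    if operator == "and":
        # Every query term must be present in the document.
        return all(term in doc_terms for term in query_terms)
    # Default "or": one matching term is enough.
    return any(term in doc_terms for term in query_terms)

# Assumed analyzer output for the query text "浙江我爱杭州".
query_terms = ["浙江", "我", "爱", "杭州"]

partial_doc = {"浙江", "杭州", "市"}           # contains only some query terms
full_doc = {"浙江", "我", "爱", "杭州", "市"}  # contains all query terms

print(matches(partial_doc, query_terms, "and"))  # False
print(matches(partial_doc, query_terms, "or"))   # True
print(matches(full_doc, query_terms, "and"))     # True
```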

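multi_match: a multi-field query. The query value is analyzed and the resulting terms are matched against several fields; a document matches if any one of the fields matches.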

GET es_user/_search
{
"query": {
"multi_match": {
"query": "我爱中国",
"fields": ["address","email"]
}
}
}

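For keyword-style full-text search, use match, because match analyzes the query text before matching terms.

3. term query

term: matches the indexed terms directly without analyzing the query value. It suits exact-value lookups such as product category or brand. Note that on an analyzed text field, term matches only if the exact value is itself one of the indexed terms.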

GET es_user/_search
{
"query": {
"term": {
"address": {
"value": "杭州好地方"
}
}
}
}

4. prefix query

prefix: matches documents containing a term that starts with the given value.

GET es_user/_search
{
"query": {
"prefix": {
"address": {
"value": "杭"
}
}
}
}

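5. wildcard query

wildcard: matches terms against a wildcard pattern without analyzing it. * matches any sequence of characters (including none), ? matches exactly one character, and ?? matches exactly two characters.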

GET es_user/_search
{
"query": {
"wildcard": {
"address": {
"value": "杭州*"
}
}
}
}
GET es_user/_search
{
"query": {
"wildcard": {
"address": {
"value": "杭州??"
}
}
}
}
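
To get a feel for how the two patterns differ, the same wildcards can be tried locally with Python's standard `fnmatch` module. This is only an approximation: `fnmatchcase` also gives special meaning to `[...]`, which the wildcard query does not, and the term list below is made up.

```python
from fnmatch import fnmatchcase

# Made-up indexed terms, for illustration only.
terms = ["杭州", "杭州市", "杭州西湖", "杭州西湖区"]

# "*" matches any sequence of characters, including an empty one.
star_hits = [t for t in terms if fnmatchcase(t, "杭州*")]

# "??" matches exactly two characters, so only four-character terms qualify.
two_char_hits = [t for t in terms if fnmatchcase(t, "杭州??")]

print(star_hits)
print(two_char_hits)
```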

6. range query

Range query: gte means greater than or equal to, lte means less than or equal to.

POST es_user/_search
{
"query": {
"range": {
"age": {
"gte": 32,
"lte": 36
}
}
}
}

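7. Pagination

from + size: the query runs first (match_all here), then from gives the zero-based offset of the first hit to return (0 is the first hit) and size sets the number of hits per page (two here).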

POST es_user/_search
{
"from": 0,
"size": 2,
"query": {
"match_all": {

}
}
}
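
Because from is a zero-based offset rather than a page number, a small helper is often used to convert between the two. The sketch below assumes pages are numbered from 1; `page_request` is a hypothetical helper that only builds the request body and does not talk to Elasticsearch.

```python
def page_request(page, size, query=None):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    return {
        "from": (page - 1) * size,  # zero-based offset of the first hit
        "size": size,               # hits per page
        "query": query if query is not None else {"match_all": {}},
    }

first_page = page_request(1, 2)
third_page = page_request(3, 2)
print(first_page)
print(third_page)
```

With two hits per page, page 1 starts at offset 0 and page 3 starts at offset 4.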

8. Compound queries

must: intersection. Every clause under must has to match.

GET es_user/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"address": {
"value": "浙江省"
}
}
},
{
"range": {
"balance": {
"gte": 5600,
"lte": 6000
}
}
}
]
}
}
}

must_not: negation. None of the clauses under must_not may match (here, address must not match "河北河南" and age must not fall between 30 and 36 inclusive).

GET es_user/_search
{
"query": {
"bool": {
"must_not": [
{
"match": {
"address": "河北河南"
}
},
{
"range": {
"age": {
"gte": 30,
"lte": 36
}
}
}

]
}
}
}

should: union. A document matches if it satisfies at least one of the clauses.

GET es_user/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"address": "河南河北"
}
},
{
"range": {
"age": {
"gte": 30,
"lte": 36
}
}
}

]
}
}
}
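
The three clause types can be summarized as simple boolean logic over predicates. The following is a minimal sketch under stated assumptions: the documents and the two predicates are made up, `bool_match` is a hypothetical helper, and scoring and the filter clause are left out entirely.

```python
def bool_match(doc, must=(), must_not=(), should=()):
    """Decide whether doc passes a simplified bool query (no scoring)."""
    if not all(clause(doc) for clause in must):
        return False  # every must clause has to match
    if any(clause(doc) for clause in must_not):
        return False  # no must_not clause may match
    if should and not must:
        # With no must clause, at least one should clause has to match.
        return any(clause(doc) for clause in should)
    return True

# Made-up documents and predicates, for illustration only.
docs = [
    {"id": 1, "address": "浙江省", "age": 25},
    {"id": 2, "address": "河北", "age": 32},
    {"id": 3, "address": "浙江省", "age": 35},
    {"id": 4, "address": "河北", "age": 40},
]
in_zhejiang = lambda d: d["address"] == "浙江省"
age_30_to_36 = lambda d: 30 <= d["age"] <= 36
clauses = [in_zhejiang, age_30_to_36]

must_ids = [d["id"] for d in docs if bool_match(d, must=clauses)]
must_not_ids = [d["id"] for d in docs if bool_match(d, must_not=clauses)]
should_ids = [d["id"] for d in docs if bool_match(d, should=clauses)]
print(must_ids, must_not_ids, should_ids)
```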

9. Highlighting

Highlighting a single field

GET es_user/_search
{
"query": {
"match": {
"address": "浙江"
}
},
"highlight": {
"fields": {
"address": {}
},
"pre_tags": "<span style='color:red'>",
"post_tags": "</span>"
}
}

Highlighting multiple fields

GET es_user/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"address": "浙江"
}
},
{
"match": {
"email": "hattiebond"
}
}
]
}
},
"highlight": {
"fields": {
"address": {},
"email": {}
},
"pre_tags": "<font color='red'>",
"post_tags": "</font>"
}
}

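10. boosting query

Factors that affect a document's score:

1. The more often the query term appears in the document, the higher the score.

2. The shorter the matched field content, the higher the score. For example, when searching for 黄花鱼, a field whose entire content is 黄花鱼 scores higher.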

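positive, negative, and negative_boost must all be present; otherwise the request fails with status 400.

negative_boost multiplies the _score of documents that also match the negative query, which changes their default ranking. The Elasticsearch documentation specifies a value between 0 and 1.0, which lowers those scores; a value above 1 would raise them instead.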

GET es_user/_search
{
"query": {
"boosting": {
"positive": {
"match": {
"address": "浙江省"
}
},
"negative": {
"match": {
"job": "后端"
}
},
"negative_boost": 0.5
}
}
}
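
The effect of negative_boost on a single document can be written out as a one-line rule. This is a simplified sketch: `boosting_score` is a hypothetical helper, and the positive scores used here are invented numbers rather than real Elasticsearch scores.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Final score of a document that matched the positive query."""
    if matches_negative:
        # Documents that also match the negative query are scaled.
        return positive_score * negative_boost
    return positive_score

# Invented positive scores, for illustration only.
plain = boosting_score(2.0, matches_negative=False, negative_boost=0.5)
demoted = boosting_score(2.0, matches_negative=True, negative_boost=0.5)
print(plain, demoted)
```

A document matching the negative query is still returned; it only ranks lower than it otherwise would.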

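11. Filtering

Filtering removes unwanted documents from the query results.

Both queries and filters narrow the result set, but a query affects document scores and ranking, while a filter does not. If you need to narrow the results without the condition influencing the score, put the condition in a filter clause instead of a query clause.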

A filter clause does not affect the score:

GET /es_user/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"address": "杭州人"
}
}

],
"filter": [
{
"term": {
"job": "工程师"
}
}
]
}
}
}

12. Sorting

desc: descending; asc: ascending

GET /es_user/_search
{

"query": {
"bool": {
"must": [
{
"match": {
"address": "杭州人"
}
}

],
"filter": [
{
"term": {
"job": "工程师"
}
}
]
}

},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}

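13. Aggregations

Elasticsearch offers many kinds of aggregations. The two most commonly used are metrics and buckets.

Metrics

Once the data has been grouped, we usually run a computation over each group, such as average, maximum, minimum, or sum. In ES these are called metrics.

Commonly used metric aggregations:

Avg Aggregation: average
Max Aggregation: maximum
Min Aggregation: minimum
Percentiles Aggregation: percentiles
Stats Aggregation: returns avg, max, min, sum, and count together
Sum Aggregation: sum
Top hits Aggregation: the top matching documents
Value Count Aggregation: number of values
...

# Average, maximum, minimum, and sum of age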
GET /es_user/_search
{
"query": {
"match": {
"address": "杭州"
}
},
"aggs": {
"age_avg": {
"avg": {
"field": "age"
}
},
"max_age":{
"max": {
"field": "age"
}
},
"min_age":{
"min": {
"field": "age"
}
},
"sum_age":{
"sum": {
"field": "age"
}
}
}
}
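
Conceptually, each metric aggregation reduces the values of one field across the matching documents to a single number. The sketch below reproduces the four metrics from the request above in plain Python over a made-up list of ages; the result keys simply reuse the aggregation names from the request.

```python
# Made-up ages of the documents matched by the query, for illustration only.
ages = [25, 32, 35, 40]

metrics = {
    "age_avg": sum(ages) / len(ages),  # avg aggregation
    "max_age": max(ages),              # max aggregation
    "min_age": min(ages),              # min aggregation
    "sum_age": sum(ages),              # sum aggregation
}
print(metrics)
```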

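Buckets

A bucket groups documents by some criterion; each group is called a bucket in ES. For example, documents can be grouped by job.

size sets how many buckets (distinct values) to return.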

GET /es_user/_search
{
"query": {
"match": {
"address": "杭州市"
}
},
"aggs": {
"job_group": {
"terms": {
"field": "job.keyword",
"size": 10
}
}
}
}
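
A terms aggregation behaves much like counting occurrences of each distinct value and keeping the most frequent ones. As a rough local analogy, assuming a made-up list of job values, Python's `collections.Counter` gives the same shape of result:

```python
from collections import Counter

# Made-up job values of the matching documents, for illustration only.
jobs = ["工程师", "工程师", "工程师", "后端", "后端", "教师"]

size = 2  # plays the role of "size" in the terms aggregation
buckets = Counter(jobs).most_common(size)
print(buckets)
```

Each tuple corresponds to one bucket: the distinct value and the number of documents that fell into it.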
