Contents:
I. Basic Concepts
II. Data Generation
III. Query Methods
1) Match query
2) Multi_match query
3) Match Phrase query
4) Fuzzy query
5) Wildcard query
6) Term query
7) Sorted query
8) Sorted, paginated query
9) Range query
10) Filter query
11) Multiple Filters query
12) AND query
IV. Querying from Python
———————————————————————————————————————
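I. Basic Concepts
Index: a collection of documents, roughly comparable to a topic in Kafka.
Type: a logical category inside an index. In Elasticsearch 5.x, the version used here, an index can define one or more types; the author loosely compares a type to a partition in Kafka. For a blog, for example, all data could be stored in one index, with one type for the posts and another type for the comments.
Document: the actual data record; every document must belong to a specific type.
Shards and Replicas: if an index occupies 1 TB of disk space, it may not fit on a single node, and even if it did, queries would respond slowly. Elasticsearch therefore splits an index into shards (comparable to blocks in HDFS). The number of shards can be configured, and so can the number of replicas, which are copies of each shard.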
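The shard and replica arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the values (1 TB, 5 primary shards, 1 replica) are assumptions, not settings taken from the cluster used in this post.

```python
def total_shards(primary_shards, replicas_per_primary):
    """Each primary shard exists once, plus one copy per configured replica."""
    return primary_shards * (1 + replicas_per_primary)


def shard_size_gb(index_size_gb, primary_shards):
    """Approximate size of one primary shard if data is spread evenly."""
    return index_size_gb / primary_shards


# A 1 TB index split into 5 primary shards, each with 1 replica (assumed values).
print(total_shards(5, 1))      # 10 shards in total across the cluster
print(shard_size_gb(1024, 5))  # roughly 204.8 GB per primary shard
```

Splitting the index this way is what lets each node hold and search only a part of the data.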
II. Data Generation
Maven dependency
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>5.2.0</version>
</dependency>
Java code
package cn.orcale.com.es;

import java.net.InetAddress;
import java.util.Random;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

/**
 * Bulk-inserts 1000 randomly generated car sales documents
 * into index "car_shop", type "sales".
 *
 * @author yuhui
 */
public class insetDatas {

    @SuppressWarnings({ "resource" })
    public static void main(String[] args) throws Exception {
        // Candidate values; each document picks one of each at random
        String[] brand = {"奔驰", "宝马", "宝马Z4", "奔驰C300", "保时捷", "奔奔"};
        int[] product_price = {10000, 20000, 30000, 40000};
        int[] sale_price = {10000, 20000, 30000, 40000};
        String[] sale_date = {"2017-08-21", "2017-08-22", "2017-08-23", "2017-08-24"};
        String[] colour = {"white", "black", "gray", "red"};
        String[] info = {"我很喜欢", "Very Nice", "不错, 不错 ", "我以后还会来的"};
        int num = 0;
        Random random = new Random();

        // Connect to the local cluster over the transport port (9300)
        Settings settings = Settings.builder().put("cluster.name", "elasticsearch")
                .build();
        @SuppressWarnings("unchecked")
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("localhost"), 9300));

        BulkRequestBuilder bulkRequestBuilder = client.prepareBulk();
        for (int i = 0; i < 1000; i++) {
            num++;
            String brandTemp = brand[random.nextInt(brand.length)];
            // Build one index request; the document id is the running number
            IndexRequestBuilder indexRequestBuilder = client.prepareIndex("car_shop", "sales", num + "").setSource(
                    XContentFactory.jsonBuilder().startObject()
                            .field("num", num)
                            .field("brand", brandTemp)
                            .field("colour", colour[random.nextInt(colour.length)])
                            .field("product_price", product_price[random.nextInt(product_price.length)])
                            .field("sale_price", sale_price[random.nextInt(sale_price.length)])
                            .field("sale_date", sale_date[random.nextInt(sale_date.length)])
                            .field("info", brandTemp + info[random.nextInt(info.length)])
                            .endObject());
            bulkRequestBuilder.add(indexRequestBuilder);
        }
        // Send all 1000 index requests in a single bulk call
        bulkRequestBuilder.get();
        System.out.println("Insert complete");
        client.close();
    }
}
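For readers who prefer Python, the same test data can be generated as plain dictionaries. This is a rough sketch, not a port of the Java program: the function name `generate_actions` and the `seed` parameter are made up here, and the dictionary shape (`_index`, `_type`, `_id`, `_source`) is modelled on the action format that the `elasticsearch.helpers.bulk` helper of the official Python client accepts. Sending the actions to a cluster is left out, so the snippet runs without Elasticsearch.

```python
import random

BRANDS = ["奔驰", "宝马", "宝马Z4", "奔驰C300", "保时捷", "奔奔"]
PRICES = [10000, 20000, 30000, 40000]
SALE_DATES = ["2017-08-21", "2017-08-22", "2017-08-23", "2017-08-24"]
COLOURS = ["white", "black", "gray", "red"]
INFOS = ["我很喜欢", "Very Nice", "不错, 不错 ", "我以后还会来的"]


def generate_actions(count, seed=None):
    """Build `count` bulk actions for index car_shop, type sales."""
    rng = random.Random(seed)  # a fixed seed makes the data reproducible
    actions = []
    for num in range(1, count + 1):
        brand = rng.choice(BRANDS)
        actions.append({
            "_index": "car_shop",
            "_type": "sales",
            "_id": str(num),
            "_source": {
                "num": num,
                "brand": brand,
                "colour": rng.choice(COLOURS),
                "product_price": rng.choice(PRICES),
                "sale_price": rng.choice(PRICES),
                "sale_date": rng.choice(SALE_DATES),
                "info": brand + rng.choice(INFOS),
            },
        })
    return actions


actions = generate_actions(1000, seed=42)
print(len(actions), actions[0]["_source"]["brand"])
```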
III. Query Methods
First, check how many documents were indexed:
GET /car_shop/sales/_count
More operations: https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html
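1) Match query
Explanation: a match query finds the documents whose given field (one of the fields in "_source") matches the query text.
Example: find the sales whose sale_date is 2017-08-21 and return the first 10 hits
# 1) Match query
GET /car_shop/sales/_search
{
  "query": {
    "match": {
      "sale_date": {
        "query": "2017-08-21"
      }
    }
  },
  "_source": ["brand", "product_price", "sale_price", "sale_date", "colour"],
  "size": 10
}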
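On a text field such as brand, a match query analyzes the query text into tokens and, by default, accepts a document if any one of those tokens appears in the field. The sketch below imitates that behaviour in plain Python. It rests on two assumptions: that brand is a text field using the standard analyzer (which, as far as I know, emits each Chinese character as its own token and lowercases Latin text), and that the regular expression below is a good enough stand-in for that analyzer. The names `analyze` and `match_any` are invented for this example.

```python
import re

# Simplified stand-in for the standard analyzer (an assumption, not the real thing):
# every CJK character is one token, every run of Latin letters/digits is one token.
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def analyze(text):
    return [token.lower() for token in TOKEN.findall(text)]


def match_any(query_text, field_text):
    """True if the field shares at least one token with the query (OR semantics)."""
    return bool(set(analyze(query_text)) & set(analyze(field_text)))


brands = ["奔驰", "宝马", "宝马Z4", "奔驰C300", "保时捷", "奔奔"]
print([b for b in brands if match_any("奔驰", b)])
# ['奔驰', '奔驰C300', '奔奔']
```

Note that 奔奔 is returned for the query 奔驰 because the two share the token 奔. This is the looseness that the phrase query further down tightens.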
2) Multi_match query
Explanation: a multi_match query runs the same match against several fields of "_source"; a document qualifies as soon as any one of the fields matches.
Example: find the sales whose product_price or sale_price is 40000
# 2) Multi_match query
GET /car_shop/sales/_search
{
  "query": {
    "multi_match": {
      "query": "40000",
      "fields": ["product_price", "sale_price"]
    }
  },
  "_source": ["brand", "product_price", "sale_price", "sale_date", "colour"],
  "size": 1000
}
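3) Match Phrase query
Explanation: Match Phrase addresses a limitation of match. A match query analyzes the query text into terms, returns every document that contains one or more of them, and ranks the results with Lucene's relevance scoring (TF/IDF based). Match Phrase is stricter: the terms must appear together, in the same order. The slop parameter relaxes this requirement; with slop: 2, for instance, a phrase such as joox music can also match joox word music.
Example: find the sales whose info contains the phrase C300
# 3) Match Phrase query
GET /car_shop/sales/_search
{
  "query": {
    "match_phrase": {
      "info": "C300"
    }
  },
  "_source": ["brand", "info"],
  "size": 1000
}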
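To make the role of slop more concrete, here is a deliberately simplified model of phrase matching. It only handles terms that appear in the query's order and treats slop as the total number of extra positions allowed between them; Lucene's real slop calculation is more general (it can also account for reordered terms), so take this as a sketch of the idea rather than the actual algorithm. The function name `phrase_match` is made up for this example.

```python
def phrase_match(doc_tokens, query_tokens, slop=0):
    """In-order phrase match; slop is the total number of gap positions allowed."""

    def search(q_index, prev_pos, budget):
        if q_index == len(query_tokens):
            return True  # every query term has been placed
        for pos, token in enumerate(doc_tokens):
            if token != query_tokens[q_index]:
                continue
            if prev_pos is None:
                gap = 0                    # first term may start anywhere
            elif pos > prev_pos:
                gap = pos - prev_pos - 1   # tokens skipped since the previous term
            else:
                continue                   # must come after the previous term
            if gap <= budget and search(q_index + 1, pos, budget - gap):
                return True
        return False

    return search(0, None, slop)


doc = ["joox", "word", "music"]
print(phrase_match(doc, ["joox", "music"]))          # False: "word" sits in between
print(phrase_match(doc, ["joox", "music"], slop=2))  # True: the gap of 1 fits in slop 2
```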
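4) Fuzzy query
Explanation: fuzzy matching tolerates spelling mistakes by accepting terms within a small edit distance of the search term. It is available as the fuzzy query shown here and through the fuzziness option of match and multi_match.
Example: fuzzy-search the brand field for 奔
# 4) Fuzzy query
GET /car_shop/sales/_search
{
  "query": {
    "fuzzy": {
      "brand": "奔"
    }
  },
  "_source": ["brand", "product_price", "sale_price", "sale_date", "colour"],
  "size": 1000
}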
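The idea behind fuzzy matching is edit distance: how many single-character changes turn one term into another. The sketch below uses plain Levenshtein distance together with the length-based AUTO rule from the Elasticsearch documentation (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). It is a simplification: the real fuzzy query also treats a swap of two adjacent characters as a single edit by default, which plain Levenshtein does not. All function names are invented for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Allowed edits under the AUTO rule, based on the length of the query term."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query_term, indexed_term):
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)


print(fuzzy_match("blak", "black"))  # True: one missing letter is tolerated
print(fuzzy_match("奔", "宝"))        # False: one-character terms must match exactly
```

Under this rule a one-character search term such as 奔 gets no tolerance at all, so the fuzzy query in this section behaves much like an exact term lookup.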
5) Wildcard query
Explanation: a wildcard query lets you specify a pattern to match instead of a complete term (terms are covered in the Term query section below).
Example: find the sales whose brand contains a term starting with 宝
# 5) Wildcard query
GET /car_shop/sales/_search
{
  "query": {
    "wildcard": {
      "brand": "宝*"
    }
  },
  "_source": ["brand", "product_price", "sale_price", "sale_date", "colour"],
  "size": 1000
}
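A wildcard pattern is compared against the individual terms stored in the index, not against the whole field value: * stands for any sequence of characters and ? for exactly one. Python's `fnmatch` module uses the same two symbols, so it can serve as a rough stand-in. Two caveats: `fnmatch` additionally understands `[...]` character sets, which should not be assumed for Elasticsearch, and the token lists below are what I would expect the standard analyzer to produce, not output captured from a cluster.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern, terms):
    """True if any single indexed term matches the wildcard pattern."""
    return any(fnmatchcase(term, pattern) for term in terms)


bmw_z4_terms = ["宝", "马", "z4"]  # assumed tokens for the brand 宝马Z4

print(wildcard_match("宝*", bmw_z4_terms))  # True: the term 宝 starts with 宝
print(wildcard_match("z?", bmw_z4_terms))   # True: z followed by one character
print(wildcard_match("宝?", bmw_z4_terms))  # False: no term is 宝 plus one more character
```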
6) Term query
Explanation: a term query matches one exact term in the index; the search value itself is not analyzed.
Example: find the sales whose brand contains the term 宝 and return the first 5 hits
# 6) Term query
GET /car_shop/sales/_search
{
  "query": {
    "term": {
      "brand": "宝"
    }
  },
  "_source": ["brand", "product_price", "sale_price", "sale_date", "colour"],
  "size": 5
}
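Because the search value of a term query is not analyzed, it has to equal one of the indexed tokens exactly. The sketch below shows why the example searches for 宝 rather than 宝马. It assumes brand is a text field indexed with the standard analyzer, and it approximates that analyzer with a small regular expression; both the approximation and the function names are my own.

```python
import re

# Rough approximation of the standard analyzer (an assumption):
# one token per CJK character, one lowercased token per Latin/digit run.
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def indexed_terms(text):
    return [token.lower() for token in TOKEN.findall(text)]


def term_query(value, field_text):
    """The value is compared as-is against the indexed terms; no analysis happens."""
    return value in indexed_terms(field_text)


print(indexed_terms("宝马Z4"))      # ['宝', '马', 'z4']
print(term_query("宝", "宝马Z4"))   # True: 宝 is one of the indexed terms
print(term_query("宝马", "宝马Z4")) # False: no single term equals 宝马
print(term_query("Z4", "宝马Z4"))   # False: the indexed term is lowercase z4
```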
7) Sorted query
Explanation: sorts the results; several sort levels are supported.
Example: match the sales whose brand is 宝马 and sort them by sale_date, then by sale_price
[asc = ascending, desc = descending]
# 7) Sorted query
GET /car_shop/sales/_search
{
  "query": {
    "match": {
      "brand": "宝马"
    }
  },
  "sort": [
    {"sale_date": {"order": "desc"}},
    {"sale_price": {"order": "desc"}}
  ],
  "_source": ["brand", "sale_date", "sale_price", "colour"],
  "size": 1000
}
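Multi-level sorting means the second key only decides the order among documents that tie on the first. The following sketch reproduces that on a handful of made-up records using Python's stable sort; the helper `multi_sort` and the sample data are assumptions for illustration only.

```python
def multi_sort(docs, sort_spec):
    """Sort by several (field, order) pairs; earlier pairs take precedence."""
    result = list(docs)
    # Apply the least significant key first; the stable sort keeps that order for ties.
    for field, order in reversed(sort_spec):
        result.sort(key=lambda doc: doc[field], reverse=(order == "desc"))
    return result


docs = [
    {"id": "A", "sale_date": "2017-08-21", "sale_price": 30000},
    {"id": "B", "sale_date": "2017-08-23", "sale_price": 10000},
    {"id": "C", "sale_date": "2017-08-23", "sale_price": 40000},
    {"id": "D", "sale_date": "2017-08-22", "sale_price": 20000},
]

ordered = multi_sort(docs, [("sale_date", "desc"), ("sale_price", "desc")])
print([doc["id"] for doc in ordered])  # ['C', 'B', 'D', 'A']
```

B and C share the same sale_date, so sale_price breaks the tie and puts C first.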
8) Sorted, paginated query
size is the number of hits per page; from is the zero-based offset of the first hit to return
GET /car_shop/sales/_search?size=5
GET /car_shop/sales/_search?size=5&from=0
GET /car_shop/sales/_search?size=5&from=5
# 8) Sorted, paginated query
GET /car_shop/sales/_search?size=5&from=5
{
  "query": {
    "match": {
      "brand": "宝马"
    }
  },
  "sort": [
    {"num": {"order": "desc"}}
  ],
  "_source": ["num", "brand", "colour"]
}
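Translating a page number into from and size is simple arithmetic, sketched below. The helper name `page_params` and the convention that pages start at 1 are assumptions of this example.

```python
def page_params(page, size):
    """Return the from/size pair for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}


print(page_params(1, 5))  # {'from': 0, 'size': 5}
print(page_params(2, 5))  # {'from': 5, 'size': 5}
```

The second call corresponds to the request above with size=5&from=5, which is the second page of five hits.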
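9) Range query
Explanation: specifies a range for a field and returns the documents whose value falls inside it.
Example: find the sales dated from 2017-08-21 through 2017-08-23 (both ends included), newest first
# 9) Range query
GET /car_shop/sales/_search
{
  "query": {
    "range": {
      "sale_date": {
        "gte": "2017-08-21",
        "lte": "2017-08-23"
      }
    }
  },
  "sort": [
    {"sale_date": {"order": "desc"}}
  ],
  "_source": ["brand", "sale_date", "sale_price", "colour"],
  "size": 1000
}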
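The range operators map directly onto comparisons: gte and lte include the boundary, while gt and lt exclude it. A small sketch on ISO date strings (the helper `in_range` is made up for this example, and it needs Python 3.7+ for `date.fromisoformat`):

```python
import operator
from datetime import date

# Range operator names mapped to the comparison they stand for.
OPS = {
    "gte": operator.ge,
    "gt": operator.gt,
    "lte": operator.le,
    "lt": operator.lt,
}


def in_range(value, bounds):
    """Check an ISO date string against bounds such as {"gte": ..., "lte": ...}."""
    day = date.fromisoformat(value)
    return all(OPS[name](day, date.fromisoformat(limit))
               for name, limit in bounds.items())


bounds = {"gte": "2017-08-21", "lte": "2017-08-23"}
print(in_range("2017-08-23", bounds))  # True: lte includes the upper bound
print(in_range("2017-08-24", bounds))  # False: outside the range
print(in_range("2017-08-23", {"gte": "2017-08-21", "lt": "2017-08-23"}))  # False: lt excludes it
```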
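9) Filter exact match (SELECT sale_date FROM sales WHERE sale_date = "2017-08-21")
Explanation: constant_score wraps a filter, so every matching document gets the same score and no relevance ranking is computed.
# 9) Filter exact match (SELECT sale_date FROM sales WHERE sale_date = "2017-08-21")
GET /car_shop/sales/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "sale_date": "2017-08-21"
        }
      }
    }
  }
}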
10) Filter query
Example: match the sales whose brand is 宝马, then filter them down to sale dates from 2017-08-21 through 2017-08-22
# 10) Filter query
GET /car_shop/sales/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "宝马",
          "fields": ["brand"]
        }
      },
      "filter": {
        "range": {
          "sale_date": {
            "gte": "2017-08-21",
            "lte": "2017-08-22"
          }
        }
      }
    }
  },
  "sort": [
    {"sale_date": {"order": "desc"}}
  ],
  "_source": ["brand", "sale_date", "sale_price", "colour"],
  "size": 1000
}
11) Multiple Filters query
Example: match the sales whose brand contains 宝, filtering with a nested bool that requires a sale date from 2017-08-21 through 2017-08-22 (must) and adds a should clause on the colour white
# 11) Multiple Filters query
GET /car_shop/sales/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "宝",
          "fields": ["brand"]
        }
      },
      "filter": {
        "bool": {
          "must": {
            "range": {
              "sale_date": {
                "gte": "2017-08-21",
                "lte": "2017-08-22"
              }
            }
          },
          "should": {
            "term": {
              "colour": "white"
            }
          }
        }
      }
    }
  },
  "sort": [
    {"sale_date": {"order": "desc"}}
  ],
  "_source": ["brand", "sale_date", "sale_price", "colour"],
  "size": 1000
}
12) AND query
Explanation: every term clause in the filter array must match, like conditions joined with AND.
# 12) AND query
GET /car_shop/sales/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "colour": "white" }},
        { "term": { "brand": "奔" }},
        { "term": { "sale_price": "20000" }},
        { "term": { "sale_date": "2017-08-21" }}
      ]
    }
  }
}
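When the conditions come from a program rather than being typed by hand, the filter array can be built from a dictionary. The helper below is a hypothetical convenience function, not part of any client library; it only produces the request body as a Python dict. The colour value white is my choice for the example, picked because it is one of the colours the data generator writes.

```python
import json


def and_query(conditions):
    """Build a bool/filter body in which every field must equal its value."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {field: value}}
                           for field, value in conditions.items()]
            }
        }
    }


body = and_query({
    "colour": "white",
    "brand": "奔",
    "sale_price": "20000",
    "sale_date": "2017-08-21",
})
print(json.dumps(body, ensure_ascii=False, indent=2))
```

A dict built this way can be passed as the body argument of the Python client's search call, which is covered in the next section.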
IV. Querying from Python
from elasticsearch import Elasticsearch

es = Elasticsearch(['59.110.53.38:9200'])
query={ ··· }
res = es.search(index='app_store', body=query, doc_type='hotsearch_related')
# Print up to the first 10 hits (fewer if the search returned fewer)
for hit in res['hits']['hits'][:10]:
    print(hit["_source"])