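Elasticsearch is a search engine that takes over search traffic when the database can no longer withstand the search load coming from the front end. It is one option in a high-concurrency architecture. Below are some of the rules Elasticsearch follows; knowing them helps us get familiar with it quickly and solve some problems in practice.
01》Creating an index without tokenization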
URL: es_index_test
{
"settings": {
"index": {
"number_of_shards": "4",
"number_of_replicas": "1"
}
},
"mappings": {
"es_index_type_test": {
"properties": {
"productId": {
"type": "text"
},
"productName": {
"type": "keyword",
"index": "true"
}
}
}
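}
}
Note: The "productName" field is mapped as keyword, so it is indexed without tokenization. A wildcard query against it gives an effect similar to LIKE in MySQL. For example, the document {"productId":10001,"productName":"山鸡图"} can be found with {"query":{"wildcard":{"productName":"*鸡*"}}}.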
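If existing code already passes around MySQL LIKE patterns, they have to be rewritten into wildcard syntax before they reach Elasticsearch. The following is a minimal sketch of that translation, using only the JDK. The class name LikeToWildcard and the method toWildcard are hypothetical, and LIKE escape sequences such as \% are deliberately not handled.

```java
final class LikeToWildcard {

    /** Translates a SQL LIKE pattern into Elasticsearch wildcard syntax. */
    static String toWildcard(String likePattern) {
        StringBuilder out = new StringBuilder(likePattern.length() + 4);
        for (int i = 0; i < likePattern.length(); i++) {
            char c = likePattern.charAt(i);
            switch (c) {
                case '%':
                    out.append('*');   // LIKE % -> any sequence of characters
                    break;
                case '_':
                    out.append('?');   // LIKE _ -> exactly one character
                    break;
                case '*':
                case '?':
                case '\\':
                    out.append('\\').append(c); // keep wildcard metacharacters literal
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The LIKE pattern for "contains 鸡" becomes the wildcard pattern used in the note above.
        System.out.println(toWildcard("%鸡%"));
    }
}
```

Patterns that begin with * make Elasticsearch walk a large part of the term dictionary, so they are best kept for fields with a modest number of distinct values.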
02》Creating an index with tokenization
URL: es_index_test
{
"settings": {
"index": {
"number_of_shards": "4",
"number_of_replicas": "1"
}
},
"mappings": {
"es_index_type_test": {
"properties": {
"productId": {
"type": "text"
},
"productName": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}
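}
}
Note: The "productName" field is mapped as text, so it is tokenized when it is indexed. By default Elasticsearch treats every Chinese character as one token: for the document {"productId":10001,"productName":"山鸡图"}, the value is split into the three tokens "山", "鸡" and "图". Installing the ik analyzer plugin gives word-level splitting: the same value is split into the two tokens "山鸡" and "图". Which tokens a Chinese phrase is split into is decided by the ik dictionary, and that dictionary can be adjusted to fit the actual data.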
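To make the difference between the two behaviours concrete, here is a small JDK-only sketch. It is not the ik analyzer's actual algorithm: dictionarySplit is a plain greedy longest-match over a hypothetical one-word dictionary, and all names in it are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TokenizeSketch {

    /** One token per character, as the note describes for the default analyzer on Chinese text. */
    static List<String> perCharacter(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /**
     * Greedy longest-match split against a dictionary. Characters that start no
     * dictionary word become single-character tokens. Works on UTF-16 chars,
     * which is enough for the BMP characters used here.
     */
    static List<String> dictionarySplit(String text, Set<String> dictionary, int maxWordLength) {
        int window = Math.max(1, maxWordLength);
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + window);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("山鸡")); // hypothetical dictionary
        System.out.println(perCharacter("山鸡图"));
        System.out.println(dictionarySplit("山鸡图", dictionary, 4));
    }
}
```

Adding or removing a word in the dictionary set changes the output of dictionarySplit, which mirrors why tuning the ik dictionary changes what can be found later.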
03》Creating a case-insensitive index
URL: es_index_test
{
"settings": {
"index": {
"number_of_shards": "10",
"number_of_replicas": "3"
},
"analysis": {
"normalizer": {
"es_normalizer": {
"filter": [
"lowercase",
"asciifolding"
],
"type": "custom"
}
}
}
},
"mappings": {
"es_index_type_test": {
"properties": {
"productId": {
"type": "text"
},
"productName": {
"type": "keyword",
"normalizer": "es_normalizer",
"index": "true"
}
}
}
}
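}
Note: The "productName" field is indexed through the es_normalizer normalizer (lowercase plus asciifolding), so matching on it ignores case.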
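The effect of the two filters can be approximated in plain Java, which is handy for checking in a unit test what value will end up in the index. This is only a rough sketch: the real asciifolding filter covers more characters than stripping combining marks does (ß and ø, for example), and the class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

final class NormalizerSketch {

    /** Approximates lowercase + asciifolding for accented Latin letters. */
    static String normalize(String value) {
        // Decompose accented letters into base letter + combining mark, e.g. é -> e + U+0301.
        String decomposed = Normalizer.normalize(value, Normalizer.Form.NFD);
        // Drop the combining marks, leaving the base letters.
        String folded = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT avoids locale-specific case rules such as the Turkish dotless i.
        return folded.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("iPhone").equals(normalize("IPHONE")));
        System.out.println(normalize("Café"));
    }
}
```

Because both spellings reduce to the same stored value, a query for either one finds the same documents.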
04》Tokenization query
URL: es_index_test/_analyze
- Split with the ik analyzer in "ik_max_word" mode
{
"analyzer":"ik_max_word",
"text":"中华人民共和国"
}
- Result
{
"tokens": [
{
"end_offset": 7,
"start_offset": 0,
"position": 0,
"type": "CN_WORD",
"token": "中华人民共和国"
},
{
"end_offset": 4,
"start_offset": 0,
"position": 1,
"type": "CN_WORD",
"token": "中华人民"
},
{
"end_offset": 2,
"start_offset": 0,
"position": 2,
"type": "CN_WORD",
"token": "中华"
},
{
"end_offset": 3,
"start_offset": 1,
"position": 3,
"type": "CN_WORD",
"token": "华人"
},
{
"end_offset": 7,
"start_offset": 2,
"position": 4,
"type": "CN_WORD",
"token": "人民共和国"
},
{
"end_offset": 4,
"start_offset": 2,
"position": 5,
"type": "CN_WORD",
"token": "人民"
},
{
"end_offset": 7,
"start_offset": 4,
"position": 6,
"type": "CN_WORD",
"token": "共和国"
},
{
"end_offset": 6,
"start_offset": 4,
"position": 7,
"type": "CN_WORD",
"token": "共和"
},
{
"end_offset": 7,
"start_offset": 6,
"position": 8,
"type": "CN_CHAR",
"token": "国"
}
]
}
- Split with the ik analyzer in "ik_smart" mode
{
"analyzer":"ik_smart",
"text":"中华人民共和国"
}
- Result
{
"tokens": [
{
"end_offset": 7,
"start_offset": 0,
"position": 0,
"type": "CN_WORD",
"token": "中华人民共和国"
}
]
}
- Elasticsearch default (standard analyzer)
{
"text":"中华人民共和国"
}
- Result
{
"tokens": [
{
"end_offset": 1,
"start_offset": 0,
"position": 0,
"type": "<IDEOGRAPHIC>",
"token": "中"
},
{
"end_offset": 2,
"start_offset": 1,
"position": 1,
"type": "<IDEOGRAPHIC>",
"token": "华"
},
{
"end_offset": 3,
"start_offset": 2,
"position": 2,
"type": "<IDEOGRAPHIC>",
"token": "人"
},
{
"end_offset": 4,
"start_offset": 3,
"position": 3,
"type": "<IDEOGRAPHIC>",
"token": "民"
},
{
"end_offset": 5,
"start_offset": 4,
"position": 4,
"type": "<IDEOGRAPHIC>",
"token": "共"
},
{
"end_offset": 6,
"start_offset": 5,
"position": 5,
"type": "<IDEOGRAPHIC>",
"token": "和"
},
{
"end_offset": 7,
"start_offset": 6,
"position": 6,
"type": "<IDEOGRAPHIC>",
"token": "国"
}
]
}
Note: The three analyzers split the text in different ways, so the tokens they produce differ.
05》Data query - wildcard
URL: es_index_test/es_index_type_test/_search
{
"query":{"wildcard":{"productName": "山鸡图" }}
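}
Note: A wildcard query only behaves like a fuzzy match when the pattern contains wildcard characters, for example *鸡*. Without them, as in the request above, the pattern has to equal the whole term. In Java, build it with the wildcardQuery(String name, String query) method of the QueryBuilders class.
06》Data query - match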
URL:es_index_test/es_index_type_test/_search
{
"query":{"match":{"productName": "山鸡图" }}
}
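Note: The query text is analyzed before matching. With the ik mapping from section 02, "山鸡图" is split into the two tokens "山鸡" and "图"; Elasticsearch looks up each token and returns every document that matches at least one of them. The result may therefore include 山鸡汤 (contains the token "山鸡") and 山虎图 (contains the token "图"). In Java, use the matchQuery(String name, Object text) method of the QueryBuilders class.
07》Data query - term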
URL: es_index_test/es_index_type_test/_search
{
"query":{
"term":{
"productName":"山鸡图"
}
}
}
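Note: A term query does not analyze its input. A document is returned only when the field holds a term exactly equal to the three characters "山鸡图". On the keyword mapping from section 01 that term is the whole field value; on a tokenized text field the indexed terms are the tokens (for example "山鸡" and "图"), so this query would not match. In Java, use the termQuery(String name, Object value) method of the QueryBuilders class.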
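The difference between the term query above and the match query from section 06 is easiest to see on a toy inverted index. The sketch below uses only the JDK. It is a simplified model, not how Elasticsearch is implemented, and the token lists passed to it are assumed analyzer output rather than real ik results.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TermVsMatchSketch {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> index = new HashMap<>();

    /** Indexes a document given the tokens an analyzer produced for it. */
    void add(int docId, List<String> terms) {
        for (String term : terms) {
            index.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** term query: the value is looked up as-is, with no analysis. */
    Set<Integer> term(String value) {
        return index.getOrDefault(value, Collections.<Integer>emptySet());
    }

    /** match query with the default OR operator: any analyzed token may match. */
    Set<Integer> match(List<String> analyzedTokens) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyzedTokens) {
            hits.addAll(term(token));
        }
        return hits;
    }

    public static void main(String[] args) {
        TermVsMatchSketch idx = new TermVsMatchSketch();
        idx.add(1, Arrays.asList("山鸡", "图")); // 山鸡图
        idx.add(2, Arrays.asList("山鸡", "汤")); // 山鸡汤
        idx.add(3, Arrays.asList("山虎", "图")); // 山虎图 (assumed split)
        System.out.println("term  山鸡图 -> " + idx.term("山鸡图"));
        System.out.println("match 山鸡图 -> " + idx.match(Arrays.asList("山鸡", "图")));
    }
}
```

The term lookup finds nothing because no single indexed token equals "山鸡图", while the match lookup returns all three documents. That is why term queries are usually paired with keyword fields.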
08》Data query - terms
URL: es_index_test/es_index_type_test/_search
{
"query":{
"terms":{
"productName":["山鸡图","山虎图"]
}
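}
}
Note: Returns the documents whose productName term equals "山鸡图" or "山虎图". In Java, use the termsQuery(String name, String... values) method of the QueryBuilders class.
09》Deleting the documents matched by a query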
URL:es_index_test/es_index_type_test/_delete_by_query
{
"query":{"wildcard":{"productName": "*鸡*" }}
}
Note: Deletes every document whose product name contains the character "鸡".
10》Java example for Elasticsearch
1、ElasticSearchProperties
package com.jd.ccc.sys.biz.yb.op.notice.config;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
import lombok.Data;
/**
* Configuration parameters for the Elasticsearch search engine.
* The actual values are set in the yml file.
*
* @create 2018-5-10
*
*/
@Data
@Component
@ConfigurationProperties(prefix = "elasticsearch")
public class ElasticSearchProperties {
/**
* Cluster name
*/
private String clusterName;
/**
* Index name
*/
private String indexName;
/**
* Type name
*/
private String typeName;
/**
* Master node, as host:port
*/
private String masterNode;
/**
* Slave nodes, as a comma-separated list of host:port
*/
private String slaveNodes;
}
2、ElasticSearchConfig
package com.jd.ccc.sys.biz.yb.op.notice.config;
import java.net.InetAddress;
import java.net.UnknownHostException;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
/**
*
* Initializes the Elasticsearch client configuration.
*
* @create 2018-5-10
*
*/
@Configuration
@EnableConfigurationProperties(ElasticSearchProperties.class)
public class ElasticSearchConfig {
private static final Logger LOGGER = LoggerFactory.getLogger(ElasticSearchConfig.class);
@Autowired
private ElasticSearchProperties elasticSearchProperties;
private static final String SYS_PROPERTY="es.set.netty.runtime.available.processors";
private static final String CLUSTER_NAME="cluster.name";
private static final String CLIENT_SNIFF="client.transport.sniff";
@Bean(name="elasticSearchCluster")
public Client getClient() {
System.setProperty(SYS_PROPERTY, "false");
Settings settings = Settings.builder().put(CLUSTER_NAME, elasticSearchProperties.getClusterName())
.put(CLIENT_SNIFF, false).build();
TransportClient transportClient = null;
try {
String[] masters = elasticSearchProperties.getMasterNode().split(":");
transportClient = new PreBuiltTransportClient(settings).addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName(masters[0]), Integer.parseInt(masters[1])));
String[] slaveNodes = elasticSearchProperties.getSlaveNodes().split(",");// comma-separated
// iterate over the slave nodes
for (String node : slaveNodes) {
String[] ipPort = node.split(":");// colon-separated host:port
if (ipPort.length == 2) {
transportClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(ipPort[0]),
Integer.parseInt(ipPort[1])));
}
}
return transportClient;
} catch (UnknownHostException e) {
LOGGER.error("Failed to connect the ES client.", e);
return null;
}
}
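}
3、Service-layer operations
/**
* Counts the total number of products matched by the fuzzy search.
*
* @param likeProductName
* keyword for the fuzzy search on product name
* @param type
* product type
* @return total number of records
*
* @create 2018-5-9
*/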
private Integer queryCount(String likeProductName, String type) {
BoolQueryBuilder builder=this.builderQueryData(likeProductName, type);
try {
SearchResponse searchResponse = elasticSearchCluster.prepareSearch(elasticSearchProperties.getIndexName())
.setTypes(elasticSearchProperties.getTypeName()).setQuery(builder)
.setSearchType(SearchType.DEFAULT).get();
SearchHits hits = searchResponse.getHits();
return (int)hits.getTotalHits();
}catch(Exception e) {
LOGGER.error("Server access failure", e);
return 0;
}
}
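/**
* Builds the filter conditions for the fuzzy query.
*
* @param likeProductName
* keyword for the fuzzy search on product name
* @param type
* product type; several types may be separated by commas
* @return the assembled bool query
*
* @create 2018-5-9
*/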
private BoolQueryBuilder builderQueryData(String likeProductName, String type) {
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery(PRODUCT_STATUS, "03"));
if(StringUtils.isNotBlank(likeProductName)) {
boolQueryBuilder.must(QueryBuilders.wildcardQuery(PRODUCT_NAME,"*"+likeProductName+"*"));
}
// type is not blank
if (StringUtils.isNotBlank(type)) {
String[] types = type.split(",");
if (types.length == 1) {
boolQueryBuilder.must(QueryBuilders.matchQuery(INST_TYPE,type));
} else {
boolQueryBuilder.must(QueryBuilders.termsQuery(INST_TYPE, types));
}
}
LOGGER.debug("wild card query-->{}",boolQueryBuilder.toString());
return boolQueryBuilder;
}
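/**
* Queries the product list with a fuzzy search.
* @param likeProductName keyword for the fuzzy search on product name
* @param type product type; several types may be separated by commas
* @param startIndex offset of the first record to return
* @param pageSize page size
* @return the matching documents as JSON strings
*
* @create 2018-5-9
*/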
private List<String> queryData(String likeProductName, String type, int startIndex, int pageSize) {
List<String> resultList = new ArrayList<>();
BoolQueryBuilder builder=this.builderQueryData(likeProductName, type);
try {
SearchResponse searchResponse = elasticSearchCluster.prepareSearch(elasticSearchProperties.getIndexName())
.setTypes(elasticSearchProperties.getTypeName()).setQuery(builder)
.setSearchType(SearchType.DEFAULT).setFrom(startIndex).setSize(pageSize).get();
SearchHit[] hits = searchResponse.getHits().getHits();
for (SearchHit hit : hits) {
resultList.add(hit.getSourceAsString());
}
}catch(Exception e) {
LOGGER.error("Server access failure", e);
}
return resultList;
}
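queryData takes a raw startIndex, while callers usually think in page numbers. A small helper along the following lines could do the conversion and reject windows that Elasticsearch would refuse anyway. It is a sketch under assumptions: PageWindow and the 1-based pageNumber parameter are hypothetical, and 10000 is the default of the index.max_result_window setting, which a given index may override.

```java
final class PageWindow {

    // Default value of index.max_result_window; from + size may not exceed it.
    static final int MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number into the offset passed to setFrom(...). */
    static int startIndex(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // Use long arithmetic so that large page numbers cannot overflow int.
        long from = (long) (pageNumber - 1) * pageSize;
        if (from + pageSize > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("page is beyond the result window");
        }
        return (int) from;
    }

    public static void main(String[] args) {
        // Page 3 with 20 records per page starts at offset 40.
        System.out.println(startIndex(3, 20));
    }
}
```

A call such as queryData(name, type, PageWindow.startIndex(page, size), size) would then keep the paging arithmetic in one place.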
















