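Preface
After collecting a large volume of logs, where should they be stored so that their contents can be searched? MySQL?
1. MySQL full-text search is inefficient on massive data sets
Suppose MySQL stores 10 million log records, and the log_details field of each record holds 128 characters.
A user wants to find the records whose log_details field contains the keyword "ajax".
select * from tb_log where log_details like "%ajax%";
A LIKE pattern with a leading wildcard cannot use the index on the column;
every execution of this SQL statement has to visit each row of tb_log one by one and, worse, for each row it must also scan the full text stored in log_details
to decide whether that text contains "ajax". In the worst case this takes on the order of 10 million * 128 (about 1.28 billion) character comparisons.
So if the text in log_details has to be searchable, storing massive amounts of log data in MySQL is not a good fit.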
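2. MySQL LIKE queries cannot perform tokenized search
A user may not have a precise goal in mind: after searching an e-commerce site for the keyword "平板电视" (flat-panel TV), they may also want to see tablets (平板电脑), LCD TVs (液晶电视) and similar products;
select * from tb_good where good_name like "%平板电视%";
The SQL statement above only returns products whose name contains the four characters "平板电视" as one contiguous string;
it cannot split the keyword into words the way natural language does, so the user misses richer results, for example products whose name contains either "电视" or "平板";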
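The difference between substring matching and tokenized matching can be sketched in a few lines of Python. The product names, the toy vocabulary and the greedy tokenizer below are all made up for illustration; real analyzers are far more sophisticated.

```python
# Hypothetical product names and a toy vocabulary (assumptions for illustration).
PRODUCTS = ["小米平板电视", "华为平板电脑", "海信液晶电视", "苹果手机"]
VOCABULARY = ["平板", "电视", "电脑", "液晶", "手机"]


def tokenize(text):
    """Greedy forward matching against the toy vocabulary; unknown characters are skipped."""
    tokens, i = [], 0
    while i < len(text):
        for word in VOCABULARY:
            if text.startswith(word, i):
                tokens.append(word)
                i += len(word)
                break
        else:
            i += 1  # character not covered by the vocabulary
    return tokens


def like_search(keyword):
    """Mimics LIKE '%keyword%': the whole keyword must appear as one substring."""
    return [p for p in PRODUCTS if keyword in p]


def tokenized_search(keyword):
    """Returns the products that share at least one token with the keyword."""
    wanted = set(tokenize(keyword))
    return [p for p in PRODUCTS if wanted & set(tokenize(p))]


print(like_search("平板电视"))       # only the flat-panel TV
print(tokenized_search("平板电视"))  # also the tablet and the LCD TV
```

The substring search returns one product, while the tokenized search also returns the items that share only "平板" or only "电视" with the keyword.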
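Elasticsearch (ES) is not a silver bullet, though: use it only for search, because setting up and maintaining ES adds complexity of its own;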
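I. Introduction to Elasticsearch
A distributed full-text search engine;
Elasticsearch is a distributed, high-performance, scalable search and analytics system built on Lucene, and it exposes a RESTful web API.
Compared with MySQL, Elasticsearch differs in the following ways:
- MySQL data operations are transactional, while Elasticsearch has no transactions
- MySQL supports foreign keys, while Elasticsearch does not
- MySQL uses B+ tree indexes, while Elasticsearch uses inverted indexes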
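1. Inverted index
One key reason Elasticsearch supports full-text search is that, when storing data, ES tokenizes it and builds an inverted index, a classic trade of space for time;
An inverted index is a data structure designed for full-text search.
Instead of looking up attribute values from a record, it looks up the records from an attribute value, hence the name inverted index.
It maps every word that occurs in the documents to the list of all documents containing that word.
Inverted indexes are widely used in text search and information retrieval, for example in search engines, site search and text classification.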
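As a minimal sketch of the data structure described above, the following Python builds a term-to-documents map. The three documents are made up, and a plain whitespace split stands in for a real analyzer.

```python
from collections import defaultdict

# Three toy documents keyed by document id (made-up data).
DOCS = {
    1: "elasticsearch is a search engine",
    2: "mysql is a relational database",
    3: "lucene powers elasticsearch search",
}


def build_inverted_index(docs):
    """Maps every term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # whitespace split stands in for an analyzer
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


index = build_inverted_index(DOCS)
print(index["elasticsearch"])  # [1, 3]
print(index["is"])             # [1, 2]
```

The extra memory spent on the term map is the "space" in the space-for-time trade: a lookup no longer has to read any document text.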
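2. Full-text search
Full-text search: the keywords entered by the user are tokenized as well, and the inverted index is used to quickly locate the documents in which those terms appear.
Put simply, it looks up the key from the value (given a keyword in the content, find the documents that contain it) rather than looking up the value from the key.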
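The lookup direction described above can be sketched as follows. The documents are made up, whitespace splitting stands in for a real analyzer, and ranking by the number of matching terms is a deliberate simplification rather than the scoring Elasticsearch actually uses.

```python
from collections import Counter, defaultdict

# Made-up product descriptions keyed by document id.
DOCS = {
    1: "flat panel television 55 inch",
    2: "tablet computer 10 inch",
    3: "lcd television wall mount",
}


def build_index(docs):
    """Term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Tokenizes the query, looks up each term, and ranks documents by matched-term count."""
    hits = Counter()
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # More matching terms first; ties broken by document id.
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


index = build_index(DOCS)
print(search(index, "television inch"))  # [1, 2, 3]
print(search(index, "mysql"))            # []
```

Note that `search` never reads the document text: it only touches the posting lists of the query terms.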
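3. Lucene
Lucene is an open-source full-text search library (a JAR) written in Java and maintained by the Apache Software Foundation; it started out as a subproject of the Jakarta project. It implements the requirements described above.
Lucene provides the inverted index, but how is massive data stored in a distributed way? How is high availability achieved? How are the cluster nodes managed? Those are the features Elasticsearch adds.
The familiar term ELK is an acronym formed from the initials of three open-source projects: Elasticsearch (full-text search) + Logstash (log collection) + Kibana (visualization).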
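4. Shards
A shard is the basic unit of data storage in Elasticsearch.
Elasticsearch is a distributed search engine: it lets one index be split into several parts, and those parts can be stored on different nodes.
This distributed storage lets Elasticsearch handle large volumes of data while keeping queries efficient.
A single shard can hold at most about 2 billion documents (a limit inherited from Lucene).
Once the shards of an index have been allocated, the number of primary shards can no longer be changed in place, because document routing depends on it.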
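Why routing pins the primary shard count can be seen in a small sketch. It uses the simplified rule shard = hash(routing key) % number of primary shards; `zlib.crc32` is only a stand-in for the hash Elasticsearch really uses, and the document ids are made up.

```python
import zlib


def route(routing_key, num_primary_shards):
    """Simplified routing rule: hash(routing key) modulo the number of primary shards.

    crc32 is a stand-in hash chosen for this sketch, not the one ES uses.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]  # hypothetical document ids

# If the shard count changed from 5 to 6, most ids would map to another shard,
# so documents written earlier could no longer be found where routing points.
moved = sum(1 for d in doc_ids if route(d, 5) != route(d, 6))
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Because the same formula is used for both writes and reads, changing the divisor after data has been written would send lookups to the wrong shard.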
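5. Shard replicas
A replica is a copy of a shard, used to provide high availability and failure recovery.
Each primary shard can have one or more replica shards.
If the node holding a primary shard fails, a replica shard can take over, which keeps the data highly available.
Replicas also let search requests be served by several nodes in parallel, which improves query performance.
By default, Elasticsearch 7.x creates 1 primary shard per index with 1 replica for each primary shard; versions before 7.0 defaulted to 5 primary shards.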
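6. Summary of shards and replicas
Shards and replicas play different roles in Elasticsearch:
- Shards distribute the stored data
- Replicas copy the data to provide high availability and query performance
Users can tune the number of shards and replicas to their needs in order to optimize storage and query performance.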
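As a rough illustration of that tuning, the sketch below builds the settings body passed at index creation and counts the resulting shard copies. `number_of_shards` and `number_of_replicas` are the real setting names; the helper functions and the values 3 and 1 are assumptions for the example.

```python
def index_settings(primary_shards, replicas_per_primary):
    """Builds the settings body that is sent when an index is created."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_primary,
        }
    }


def total_shard_copies(primary_shards, replicas_per_primary):
    """Every primary shard exists once, plus once per replica."""
    return primary_shards * (1 + replicas_per_primary)


print(index_settings(3, 1))
print(total_shard_copies(3, 1))  # 6
```

The replica count can be changed on a live index, whereas the primary shard count is fixed at creation time as noted above.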
7. The underlying ES write path
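II. Installing Elasticsearch
Install Elasticsearch and the Kibana client with Docker;
1. Install Elasticsearch
1.1. Pull the Elasticsearch 7.10.1 image
Note that the ES version must match the version of the client dependencies;
docker pull elasticsearch:7.10.1
1.2. Create the Elasticsearch configuration and data directories
mkdir -p /mydata/elasticsearch/{config,data,plugins}
1.3. Set the listen address in the configuration file
echo "http.host: 0.0.0.0" > /mydata/elasticsearch/config/elasticsearch.yml
1.4. Set directory permissions
chmod -R 775 /mydata/elasticsearch/
1.5. Start the container with port and directory mappings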
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms256m -Xmx1024m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.10.1
1.6. Check that it started successfully
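One way to perform this check is to request the root endpoint of the node, which answers with a JSON document that includes the version number. The sketch below is an assumption about how one might script that; `es_version` and its `fetch` parameter are hypothetical helpers, and `http://localhost:9200` assumes the port mapping from the `docker run` command above.

```python
import json
from urllib.request import urlopen


def es_version(base_url, fetch=None):
    """Returns the version reported by the root endpoint, or None if the node is unreachable.

    `fetch` is an optional callable (url -> bytes) so the function can be tried without a live node.
    """
    fetch = fetch or (lambda url: urlopen(url, timeout=5).read())
    try:
        info = json.loads(fetch(base_url))
    except (OSError, ValueError):  # connection errors or a non-JSON answer
        return None
    return info.get("version", {}).get("number")


# Demonstration with a canned response so the sketch runs without a live node.
canned = lambda url: b'{"version": {"number": "7.10.1"}}'
print(es_version("http://localhost:9200", fetch=canned))  # 7.10.1
```

Against a running container, calling `es_version("http://localhost:9200")` without `fetch` performs the real HTTP request.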
2. Install Kibana
Kibana is a client for Elasticsearch that lets you perform all kinds of Elasticsearch operations through a visual interface.
2.1. Pull the Kibana 7.10.1 image
docker pull kibana:7.10.1
2.2. Create the configuration directory
mkdir -p /mydata/kibana/config
cd /mydata/kibana/config
touch kibana.yml
2.3. Create and edit the configuration file
## ** THIS IS AN AUTO-GENERATED FILE **
## Default Kibana configuration for docker target
## Change the elasticsearch address to match your own IP
server.name: kibana
server.host: "0"
elasticsearch.hosts: [ "http://192.168.56.18:9200" ]
xpack.monitoring.ui.container.elasticsearch.enabled: true
# Enable the Chinese locale
i18n.locale: "zh-CN"
2.4. Start the container
docker run -d \
--name=kibana \
--restart=always \
-p 5601:5601 \
-v /mydata/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml \
kibana:7.10.1
2.5. Access Kibana in the browser
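3. Install Portainer for Docker
After a configuration change, a Docker container can easily break and fail to start;
a graphical management tool then helps: it lets you operate containers visually and read their logs to locate the problem quickly;
3.1. Pull the portainer image
docker pull portainer/portainer
3.2. Start the portainer container
Portainer is itself a container, one that is used to manage other containers;
# Create a volume to store Portainer's data
docker volume create portainer_data
# Create and start the container; --restart=always restarts it automatically at boot
docker run -d -p 9000:9000 --name=portainer --restart=always \
-v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer
3.3. Manage containers
Finally, make sure the three containers (elasticsearch, kibana and portainer) are all running;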
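III. Using Elasticsearch
Working with ES involves 5 core concepts:
- Index
- Mapping
- Field
- Document
- Inverted index
Older versions of Elasticsearch had one more concept, Type, used to categorize data; ES 7 deprecates it in favor of typeless APIs and ES 8 removes it;
Elasticsearch | MySQL
Index | Table
Mapping | Table schema
Field | Column
Document | Row
1. Index
An index is comparable to a table in a relational database; one index contains a number of documents;
1.1. Create an index
Create the index first, then specify its mapping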
# Create the index first
PUT student
# Then add the mapping to the index
PUT student/_mapping
{
"properties":{
"username":{
"type":"text"
},
"age":{"type":"integer"},
"birthday":{"type":"date","format":"yyyy-MM-dd" }
}
}
Create the index and specify the mapping in one request
PUT student
{
"mappings":{
"properties":{
"username":{"type":"text"},
"age":{"type":"integer"},
"birthday":{"type":"date","format":"yyyy-MM-dd" }
}
}
}
1.2. Query an index
GET person
1.3. Query multiple indices
PUT person1
GET person,person1
1.4. Query information about all indices
GET _all
1.5. Delete an index
DELETE person1
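2. Aliases
Because of how the inverted index works, ES cannot delete a field or change the type of an existing field;
so we usually expose an alias of the index to callers rather than the real index name;
If a wrong mapping was specified when the index was created or when a field was added to the mapping, there are 2 ways to fix it;
2.1. The index is not yet in use
If no data has been imported into the index, delete it and create it again with the correct mapping;
2.2. The index is already in use
- Create a new index with the correct mapping
- Import the data from the old, incorrect index into the new one
- Point the alias at the new, correct index;
# Simulate the mistake: create an index student1 whose birthday field is wrongly mapped as text
PUT student1
PUT student1/_mapping
{
"properties":{
"birthday":{"type":"text"}
}
}
# Create the alias student -> student1 (an alias cannot share its name with an existing index, so the student index created earlier must be deleted first)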
POST _aliases
{
"actions": [
{
"add": {
"index": "student1",
"alias": "student"
}
}
]
}
# The mapping turns out to be wrong: the birthday field should be of type date
GET student1
# Fix it: create a new index student2 with the correct mapping, where birthday is of type date
PUT student2
PUT student2/_mapping
{
"properties":{
"birthday":{"type":"date","format":"yyyy-MM-dd" }
}
}
# Remove the old alias student -> student1
POST _aliases
{
"actions": [
{
"remove": {
"index": "student1",
"alias": "student"
}
}
]
}
# Create the new alias student -> student2
POST _aliases
{
"actions": [
{
"add": {
"index": "student2",
"alias": "student"
}
}
]
}
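# Check the new index student2 and confirm that its mapping is correct
GET student2
# Copy the existing data from student1 into student2 (for example with a Java program or the _reindex API) before deleting the old index
# Delete student1
DELETE student1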
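The walkthrough above removes the old alias and adds the new one in two separate requests, which leaves a short window in which the alias points nowhere. The `_aliases` endpoint accepts several actions in one request and applies them atomically, so the switch can be done in a single call. The helper below is a hypothetical sketch that only builds the request body.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST _aliases that moves an alias from old_index to new_index in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_actions("student", "student1", "student2")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `POST _aliases` in the Kibana console.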
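3. Field
A field is comparable to a column in a database table;
After an index is created, its fields need to be defined: the field names, their data types and so on. This process is called mapping;
Fields in ES support the following data types
3.1. Strings
- text: analyzed into terms; with a suitable Chinese analyzer, 华为手机 (Huawei phone) is split into 华为 and 手机. Each resulting word is called a term
- keyword: not analyzed; 华为手机 yields a single term, 华为手机.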
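The practical consequence of the two string types can be sketched as follows. The lowercase-and-split function is only a stand-in for a real analyzer, and the title is made up; the point is which single term matches each kind of field.

```python
def analyze_text(value):
    """Stand-in for a text field: the value is broken into lowercase terms."""
    return value.lower().split()


def analyze_keyword(value):
    """A keyword field keeps the whole value as one term."""
    return [value]


def term_matches(terms, query_term):
    """A term-level lookup succeeds only if the exact term was indexed."""
    return query_term in terms


title = "Huawei Phone"  # made-up field value

print(term_matches(analyze_text(title), "phone"))            # True
print(term_matches(analyze_keyword(title), "phone"))         # False
print(term_matches(analyze_keyword(title), "Huawei Phone"))  # True
```

This is why text suits free-form search, while keyword suits exact filtering, sorting and aggregation on whole values.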
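3.2. Numeric
- long: signed 64-bit integer
- integer: signed 32-bit integer
- short: signed 16-bit integer
- byte: signed 8-bit integer
- double: double-precision 64-bit floating point
- float: single-precision 32-bit floating point
- half_float: half-precision 16-bit floating point
3.3. Boolean
- boolean
3.4. Binary
- binary
3.5. Date
- date
3.6. Range types
- integer_range
- float_range
- long_range
- double_range
- date_range
3.7. Arrays
3.8. Objects
4. Document
The smallest unit of data in ES; a document represents one record in an index and is usually expressed as JSON;
4.1. Add a document
4.1.1. Add a document with a manually assigned id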
POST person/_doc/1
{
"name":"张三",
"age":18,
"address":"北京"
}
4.1.2. Add a document with an auto-generated id
POST person/_doc
{
"name":"李四",
"age":20,
"address":"北京"
}
4.2. Query documents
4.2.1. Get a document by id
GET person/_doc/1
4.2.2. Query all documents
GET person/_search
4.2.3. Query with conditions
GET logstash-2022.12.12/_search
{
"_source": {
"includes":["@timestamp","message","stream"]
},
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
],
"from": 1,
"size": 3
}
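In the query above, `from` is a zero-based offset into the sorted hits and `size` is the number of hits to return, so `"from": 1` skips the most recent entry. A small, hypothetical helper that derives both values from a 1-based page number might look like this; the field names are taken from the query above.

```python
def page_query(page, size, fields):
    """Builds a search body for a 1-based page number, newest entries first."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "_source": {"includes": list(fields)},
        "sort": [{"@timestamp": {"order": "desc"}}],
        "from": (page - 1) * size,  # zero-based offset of the first hit on this page
        "size": size,
    }


first = page_query(1, 3, ["@timestamp", "message", "stream"])
third = page_query(3, 3, ["@timestamp", "message", "stream"])
print(first["from"], third["from"])  # 0 6
```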
4.3. Update a document
PUT person/_doc/1
{
"name": "张三丰",
"age": 180,
"address": "武当山"
}
4.4. Delete a document
DELETE person/_doc/1
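IV. Using Elasticsearch from Spring Boot
Working with ES through RestHighLevelClient follows this flow:
- First, build the query request
- Then, process the result set in the ES response
1. Integrating Elasticsearch with Spring Boot
1.1. Create a Maven project and add the dependencies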
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
</dependency>
<!-- Elasticsearch dependencies -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.10.1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>7.10.1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.10.1</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
pom.xml
1.2.application.yml
elasticsearch:
host: 192.168.56.18
port: 9200
1.3. Create the startup class EsApplication
package com.zhanggen.es.demo;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class EsApplication {
public static void main(String[] args) {
SpringApplication.run(EsApplication.class,args);
}
}
EsApplication.java
1.4. Create the ES configuration class
package com.zhanggen.es.demo.config;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
@ConfigurationProperties(prefix="elasticsearch")
public class ElasticSearchConfig {
private String host;
private int port;
public String getHost() {
return host;
}
public void setHost(String host) {
this.host = host;
}
public int getPort() {
return port;
}
public void setPort(int port) {
this.port = port;
}
@Bean
public RestHighLevelClient restHighLevelClient(){
RestClientBuilder builder = RestClient.builder(new HttpHost(host, port, "http"));
builder.setRequestConfigCallback(requestConfigBuilder ->{
requestConfigBuilder.setConnectionRequestTimeout(500000);
requestConfigBuilder.setSocketTimeout(500000);
requestConfigBuilder.setConnectTimeout(500000);
return requestConfigBuilder;
});
return new RestHighLevelClient(builder);
}
}
ElasticSearchConfig.java
1.5. Create the test class ESTest
package com.zhanggen.es.demo;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
@SpringBootTest
@RunWith(SpringRunner.class)
public class ESTest {
@Autowired
private RestHighLevelClient restHighLevelClient;
}
ESTest.java
2. Index operations
Create, query and delete indices in ES
package com.itheima.es.demo;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.IndicesClient;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;
import org.elasticsearch.cluster.metadata.MappingMetadata;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import java.io.IOException;
import java.util.Map;
@SpringBootTest
@RunWith(SpringRunner.class)
public class ESTest {
//Inject the client used to talk to Elasticsearch
@Autowired
private RestHighLevelClient restHighLevelClient;
//Test creating an index
@Test
public void createIndexTest() throws IOException {
IndicesClient indicesClient = restHighLevelClient.indices();
//Set the index name
CreateIndexRequest createIndexRequest=new CreateIndexRequest("student");
//Define the mapping (structure) of the index
String mappingInfo = "{\n" +
" \"properties\":{\n" +
" \"name\":{\n" +
" \"type\":\"keyword\"\n" +
" },\n" +
" \"age\":{\n" +
" \"type\":\"integer\"\n" +
" },\n" +
" \"address\":{\n" +
" \"type\":\"text\"\n" +
" }\n" +
" }\n" +
"}";
createIndexRequest.mapping(mappingInfo, XContentType.JSON);
CreateIndexResponse response = indicesClient.create(createIndexRequest, RequestOptions.DEFAULT);
System.out.println(response.isAcknowledged());
}
//Test querying an index
@Test
public void findIndexTest() throws IOException {
IndicesClient indicesClient = restHighLevelClient.indices();
GetIndexRequest getIndexRequest = new GetIndexRequest("student");
GetIndexResponse response = indicesClient.get(getIndexRequest, RequestOptions.DEFAULT);
Map<String, MappingMetadata> mappings = response.getMappings();
for (String key : mappings.keySet()) {
System.out.println(key+"==="+mappings.get(key).getSourceAsMap());
}
}
//Test deleting an index
@Test
public void delIndex() throws IOException {
IndicesClient indicesClient = restHighLevelClient.indices();
DeleteIndexRequest delIndexRequest = new DeleteIndexRequest("student");
AcknowledgedResponse response = indicesClient.delete(delIndexRequest, RequestOptions.DEFAULT);
System.out.println(response.isAcknowledged());
}
}
ESTest.java
3. Document operations
package com.itheima.es.demo;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.IndicesClient;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;
import org.elasticsearch.cluster.metadata.MappingMetadata;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
@SpringBootTest
@RunWith(SpringRunner.class)
public class ESTestDocument {
//Inject the client used to talk to Elasticsearch
@Autowired
private RestHighLevelClient restHighLevelClient;
//Test creating a document in the student index from a JSON string
@Test
public void addDoc() throws IOException {
IndexRequest indexRequest = new IndexRequest("student").id("1");
indexRequest.source("{\n" +
" \"name\":\"张根\",\n" +
" \"age\":18,\n" +
" \"address\":\"河北\"\n" +
"}", XContentType.JSON);
IndexResponse response = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
System.out.println(response.status());
}
//Test creating a document in the student index, variant 2: serialize a Map
@Test
public void addDoc1() throws IOException {
IndexRequest indexRequest = new IndexRequest("student").id("1");
//In essence, the request body simply carries a JSON document
HashMap<String, Object> map = new HashMap<>();
map.put("name","张根");
map.put("age",18);
map.put("address","河北");
indexRequest.source(JSON.toJSONString(map),XContentType.JSON);
IndexResponse response = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
System.out.println(response.status());
}
//Test getting a document by id
@Test
public void findDocTest() throws IOException {
GetRequest getRequest = new GetRequest("student","1");
GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
String str = response.getSourceAsString();
Map<String, Object> map = response.getSourceAsMap();
System.out.println(map);
}
//Test deleting a document
@Test
public void delDocTest() throws IOException {
DeleteRequest deleteRequest = new DeleteRequest("student", "1");
DeleteResponse response = restHighLevelClient.delete(deleteRequest,RequestOptions.DEFAULT);
System.out.println(response.status());
}
}
ESTestDocument.java
4. Bulk import from MySQL into ES
package com.zhanggen.es.service.impl;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.baomidou.mybatisplus.core.conditions.query.LambdaQueryWrapper;
import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.itheima.es.entity.HotelEntity;
import com.itheima.es.mapper.HotelMapper;
import com.itheima.es.service.HotelService;
import org.apache.lucene.search.TotalHits;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.io.IOException;
import java.util.*;
@Service
public class HotelServiceImpl implements HotelService {
@Autowired
private HotelMapper hotelMapper;
@Autowired
private RestHighLevelClient restHighLevelClient;
//Bulk import
@Override
public int addDocToES() {
int esTotal = 0;
Long currentPage = 1L;
Page<HotelEntity> page = new Page<>(currentPage, 200);
LambdaQueryWrapper<HotelEntity> queryWrapper = new LambdaQueryWrapper<>();
Integer integer = hotelMapper.selectCount(queryWrapper);
Page<HotelEntity> hotelEntityPage = hotelMapper.selectPage(page, queryWrapper);
//First work out how many pages the database holds
long totalpage = hotelEntityPage.getPages();
for (currentPage = 1L; currentPage <= totalpage; currentPage++) {
//Load one page of records from MySQL
queryWrapper = new LambdaQueryWrapper<>();
hotelEntityPage = hotelMapper.selectPage(page.setCurrent(currentPage), queryWrapper);
//ES bulk API: a BulkRequest collects the individual requests
BulkRequest bulkRequest = new BulkRequest();
for (HotelEntity hotelEntity : hotelEntityPage.getRecords()) {
String data = JSON.toJSONStringWithDateFormat(hotelEntity, "yyyy-MM-dd", SerializerFeature.WriteDateUseDateFormat);
IndexRequest indexRequest = new IndexRequest("hotel").source(data, XContentType.JSON);
bulkRequest.add(indexRequest);
}
try {
BulkResponse response = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
e.printStackTrace();
}
esTotal += hotelEntityPage.getRecords().size();
}
return esTotal;
}
//Query all documents
@Override
public Map<String, Object> matchAllQuery() {
//1. Build the query
SearchRequest hotelSearch = new SearchRequest("hotel");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
searchSourceBuilder.query(queryBuilder);
hotelSearch.source(searchSourceBuilder);
//The result to return
Map<String, Object> map = new HashMap<>();
try {
SearchResponse searchResponse = restHighLevelClient.search(hotelSearch, RequestOptions.DEFAULT);
SearchHits searchResponseHits = searchResponse.getHits();
//Total number of hits
long totalHits = searchResponseHits.getTotalHits().value;
List<HotelEntity> list = new ArrayList<>();
SearchHit[] searchHits = searchResponseHits.getHits();
if (searchHits != null && searchHits.length > 0) {
for (SearchHit searchHit : searchHits) {
String sourceAsString = searchHit.getSourceAsString();
list.add(JSON.parseObject(sourceAsString, HotelEntity.class));
}
}
map.put("list", list);
map.put("totalResultSize", totalHits);
} catch (IOException e) {
e.printStackTrace();
}
return map;
}
//Paged query
@Override
public Map<String, Object> pageQuery(int current, int size) {
SearchRequest searchRequest = new SearchRequest("hotel");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
searchSourceBuilder.query(queryBuilder);
//Set pagination: from is a zero-based offset
searchSourceBuilder.from((current - 1) * size);
searchSourceBuilder.size(size);
searchRequest.source(searchSourceBuilder);
Map<String, Object> resultMap = new HashMap<>();
try {
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHits hits = searchResponse.getHits();
long totalHits = hits.getTotalHits().value;
SearchHit[] searchHits = hits.getHits();
List<HotelEntity> list = new ArrayList<>();
for (SearchHit searchHit : searchHits) {
String sourceAsString = searchHit.getSourceAsString();
list.add(JSON.parseObject(sourceAsString, HotelEntity.class));
}
resultMap.put("list", list);
resultMap.put("totalResultSize", totalHits);
resultMap.put("current", current);
//Compute the total number of pages (rounded up)
resultMap.put("totalPage", (totalHits + size - 1) / size);
} catch (IOException e) {
e.printStackTrace();
}
return resultMap;
}
}
HotelServiceImpl.java
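Behind `BulkRequest`, the bulk endpoint receives a newline-delimited JSON body: one action line followed by one source line per document, terminated by a final newline. The sketch below shows that shape together with the batching idea used above; the hotel records and helper names are made up, and it only builds the payload without sending it.

```python
import json


def chunked(records, batch_size):
    """Yields consecutive batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def bulk_body(index_name, batch):
    """Builds the NDJSON payload for the _bulk endpoint (action line + source line per document)."""
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the body must end with a newline


hotels = [{"name": f"hotel-{i}"} for i in range(5)]  # made-up records
for batch in chunked(hotels, 2):
    print(bulk_body("hotel", batch), end="")
```

Sending one request per batch rather than per document is what makes the import fast; the batch size of 200 used in the Java code is a tuning choice, not a fixed rule.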
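5. Migration failures
If the data migration fails, the cause is most likely a problem with the mapping defined in ES; the error details can be printed from the bulk response;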
package com.hmall.search.feign;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.hmall.client.ItemClient;
import com.hmall.common.dto.Item;
import com.hmall.common.dto.PageDTO;
import com.hmall.search.domain.ItemDoc;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.BeanUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import java.io.IOException;
@RunWith(SpringRunner.class)
@SpringBootTest
public class FeignTest {
@Autowired
private RestHighLevelClient restHighLevelClient;
@Autowired
private ItemClient itemClient;
@Test
public void testFindItem() {
//Number of documents per batch
Integer size = 1000;
PageDTO<Item> itemPageDTO = itemClient.queryItemByPage(1, 0);
Long total = itemPageDTO.getTotal();
Long totalPage = total % size == 0 ? total / size : total / size + 1;
for (Long currentPage = 1L; currentPage <= totalPage; currentPage++) {
System.out.println("Page " + currentPage);
itemPageDTO = itemClient.queryItemByPage(currentPage.intValue(), size);
//Bulk import
//ES bulk API: a BulkRequest collects the individual requests
BulkRequest bulkRequest = new BulkRequest();
for (Item item : itemPageDTO.getList()) {
ItemDoc itemDoc = new ItemDoc();
BeanUtils.copyProperties(item, itemDoc);
String data = JSON.toJSONStringWithDateFormat(itemDoc, "yyyy-MM-dd", SerializerFeature.WriteDateUseDateFormat);
IndexRequest indexRequest = new IndexRequest("item").source(data, XContentType.JSON);
bulkRequest.add(indexRequest);
}
try {
BulkResponse bulkItemResponses = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
if (bulkItemResponses.hasFailures()) {
BulkItemResponse[] itemResponse = bulkItemResponses.getItems();
for (BulkItemResponse response : itemResponse) {
if (response.isFailed()) {
System.out.println("=======" + response.getFailureMessage() + "=============");
}
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
System.out.println(itemPageDTO.getList());
}
@Test
public void testFindItem1() {
}
}
FeignTest.java
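At the REST level, a bulk response carries a top-level `errors` flag and one entry per item; failed items contain an `error` object with a `type` and a `reason`. As a rough sketch of the same check as `hasFailures()` above, the function below collects the failed items. The sample response is hand-written for illustration and is not output from a real cluster.

```python
def bulk_failures(response):
    """Collects (id, error type, reason) for every failed item of a bulk response."""
    failures = []
    if not response.get("errors"):
        return failures
    for item in response.get("items", []):
        # Each item has a single key naming the action: index, create, update or delete.
        for result in item.values():
            error = result.get("error")
            if error:
                failures.append((result.get("_id"), error.get("type"), error.get("reason")))
    return failures


# Hand-written sample response (an assumption for illustration).
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"index": {"_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse field [price]"}}},
    ],
}
print(bulk_failures(sample))
```

A `mapper_parsing_exception` like the one in the sample is the typical symptom of a document that does not fit the index mapping.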
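V. Using Elasticsearch from Go
Go can connect to and operate ES through the third-party library https://github.com/olivere/elastic; Python uses the elasticsearch package instead.
Note that the major version of the API client must match your ES version.
For example, the ES used here is version 7.8.0, so the client to download is the matching github.com/olivere/elastic/v7.
1. Add the dependency
Use go.mod to manage dependencies and download the specified version of the third-party library: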
module go相关模块/elasticsearch
go 1.13
require github.com/olivere/elastic/v7 v7.0.4
2. Code
package main
import (
"context"
"fmt"
"github.com/olivere/elastic/v7"
)
// Elasticsearch demo
type Person struct {
Name string `json:"name"`
Age int `json:"age"`
Married bool `json:"married"`
}
func main() {
client, err := elastic.NewClient(elastic.SetURL("http://192.168.56.135:9200/"))
if err != nil {
// Handle error
panic(err)
}
fmt.Println("connect to es success")
p1 := Person{Name: "曹操", Age: 155, Married: true}
put1, err := client.Index().
Index("students").
BodyJson(p1).
Do(context.Background())
if err != nil {
// Handle error
panic(err)
}
fmt.Printf("Indexed user %s to index %s, type %s\n", put1.Id, put1.Index, put1.Type)
}
Python
"""
pip install elasticsearch==7.8.0
http://10.110.158.162:10072/ ES address
http://10.110.158.162:10937/ Kibana address
"""
from elasticsearch import Elasticsearch

client = Elasticsearch(hosts="http://10.110.158.162:10072")
query = {
    "_source": {
        "includes": ["@timestamp", "message", "stream"]
    },
    "sort": [
        {
            "@timestamp": {
                "order": "desc"
            }
        }
    ],
    "from": 1,  # zero-based offset: 1 skips the newest hit
    "size": 3
}
all_docs = client.search(index="k8s-2022.12.26", body=query)
# Print the _source of every returned hit
for row in all_docs["hits"]["hits"]:
    print(row["_source"])
es.py
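VI. Querying logs
# List all indices currently in ES
GET _cat/indices
# Show the field types defined in the k8s-2022.12.26 index
GET k8s-2022.12.26/_mapping
# Get one document by id
GET k8s-2022.12.26/_doc/Q3YKTYUBy4Ru3dTnzJs0
# Query the 300 most recent logs of the nginx container in the k8s-2022.12.26 index
GET k8s-2022.12.26/_search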
{
"query":{
"match": {"kubernetes.container_name":"nginx"}
},
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
],
"from": 0,
"size": 300
}
# match query on the text field tag: the 300 most recent logs with tag=linux-messages
GET k8s-2022.12.26/_search
{
"query": {
"match": {"tag":"linux-messages"}
},
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
],
"from": 0,
"size": 300
}
# Query logs with tag=linux-messages and host_ip=10.110.158.162, returning only selected _source fields
GET k8s-2022.12.26/_search
{
"_source": {
"includes":["@timestamp","host_ip"]
},
"query": {
"bool":{
"must":[
{"match":{"tag":"linux-messages"}},
{"match":{"host_ip":"10.110.158.162"}}
]
}},
"from": 0,
"size": 20
}
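When such queries are issued from a script, the `bool`/`must` body can be assembled from a dictionary of field conditions. The helper below is a hypothetical sketch that builds one `match` clause per condition, all of which must hold; the resulting dictionary has the same shape as the request body above and could be passed as `body` to the Python client's `search` call.

```python
def bool_must_query(conditions, source_fields, size=20):
    """Builds a search body in which every field condition must match."""
    return {
        "_source": {"includes": list(source_fields)},
        "query": {
            "bool": {
                "must": [{"match": {field: value}} for field, value in conditions.items()]
            }
        },
        "from": 0,
        "size": size,
    }


body = bool_must_query(
    {"tag": "linux-messages", "host_ip": "10.110.158.162"},
    ["@timestamp", "host_ip"],
)
print(body["query"]["bool"]["must"])
```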