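Preface

  Elasticsearch is an open-source, distributed, RESTful search and analytics engine built on top of the open-source Apache Lucene library.
  Lucene is arguably the most advanced, high-performance, full-featured search engine library available today, open source or proprietary, but it is only a library. To make full use of it you have to work in Java and integrate Lucene directly into your application. Worse, Lucene is complex enough that you may feel you need a degree in information retrieval to understand how it works.
  Elasticsearch was created to hide that complexity. It is written in Java and uses Lucene internally for indexing and searching, but its goal is to make full-text search simple: in short, it wraps Lucene and exposes a simple, consistent RESTful API for storing and retrieving data.
Features:

  • A distributed real-time document store in which every field can be indexed and searched;
  • A distributed real-time analytics search engine;
  • Scales to hundreds of server nodes and handles petabytes of structured or unstructured data.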
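
To make the "simple, consistent RESTful API" concrete: a search is just an HTTP request to `/<index>/_search` with a JSON body. The sketch below only builds that path and body as strings with the JDK; nothing is sent over the network. The class and method names are illustrative (they are not part of Elasticsearch), and the index name `es_jd` and field `title` are borrowed from the code later in this post.

```java
// Illustrative helper, not an Elasticsearch class: it builds the path and JSON
// body of a paged term search so the shape of the REST API is visible.
final class EsRestSketch {

    // Search endpoint for one index, e.g. /es_jd/_search
    static String searchPath(String index) {
        return "/" + index + "/_search";
    }

    // Minimal JSON string escaping: backslash and double quote only.
    // Assumption: keywords contain no control characters.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Body of a paged term query: "from" is the offset of the first hit,
    // "size" is the number of hits to return.
    static String termSearchBody(String field, String keyword, int from, int size) {
        return "{\"from\":" + from + ",\"size\":" + size
                + ",\"query\":{\"term\":{\"" + escape(field) + "\":\""
                + escape(keyword) + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("POST " + searchPath("es_jd"));
        System.out.println(termSearchBody("title", "java", 0, 10));
    }
}
```

The client code later in this post produces an equivalent request through `SearchSourceBuilder`, so you rarely write this JSON by hand, but it helps when reading Elasticsearch logs or trying queries out with curl.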

1. Integrating Elasticsearch 7.6.0 with Spring Boot

pom.xml

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>7.6.0</version>
</dependency>

<!-- jsoup: HTML parsing for Java, used by the crawler below -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>

<!-- Note: the service below calls JSON.toJSONString, which appears to be Alibaba fastjson;
     that dependency is not declared here and must also be on the classpath. -->

ElasticsearchConfig

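import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @Author: Lanys
 * @Description: Exposes a RestHighLevelClient bean for a local Elasticsearch node.
 * @Date: Created at 22:07 2021/7/27
 */
@Configuration
public class ElasticsearchConfig {

    // Single-node client talking to Elasticsearch at 127.0.0.1:9200 over plain HTTP.
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
    }
}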

EsJdController

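import java.util.List;
import java.util.Map;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

/**
 * @author lanys
 * @Description: REST endpoints for crawling JD listings into Elasticsearch and searching them.
 * @date 28/7/2021 9:24 AM
 */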
@RestController
public class EsJdController {

    @Autowired
    private EsJdService esJdService;

    @GetMapping("/parse/{keyword}")
    public boolean parse(@PathVariable("keyword") String keyword) throws Exception {
        return esJdService.parseContent(keyword);
    }

    @GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageNo") Integer pageNo,
                                            @PathVariable("pageSize") Integer pageSize) throws Exception {
        return esJdService.searchPage(keyword, pageNo, pageSize);
    }

    // "gaoliang" is pinyin for "highlight": same search, with matches wrapped in highlight tags.
    @GetMapping("/gaoliang/search/{keyword}/{pageNo}/{pageSize}")
    public List<Map<String, Object>> gaoLiangSearch(@PathVariable("keyword") String keyword,
                                                    @PathVariable("pageNo") Integer pageNo,
                                                    @PathVariable("pageSize") Integer pageSize) throws Exception {
        return esJdService.searchGaoLiangPage(keyword, pageNo, pageSize);
    }

}

EsJdService

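// Assumption: JSON is Alibaba fastjson; the original listing does not show its imports.
import com.alibaba.fastjson.JSON;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

/**
 * @author lanys
 * @Description: Crawls JD search results into the es_jd index and runs paged searches against it.
 * @date 28/7/2021 9:26 AM
 */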
@Service
public class EsJdService {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public boolean parseContent(String keywords) throws Exception {
        // Crawl the JD search results for the keyword.
        List<Content> contents = JsoupUtils.parseJD(keywords);
        if (contents.isEmpty()) {
            // An empty bulk request fails validation, so there is nothing to send.
            return false;
        }

        // Index all crawled items with a single bulk request.
        BulkRequest req = new BulkRequest();
        req.timeout("3m");
        for (Content content : contents) {
            req.add(new IndexRequest("es_jd")
                    .source(JSON.toJSONString(content), XContentType.JSON));
        }
        BulkResponse resp = restHighLevelClient.bulk(req, RequestOptions.DEFAULT);
        return !resp.hasFailures();
    }


    public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws Exception {
        if (pageNo < 1) {
            pageNo = 1;
        }

        SearchRequest searchRequest = new SearchRequest("es_jd");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

        // from() takes a zero-based document offset, not a page number.
        sourceBuilder.from((pageNo - 1) * pageSize);
        sourceBuilder.size(pageSize);

        // Term queries are not analyzed: the keyword must equal an indexed token of "title".
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));

        searchRequest.source(sourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Return each hit's _source as a plain map.
        List<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit hit : searchResponse.getHits().getHits()) {
            list.add(hit.getSourceAsMap());
        }
        return list;
    }

    public List<Map<String, Object>> searchGaoLiangPage(String keyword, int pageNo, int pageSize) throws Exception {
        if (pageNo < 1) {
            pageNo = 1;
        }

        SearchRequest searchRequest = new SearchRequest("es_jd");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

        // Highlight matches in the title field by wrapping them in a red <span>.
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("title");
        highlightBuilder.requireFieldMatch(false);
        highlightBuilder.preTags("<span style='color:red'>");
        highlightBuilder.postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);

        // from() takes a zero-based document offset, not a page number.
        sourceBuilder.from((pageNo - 1) * pageSize);
        sourceBuilder.size(pageSize);

        // Term queries are not analyzed: the keyword must equal an indexed token of "title".
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));

        searchRequest.source(sourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        List<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit hit : searchResponse.getHits().getHits()) {
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            HighlightField title = highlightFields.get("title");
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            if (title != null) {
                // Replace the stored title with the concatenated highlighted fragments.
                StringBuilder highlighted = new StringBuilder();
                for (Text fragment : title.fragments()) {
                    highlighted.append(fragment);
                }
                sourceAsMap.put("title", highlighted.toString());
            }
            list.add(sourceAsMap);
        }
        return list;
    }
}
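
One detail of the paging code is easy to get wrong: Elasticsearch's `from` is a zero-based offset of the first hit to return, not a page number, so a 1-based `pageNo` has to be converted before it is passed to `sourceBuilder.from(...)`. A minimal sketch of that conversion follows; the class and method names are made up for illustration and do not come from the original project.

```java
// Hypothetical helper: converts a 1-based page number into the zero-based
// offset that Elasticsearch expects in "from".
final class PageMath {

    static int offset(int pageNo, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        // Page numbers below 1 are treated as page 1, as in the service above.
        int page = Math.max(pageNo, 1);
        return (page - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(offset(1, 10)); // first page starts at the first hit
        System.out.println(offset(3, 10)); // third page skips two full pages
    }
}
```

Keep in mind that with default index settings `from + size` may not exceed 10,000 (`index.max_result_window`), so this style of paging is only suitable for the first pages of a result set.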

JsoupUtils (a simple web crawler in Java)

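import java.net.URL;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.stereotype.Component;

/**
 * @author lanys
 * @Description: Scrapes product title, price and image URL from a JD search results page.
 * @date 28/7/2021 9:28 AM
 */
@Component
public class JsoupUtils {
    public static List<Content> parseJD(String keywords) throws Exception {
        // URL-encode the keyword so spaces and non-ASCII characters survive the request.
        String url = "https://search.jd.com/Search?keyword=" + URLEncoder.encode(keywords, "UTF-8");
        // Fetch and parse the page with a 30-second timeout.
        Document document = Jsoup.parse(new URL(url), 30000);

        List<Content> goodsList = new ArrayList<>();
        Element element = document.getElementById("J_goodsList");
        if (element == null) {
            // The page has no result list (for example, its layout changed); nothing to index.
            return goodsList;
        }

        // Each <li> in the result list is one product.
        Elements elements = element.getElementsByTag("li");
        for (Element el : elements) {
            // Images are lazy-loaded, so the real URL is in data-lazy-img rather than src.
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img");
            String title = el.getElementsByClass("p-name").eq(0).text();
            String price = el.getElementsByClass("p-price").eq(0).text();
            goodsList.add(new Content(title, price, img));
        }
        return goodsList;
    }
}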
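
The `Content` class is used by the crawler and the service but is never shown. A minimal sketch follows, assuming it is a plain bean with three string fields; the field names and their order are inferred from the call `new Content(title, price, img)` and from the `result.title`, `result.price` and `result.img` references in index.html. The original class may well look different (it might use Lombok, for instance).

```java
// Assumed shape of the Content bean; in a real project this would normally be
// a public class in its own file.
class Content {
    private String title;
    private String price;
    private String img;

    // No-argument constructor, commonly required by JSON libraries for deserialization.
    Content() {
    }

    Content(String title, String price, String img) {
        this.title = title;
        this.price = price;
        this.img = img;
    }

    // Getters are what a JSON serializer typically reads when writing the document.
    public String getTitle() { return title; }
    public String getPrice() { return price; }
    public String getImg() { return img; }

    public void setTitle(String title) { this.title = title; }
    public void setPrice(String price) { this.price = price; }
    public void setImg(String img) { this.img = img; }

    public static void main(String[] args) {
        Content c = new Content("Java book", "59.00", "//img.example/a.jpg");
        System.out.println(c.getTitle() + " / " + c.getPrice() + " / " + c.getImg());
    }
}
```

Since the price is scraped as display text, keeping it as a `String` mirrors what the crawler actually produces; converting it to a number would be a separate step.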

IndexController

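import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;

/**
 * @author lanys
 * @Description: Serves the search page (the index.html template).
 * @date 28/7/2021 9:25 AM
 */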
@Controller
public class IndexController {

    @GetMapping({"/", "/index"})
    public String index() {
        return "index";
    }
}

index.html

<!DOCTYPE html>
<html lang="en" xmlns:th="http://www.thymeleaf.org">

<head>
  <meta charset="utf-8"/>
  <title>Java-ES JD Clone Demo</title>
  
  <style type="text/css">
  
   @import url("css/style.css");

  </style>
</head>



<body class="pg">
<div class="page" id="app">
  <div id="mallPage" class="mallist tmall- page-not-market">

    <!-- Header search -->
    <div id="header" class="header-list-app">
      <div class="headerLayout">
        <div class="headerCon">
          <!-- Logo-->
          <h1 id="mallLogo">
            <img th:src="@{/images/jdlogo.png}" alt="">
          </h1>

          <div class="header-extra">

            <!-- Search -->
            <div id="mallSearch" class="mall-search">
              <form name="searchTop" class="mallSearch-form clearfix">
                <fieldset>
                  <legend>Tmall Search</legend>
                  <div class="mallSearch-input clearfix">
                    <div class="s-combobox" id="s-combobox-685">
                      <div class="s-combobox-input-wrap">

                        <input v-model='keyword' type="text" autocomplete="off" value=""
                               id="mq" class="s-combobox-input" aria-haspopup="true">

                      </div>
                    </div>
                    <button @click.prevent='searchKey()' type="submit" id="searchbtn">Search</button>
                  </div>
                </fieldset>
              </form>
              <ul class="relKeyTop">
                <li><a>Java</a></li>
                <li><a>Front end</a></li>
                <li><a>Linux</a></li>
                <li><a>Big data</a></li>
                <li><a>Personal finance</a></li>
              </ul>
            </div>
          </div>
        </div>
      </div>
    </div>

    <!-- Product details page -->
    <div id="content">
      <div class="main">
        <!-- Brand categories -->
        <form class="navAttrsForm">
          <div class="attrs j_NavAttrs" style="display:block">
            <div class="brandAttr j_nav_brand">
              <div class="j_Brand attr">
                <div class="attrKey">
                  Brand
                </div>
                <div class="attrValues">
                  <ul class="av-collapse row-2">
                    <li><a href="#"> </a></li>
                    <li><a href="#"> Java </a></li>
                  </ul>
                </div>
              </div>
            </div>
          </div>
        </form>

        <!-- Sort options -->
        <div class="filter clearfix">
          <a class="fSort fSort-cur">Overall<i class="f-ico-arrow-d"></i></a>
          <a class="fSort">Popularity<i class="f-ico-arrow-d"></i></a>
          <a class="fSort">New<i class="f-ico-arrow-d"></i></a>
          <a class="fSort">Sales<i class="f-ico-arrow-d"></i></a>
          <a class="fSort">Price<i class="f-ico-triangle-mt"></i><i class="f-ico-triangle-mb"></i></a>
        </div>

        <!-- Product details -->
        <div class="view grid-nosku">

          <div class="product" v-for='result in results'>
            <div class="product-iWrap">
              <!-- Product cover image -->
              <div class="productImg-wrap">
                <a class="productImg">
                  <img :src='result.img'>
                </a>
              </div>
              <!-- Price -->
              <p class="productPrice">
                <em><b>¥</b>{{result.price}}</em>
              </p>
              <!-- Title -->
              <p class="productTitle">
                <a v-html='result.title'></a>
              </p>
              <!-- Shop name -->
              <div class="productShop">
                <span>Shop: Java </span>
              </div>
              <!-- Sales info -->
              <p class="productStatus">
                <span>Monthly sales <em>999</em></span>
                <span>Reviews <a>3</a></span>
              </p>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>


<script src="js/jquery.min.js"></script>
<script src="js/axios.min.js"></script>
<script src="js/vue.min.js"></script>


<script>
  new Vue({
    el: '#app',
    data: {
      keyword: '',
      results: []
    },
    methods: {
      searchKey() {
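        // Encode the keyword so spaces and non-ASCII characters are safe in the URL path.
        var keyword = encodeURIComponent(this.keyword);

        // Plain search without highlighting:
        // axios.get('search/' + keyword + '/1/10').then(response => {
        //     console.log(response.data)
        //     this.results = response.data
        // })

        // Highlighted search. The host, port and /material prefix come from the author's
        // environment; adjust them to match your own deployment.
        axios.get('http://127.0.0.1:9000/material/gaoliang/search/' + keyword + '/1/10').then(response => {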
          console.log(response.data)
          this.results = response.data
        })
      }
    }
  })
</script>

</body>
</html>

Enter a keyword whose results have already been crawled, for example java or js.

Example: creating an index (it really is this simple):

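    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Creates an index.
     * CreateIndexRequest and CreateIndexResponse are the classes from
     * org.elasticsearch.client.indices.
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    public void testCreateIndex() throws IOException {
        // 1. Build the create-index request.
        CreateIndexRequest request = new CreateIndexRequest("my_user1");
        // 2. Execute the request and read the response.
        CreateIndexResponse createIndexResponse =
                restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        // Prints whether the cluster acknowledged the new index.
        System.out.println(createIndexResponse.isAcknowledged());
    }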

Summary

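  This integration follows the Kuangshen (狂神) tutorial on Bilibili. It shows one way of using Elasticsearch: give the client the Elasticsearch address and call it directly. The Elasticsearch 6.2.2 setup covered last time integrated the IK analyzer instead; there, an interface had to extend ElasticsearchRepository, and the corresponding entity needed an Id and the matching annotations.
  In short: the IK-based setup went through an interface extending ElasticsearchRepository, while without it you can create and use the client directly.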