This article shows how to start Elasticsearch (ES) from source and how to locate the class that implements a given request.

1. Prepare the environment

JDK: 17

ES: 7.17

IDEA: 2024.1

Gradle: 8.7

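  1. Install the JDK and IDEA.
  2. Download the ES source (I downloaded the 7.17.8 code from GitHub):
    https://github.com/elastic/elasticsearch or the mirror: https://gitee.com/mirrors/elasticsearch
  3. Download Gradle (this step is optional).

This just makes the Gradle wrapper use a local file instead of downloading the distribution, which is slow.

1. In the Elasticsearch source tree, open gradle/wrapper/gradle-wrapper.properties. It contains: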
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-7.5.1-all.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionSha256Sum=db9c8211ed63f61f60292c69e80d89196f9eb36665e369e7f00ac4cc841c2219
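2. Download https://services.gradle.org/distributions/gradle-7.5.1-all.zip (the backslash in the properties file is only an escape character).
3. Put gradle-7.5.1-all.zip in elasticsearch/gradle/wrapper.
4. Edit gradle-wrapper.properties so that distributionUrl points to the local file:
distributionUrl=gradle-7.5.1-all.zip
  4. Change the global Gradle repository
    Create a file named init.gradle under USER_HOME/.gradle/ (create it by hand if it does not exist), paste in the content below, and save it.
    It replaces Gradle's remote repositories with the Aliyun mirrors.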
allprojects{
    repositories {
        def ALIYUN_REPOSITORY_URL = 'https://maven.aliyun.com/repository/public/'
        def ALIYUN_GRADLE_PLUGIN_URL = 'https://maven.aliyun.com/repository/gradle-plugin/'
        all { ArtifactRepository repo ->
            if(repo instanceof MavenArtifactRepository){
                def url = repo.url.toString()
                if (url.startsWith('https://repo1.maven.org/maven2/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                    remove repo
                }
                if (url.startsWith('https://jcenter.bintray.com/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                    remove repo
                }
                if (url.startsWith('https://plugins.gradle.org/m2/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_GRADLE_PLUGIN_URL."
                    remove repo
                }
            }
        }
        maven { url ALIYUN_REPOSITORY_URL }
        maven { url ALIYUN_GRADLE_PLUGIN_URL }
    }
}

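2. Running in IDEA

1. Environment setup

  1. Import the source project into IDEA

File -> Open, then select the ES root directory to import it.

  2. In Project Structure, set the project SDK. I chose the JDK 17 bundled with IDEA.

  3. Set the environment Gradle builds with

Open Preferences and search for "gradle" to find the Gradle settings.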

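2. Build

  1. Compile the source

After the import, IDEA shows a "Load Gradle Project" popup in the lower-right corner. If you missed it, open the Gradle tool window and click Reload manually.

Once you have clicked it, wait a while; the build takes some time.

You do not need to mark the subprojects as Gradle projects yourself. I did that at first, but reloading all projects loads the subprojects automatically.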

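  2. Build the distribution

What to do: pick the no-jdk-*-tar project that matches your operating system and click its build task to build the Elasticsearch distribution.

When the build finishes, the corresponding xxx-tar directory contains a build directory with the generated files.

Why this is needed: distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT contains many modules. Elasticsearch is modular, so every time you change code under modules you have to build again, even if you only added a comment. Otherwise the line numbers will not match when you debug in IDEA.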

During the build, a resource failed to download.

The error message was:
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
> Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
   > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
     Required by:
         project :x-pack:plugin:ml
      > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
         > Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
            > Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
               > Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused

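Solution: see https://github.com/elastic/elasticsearch/issues/48350

Following that issue, edit elasticsearch-7.17.8/x-pack/plugin/ml/build.gradle.

After the change, the artifact is downloaded from:

https://prelert-artifacts.s3.amazonaws.com/maven/org/elasticsearch/ml/ml-cpp/7.17.8-SNAPSHOT/ml-cpp-7.17.8-SNAPSHOT.zip

PS: if the download still fails, you may need a proxy, or you can download the file yourself and change the build file so that it takes the localRepo branch.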

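3. Starting from source

0. About the source

The ES Java source is roughly 2.33 million lines, so understanding all of it is a big undertaking.

ES is modular. server is the main server-side program. The transport-netty4 module implements network communication on top of Netty; the familiar ports 9200 and 9300 are served by it.

The program entry point is: server/src/main/java/org/elasticsearch/bootstrap/Elasticsearch.java

Incoming client requests are handled in the package: server/src/main/java/org/elasticsearch/action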

1. Files to change

  1. Modify the main class:

In the server project, add the following at the beginning of org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):

        String esHome = "/Users/xxx/app/xm/es_source/elasticsearch-7.17.8/distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT"; // base path of the distribution you built
        System.setProperty("es.path.home", esHome); // Elasticsearch home directory
        System.setProperty("es.path.conf", esHome + "/config"); // Elasticsearch config directory
        System.setProperty("log4j2.disable.jmx", "true"); // disable log4j2 JMX monitoring to avoid errors
        System.setProperty("java.security.policy", esHome + "/config/java.policy"); // Java security policy file
  2. Add the following to distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config/elasticsearch.yml:
node.name: node-1 # name of this ES node
xpack.security.enabled: false # disable X-Pack security to make testing easier
ingest.geoip.downloader.enabled: false # turn off GeoIP database updates for now

If startup reports a disk watermark problem:

1. Problem:
[node-1] high disk watermark [90%] exceeded on [eo6zdEm8RWWOodoaSMXNXw][node-1][/Users/xxx/Desktop/es_file/es-7.17.8/0/data/nodes/0] free: 18.9gb[8.3%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete

2. Fix: add this line to the same elasticsearch.yml:
cluster.routing.allocation.disk.threshold_enabled: false
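
To see why the warning fires: the log reports 8.3% free, which means about 91.7% of the disk is used, and that is above the 90% high watermark. Here is a minimal sketch of that comparison. The class and method names are hypothetical; this is not Elasticsearch's own disk threshold code, only the arithmetic behind the message.

```java
// Hypothetical helper, not Elasticsearch code: it only illustrates the percentage comparison.
public class WatermarkCheck {

    /** Returns true when used disk space (100 - freePercent) is above the high watermark. */
    public static boolean exceedsHighWatermark(double freePercent, double highWatermarkUsedPercent) {
        double usedPercent = 100.0 - freePercent;
        return usedPercent > highWatermarkUsedPercent;
    }

    public static void main(String[] args) {
        // Values from the log line above: 8.3% free, high watermark 90%.
        System.out.println(exceedsHighWatermark(8.3, 90.0));  // prints "true"
        // A node with 25% free space stays below the watermark.
        System.out.println(exceedsHighWatermark(25.0, 90.0)); // prints "false"
    }
}
```

Disabling the threshold is fine for a local debug node, but it hides a real low-disk condition, so I would not carry that setting over to a real cluster.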
  3. Create a new file java.policy under distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config:
grant {
    permission java.security.AllPermission;
};
  4. In server/src/main/resources/org/elasticsearch/bootstrap/security.policy, delete the codeBase-related entries.

2. Start

  1. Start without specifying the data and logs directories

Run the main class in the server module: org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[])

The startup log appears in the console.

Access port 9200:

xxxx % curl localhost:9200/
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "V3cJUOHbQA2ZqeHZO67JdA",
  "version" : {
    "number" : "7.17.8",
    "build_flavor" : "unknown",
    "build_type" : "unknown",
    "build_hash" : "unknown",
    "build_date" : "unknown",
    "build_snapshot" : true,
    "lucene_version" : "8.11.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
  2. Start with explicit data and logs directories

org.elasticsearch.cli.EnvironmentAwareCommand#execute(org.elasticsearch.cli.Terminal, joptsimple.OptionSet) shows that there are two ways to pass these settings to ES:

The first is to set Java system properties such as es.path.data in code. Add this to org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):

        // set the data directory and the logs directory
        System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
        System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs"); // Elasticsearch logs directory

The second is to pass program arguments of the form -Epath.logs=xxx:

-Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/logs
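
The two ways of passing settings can be pictured with a small sketch. It is a simplified model written for this article, not the actual EnvironmentAwareCommand source: the class name, the method, and the map standing in for the JVM system properties are all hypothetical. It assumes, from my reading of the 7.17 code, that -E arguments are collected first, that es.path.* system properties only fill in missing values, and that a value given both ways is rejected as a duplicate. Check the real class before relying on those details.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of how the settings are collected; not the actual Elasticsearch source.
public class SettingsSketch {

    /** Collects "-Ekey=value" program arguments, then fills in missing path.* settings from "es.path.*" properties. */
    public static Map<String, String> resolve(String[] args, Map<String, String> systemProps) {
        Map<String, String> settings = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-E") && eq > 2) {
                settings.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        for (String name : new String[] {"path.data", "path.home", "path.logs"}) {
            String value = systemProps.get("es." + name);
            if (value == null) {
                continue; // nothing supplied through a system property
            }
            if (settings.containsKey(name)) {
                // Assumption: the same setting supplied both ways counts as an error.
                throw new IllegalArgumentException("duplicate setting [" + name + "]");
            }
            settings.put(name, value);
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("es.path.logs", "/tmp/es/logs");
        Map<String, String> settings = resolve(new String[] {"-Epath.data=/tmp/es/data"}, props);
        System.out.println(settings.get("path.data")); // prints "/tmp/es/data"
        System.out.println(settings.get("path.logs")); // prints "/tmp/es/logs"
    }
}
```

If the duplicate check works the way I assume, it is one more reason to use only one of the two mechanisms for a given setting.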

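4. Debugging: create and view an index, insert data

0. How it fits together

  1. Elasticsearch exposes a RESTful API; in the source this corresponds to the action package of the server project.
  2. Each API is forwarded to its TransportXXXAction implementation class, which runs the actual logic. Each TransportXXXAction has to be registered in ActionModule.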
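
The "register first, dispatch later" idea can be illustrated with a toy registry. This is only a sketch of the pattern: the class, its methods, and the use of a plain function as the handler are hypothetical and much simpler than the real ActionModule. The action name indices:admin/create is the name ES uses for index creation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of "register an action, then dispatch by name"; all class and method names are hypothetical.
public class ActionRegistrySketch {

    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    /** Registers a handler under an action name, loosely like pairing an action with its Transport class. */
    public void register(String actionName, Function<String, String> handler) {
        if (handlers.putIfAbsent(actionName, handler) != null) {
            throw new IllegalArgumentException("action already registered: " + actionName);
        }
    }

    /** Looks up the handler for an action name and runs it on the request. */
    public String execute(String actionName, String request) {
        Function<String, String> handler = handlers.get(actionName);
        if (handler == null) {
            throw new IllegalArgumentException("no handler for action: " + actionName);
        }
        return handler.apply(request);
    }

    public static void main(String[] args) {
        ActionRegistrySketch registry = new ActionRegistrySketch();
        registry.register("indices:admin/create", index -> "created " + index);
        System.out.println(registry.execute("indices:admin/create", "qlq_user")); // prints "created qlq_user"
    }
}
```

In practice this means that if you know the action name or the REST path, searching the source for its registration is usually the quickest way to find where to put a breakpoint.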

1. Create an index

The class is: TransportCreateIndexAction

Set a breakpoint in that class, then call:

curl -X PUT -H 'Content-Type:application/json' -d '{"mappings":{"properties":{"name":{"type":"keyword"},"age":{"type":"long"},"address":{"type":"text","analyzer":"standard"},"location":{"type":"geo_point"},"birth_date":{"type":"date"},"birth_date_value":{"type":"long"},"likes":{"type":"keyword"},"well_person":{"type":"boolean"},"salary":{"type":"integer_range"},"school":{"type":"wildcard"},"feature":{"type":"nested","properties":{"height":{"type":"double"},"weight":{"type":"double"}}}}}}' localhost:9200/qlq_user

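Execution stops at your breakpoint, which shows the setup works.

2. View indices

For a fixed URL, you can search the source for the URI path to find the handler.

The class is: org.elasticsearch.rest.action.cat.RestIndicesAction#doCatRequest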

xxx % curl localhost:9200/_cat/indices
yellow open qlq_user OwnK3cMUT2-L7Rog062oHA 1 1 0 0 226b 226b

3. Add a document

The method is: org.elasticsearch.action.bulk.TransportShardBulkAction#dispatchedShardOperationOnPrimary

Request:

curl -X POST -H 'Content-Type:application/json' -d '{"name":"张三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝阳区","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["读书","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}' localhost:9200/qlq_user/_doc/

4. Query documents

1. Count documents

Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest

Test:

xxx % curl localhost:9200/qlq_user/_count
{"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}

2. Query data

Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest

Test:

xxx % curl -X GET -H 'Content-Type:application/json' -d '{"query":{"term":{"likes":{"value":"读书"}}}}' localhost:9200/qlq_user/_search

{"took":4,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":0.3616575,"hits":[{"_index":"qlq_user","_type":"_doc","_id":"Q38ee48BpAUI2PvOZWk9","_score":0.3616575,"_source":{"name":"张三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝阳区","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["读书","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}}]}}

5. Delete an index

Entry point:

org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction#doExecute

Test:

xxx % curl -X DELETE localhost:9200/qlq_user
{"acknowledged":true}

5. Errors

  1. Gradle JVM argument error

Error message:

Unrecognized option: --add-exports
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

-----------------------
Check the JVM arguments defined for the gradle process in:
 - gradle.properties in project root directory

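Cause: I started with JDK 8, which is too old to recognize the --add-exports JVM option.

Fix: switch to a newer JDK; I used 17.

  2. Building the tar distribution fails
Error: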
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
> Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
   > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
     Required by:
         project :x-pack:plugin:ml
      > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
         > Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
            > Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
               > Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused
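Solution: the same as the ml-cpp download fix in the build step of section 2 (edit x-pack/plugin/ml/build.gradle; see https://github.com/elastic/elasticsearch/issues/48350).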

6. Starting a cluster from source

Start three nodes. With the shell script, the nodes were started like this:

sh elasticsearch -Ehttp.port=9200 -Epath.data=/Users/qiao-zhi/app/software/elk/data/0 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/0 -Enode.roles=data 
sh elasticsearch -Ehttp.port=9201 -Epath.data=/Users/qiao-zhi/app/software/elk/data/1 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/1 -Enode.roles=master 
sh elasticsearch -Ehttp.port=9202 -Epath.data=/Users/qiao-zhi/app/software/elk/data/2 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/2
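  1. Remove the data and logs directory settings from the code
// set the data directory and the logs directory
//        System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
//        System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs"); // Elasticsearch logs directory
  2. Set the startup arguments, one line per node, and allow the run configuration to run in parallel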
-Ehttp.port=9201 -Enode.name=node1 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/1/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/1/log -Enode.roles=master

-Ehttp.port=9200 -Enode.name=node2 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/log -Enode.roles=data

-Ehttp.port=9202 -Enode.name=node3 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/2/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/2/log
  3. After startup, check the cluster nodes
GET /_cat/nodes?v
---
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
127.0.0.1            3          99  35    3.53                  d           -      node2
127.0.0.1            3          99  35    3.53                  cdfhilmrstw -      node3
127.0.0.1            4          99  35    3.53                  m           *      node1
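
The node.role column packs each node's roles into single letters. The sketch below decodes them; the class name is hypothetical, and the letter-to-role table is my reading of the cat nodes API documentation for 7.x, so verify it against the docs for your version.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading the node.role column of GET /_cat/nodes.
public class NodeRoleDecoder {

    // Letter-to-role table as I understand it from the 7.x cat nodes documentation (an assumption to verify).
    private static final Map<Character, String> ROLES = new LinkedHashMap<>();
    static {
        ROLES.put('c', "data_cold");
        ROLES.put('d', "data");
        ROLES.put('f', "data_frozen");
        ROLES.put('h', "data_hot");
        ROLES.put('i', "ingest");
        ROLES.put('l', "ml");
        ROLES.put('m', "master");
        ROLES.put('r', "remote_cluster_client");
        ROLES.put('s', "data_content");
        ROLES.put('t', "transform");
        ROLES.put('v', "voting_only");
        ROLES.put('w', "data_warm");
    }

    /** Expands a role string such as "dm" into role names; unknown letters are kept visible. */
    public static List<String> decode(String letters) {
        List<String> names = new ArrayList<>();
        for (char letter : letters.toCharArray()) {
            names.add(ROLES.getOrDefault(letter, "unknown(" + letter + ")"));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("node2: " + decode("d"));
        System.out.println("node1: " + decode("m"));
        System.out.println("node3: " + decode("cdfhilmrstw"));
    }
}
```

Read this way, the table matches the arguments: node2 was started with -Enode.roles=data, node1 with -Enode.roles=master, and node3 without the option, so it keeps the default roles.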

References

https://www.iocoder.cn/Elasticsearch/build-debugging-environment/
