Configuring the ES yum repository: installing and configuring ES

Reposted

mob6454cc636c54 2024-03-28 18:10:36


Elasticsearch (ES) is a key member of the Elastic Stack because its search capabilities are very powerful; it is, in essence, a distributed search engine.

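Installation

The installation in this article is done on an Ubuntu 20.04 LTS virtual machine, using the .tar.gz archive.

After extracting the downloaded archive with tar -xzf, locate the configuration file elasticsearch.yml (in the config directory).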

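Enable the node name: node.name: node-1 (remove the leading #)
network.host: 0.0.0.0 (this can also be set to the machine's own IP address; the default is localhost, which other machines cannot reach)
cluster.initial_master_nodes: ["node-1"] (the name here must match the node name above)
Append the following two settings at the end of the file so that ES can be accessed from a browser (untested):
http.cors.enabled: true
http.cors.allow-origin: "*"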

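Because this is a virtual machine, startup may fail with an error saying that virtual memory is too low. The limit in question is the kernel setting vm.max_map_count, not the RAM assigned to the VM. To fix it:
Run vim /etc/sysctl.conf and add the line vm.max_map_count=655360 (the value must be at least 262144).
Then save with :wq and run sysctl -p to apply the change.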
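
To confirm that the setting took effect, the value can be compared against the 262144 minimum mentioned above. The following is only a minimal sketch: the function names are made up for illustration, and it assumes a Linux host where the value is exposed at /proc/sys/vm/max_map_count.

```python
from pathlib import Path

# Minimum vm.max_map_count stated in the text above.
REQUIRED_MAX_MAP_COUNT = 262144


def is_map_count_sufficient(raw_value: str, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True if the text read from the kernel setting meets the minimum."""
    return int(raw_value.strip()) >= required


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> str:
    """Read the current value (Linux only; the path is the usual procfs location)."""
    return Path(path).read_text()
```

On the VM, is_map_count_sufficient(read_max_map_count()) should return True after sysctl -p has been run.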

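ES is written in Java, so it needs a JDK. Recent versions ship with a bundled JDK. The ES version I used requires JDK 11, so if JDK 8 is what you have installed, you may need to upgrade. The alternative is to use the bundled JDK.

Normally, when ES finds no JDK configured in the environment variables, it falls back to the bundled one. To remove an environment variable, use: unset <variable name> (for example, unset JAVA_HOME)

Finally, start ES: go to the bin directory and run the elasticsearch script. To run it in the background, use: nohup ./elasticsearch &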

Usage

In practice, we use templates to create indices and data streams.
ES has two kinds of templates:

index template
component template

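In use, component templates are referenced from an index template and combined there. When an index or data stream is created and its name matches a pattern in an index template, the settings in that template are applied to it.

Another important point is how long ES retains data.
We can create a policy and reference it from the index template as well, to configure rollover for the index.

With rollover, one logical index turns into a series of indices, but an application (for example, one that calls the REST API) has no practical way to keep changing the index name it targets.

This is where an alias comes in.
One alias can point to several indices. For rollover, it is enough to configure the alias name in the index template; all later operations then go through the alias, which solves the problem above.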

Creating the components the index template needs

1. component template

You can create many different component templates and combine them into an index template as needed.
The REST API for creating a component template is:

PUT http://192.168.37.128:9200/_component_template/<your component template name>
// the JSON below is the request body
{
   "template":{
       "mappings":{
           "properties":{
               "address":{
                   "type":"keyword"
               }
           }
       }
   }
}

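192.168.37.128 is the IP address of my local virtual machine, and 9200 is the default HTTP port of ES. (9300 is the default port used for communication between nodes inside the cluster.)

On success, the response is: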

{
    "acknowledged": true
}
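
The same call can be made from a script. Below is a minimal sketch using only the Python standard library; it builds the request without sending it. The helper name and the template name component_template1 are chosen for illustration, and the address is the VM address used in this article.

```python
import json
import urllib.request

ES_URL = "http://192.168.37.128:9200"  # the article's VM address; replace with your own


def component_template_request(name: str, properties: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a component template."""
    body = {"template": {"mappings": {"properties": properties}}}
    return urllib.request.Request(
        url=f"{ES_URL}/_component_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Same mapping as the request body above: one keyword field named "address".
request = component_template_request(
    "component_template1", {"address": {"type": "keyword"}}
)
# To actually send it: urllib.request.urlopen(request).read()
```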

Later, once this component is combined into an index template, the mappings it defines are applied to every index that matches that index template.

API for listing all component templates:

GET http://192.168.37.128:9200/_component_template/

2. policy

A policy uses ES index lifecycle management (ILM) to manage data over time. When the data volume is small, there is no need to configure every phase.
ILM defines five lifecycle phases:

Hot: The index is actively being updated and queried.
Warm: The index is no longer being updated but is still being queried.
Cold: The index is no longer being updated and is queried infrequently. The information still needs to be searchable, but it’s okay if those queries are slower.
Frozen: The index is no longer being updated and is queried rarely. The information still needs to be searchable, but it’s okay if those queries are extremely slow.
Delete: The index is no longer needed and can safely be removed.

In typical use, defining only the Hot and Delete phases is enough.

The REST API for creating a policy is:

PUT http://192.168.37.128:9200/_ilm/policy/<your policy name>
// request body
{
    "policy":{
        "phases":{
            "hot":{
                "actions":{
                    "rollover":{
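                    // Rollover is triggered once the index holds 3 documents: a new
                    // index is created and the old one moves on to the next configured
                    // phase, delete, which removes the whole old index. Even if the old
                    // index holds 10 documents by then, all of them are deleted with it.
                        "max_docs":3,
                        "max_age":"7d" // If max_docs is not reached, rollover is also triggered once the index is 7 days old, and the old index then moves on to the next phase.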
                    }
                }
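            },
            "delete":{
                "min_age":"0ms", // Default value: once the index has rolled over, the delete action below runs immediately and removes the entire rolled-over index (not just the documents that hit max_docs or the 7-day age)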
                "actions":{
                    "delete":{}
                }
            }
        }
    }
}

Result:

{
    "acknowledged": true
}
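
The commented request body above is not strict JSON, so when sending it from a script it may be easier to build the policy as a dictionary and serialize it. This is only a sketch; the function name and parameter names are assumptions made for illustration, and the structure mirrors the hot and delete phases shown above.

```python
import json


def build_rollover_delete_policy(max_docs: int, max_age: str, delete_min_age: str = "0ms") -> dict:
    """Return an ILM policy body with a hot phase (rollover) and a delete phase."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over on whichever condition is met first.
                        "rollover": {"max_docs": max_docs, "max_age": max_age}
                    }
                },
                "delete": {
                    # Measured from rollover; 0ms deletes the old index right away.
                    "min_age": delete_min_age,
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Test values from this article; far too small for production.
test_policy = build_rollover_delete_policy(max_docs=3, max_age="7d")
request_body = json.dumps(test_policy)
```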

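Rollover conditions are not evaluated continuously: ILM checks them on a schedule, every 10 minutes by default. So when testing, if that interval has not elapsed yet, you may see that a condition is met but the index has not moved to the next phase as expected.

The configuration above is for testing; in a real production environment you would certainly not use values this small. For testing, though, they make it easy to check whether the configuration works.

If needed while testing, the API for changing the rollover check interval is:

PUT http://192.168.37.128:9200/_cluster/settings
// request body
{
    "transient": {
        "indices.lifecycle.poll_interval": "20s" // check every 20s
    }
}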

We can also use an API to see the result of the rollover:

GET http://192.168.37.128:9200/_cat/shards/times-*  // times-* is the pattern configured in the index template

Sample output:

times-000005 0 p STARTED    1 3.8kb 192.168.37.128 node-1
times-000005 0 r UNASSIGNED
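
When checking rollover results from a script, the plain-text output can be split into fields. A rough sketch follows; it assumes the column order seen in the sample output above (index, shard, primary or replica, state, docs, store, ip, node), which may differ between ES versions, and the function name is made up for illustration.

```python
from typing import Dict, List

# Assumed column order, taken from the sample output above.
CAT_SHARDS_COLUMNS = ["index", "shard", "prirep", "state", "docs", "store", "ip", "node"]


def parse_cat_shards(text: str) -> List[Dict[str, str]]:
    """Turn whitespace-separated _cat/shards lines into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Unassigned shards print fewer columns; zip stops at the shorter list.
        rows.append(dict(zip(CAT_SHARDS_COLUMNS, fields)))
    return rows


sample = """\
times-000005 0 p STARTED    1 3.8kb 192.168.37.128 node-1
times-000005 0 r UNASSIGNED
"""
shards = parse_cat_shards(sample)
unassigned = [row for row in shards if row["state"] == "UNASSIGNED"]
```

The replica shows as UNASSIGNED here because the cluster has a single node, and a replica is never allocated on the same node as its primary.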

At this point, the component template and the policy we need have both been created. Next we can create the index template.

index template

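An index template can include data_stream: {} to determine whether it applies when creating indices or when creating data streams.
The API for creating an index template is:

PUT http://192.168.37.128:9200/_index_template/<your index template name>
// request body
{
// Any index created with a name starting with times- will use this template
  "index_patterns":["times-*"], 
  //"data_stream":{}, // If present, the template applies to data streams; otherwise it applies to indices
  "template":{
      "settings":{
          "number_of_shards":1,
          "number_of_replicas":1,
          "index.lifecycle.name": "times-policy", // Name of the policy created above
          "index.lifecycle.rollover_alias": "times" // Alias; use this name in place of the index name from now on
      }
  },
  "priority":50, // ES ships built-in templates with priority 100. Where patterns overlap, the template with the higher priority wins: above 100 this one is used, below 100 the built-in one is.
  // The component templates created earlier
  "composed_of":["component_template1","component_template2"],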
  "version":1
}
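
As with the policy, the commented body above is not strict JSON. A minimal sketch of building the same body in Python is shown here; the function and parameter names are assumptions for illustration, and the fixed shard and replica counts simply copy the values above.

```python
import json
from typing import Sequence


def build_index_template(
    patterns: Sequence[str],
    policy_name: str,
    rollover_alias: str,
    components: Sequence[str],
    priority: int = 50,
    data_stream: bool = False,
) -> dict:
    """Return an index template body equivalent to the one shown above."""
    body = {
        "index_patterns": list(patterns),
        "template": {
            "settings": {
                "number_of_shards": 1,
                "number_of_replicas": 1,
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": rollover_alias,
            }
        },
        "priority": priority,
        "composed_of": list(components),
        "version": 1,
    }
    if data_stream:
        # An empty object is enough to make the template apply to data streams.
        body["data_stream"] = {}
    return body


times_template = build_index_template(
    ["times-*"], "times-policy", "times",
    ["component_template1", "component_template2"],
)
request_body = json.dumps(times_template)
```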

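Then, when creating the first index, mark it as the write index of the alias. This setting cannot be placed in the index template itself; it has no effect there.

Create the index:

// Standard naming: the alias followed by a six-digit number (five 0s and a 1)
PUT http://192.168.37.128:9200/times-000001
// request body
{
    "aliases":{
        "times":{
            "is_write_index":true
        }
    }
}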
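
The numeric suffix matters because rollover derives the next index name from it: the trailing number is incremented and zero-padded to six digits. A small sketch of that naming rule (the function name is made up for illustration):

```python
import re


def next_rollover_name(index_name: str) -> str:
    """Return the name rollover would give the next index in the series."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end with '-' and a number")
    prefix, number = match.groups()
    # The new number is always padded to six digits.
    return f"{prefix}{int(number) + 1:06d}"
```

For example, next_rollover_name("times-000001") gives "times-000002".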

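After this, all subsequent operations on the index (adding, deleting, updating, and querying data) go through the alias.

Additional note:
When creating a data stream, is_write_index is specified directly in the index template: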

{
  "index_patterns":["tests-*"],
  "data_stream":{},
  "template":{
      "settings":{
          "number_of_shards":1,
          "number_of_replicas":1,
          "index.lifecycle.name": "times-policy", 
          "index.lifecycle.rollover_alias": "tests" 
      },
      "aliases":{
          "tests":{
              "is_write_index":true
          }
      }
  },
  "priority":50,
  "composed_of":["component_template1","component_template2"],
  "version":1
}

Once the data stream is created, the configuration above is applied to it.
Create the data stream:

PUT http://192.168.37.128:9200/_data_stream/tests-000001

Add data to the data stream:

POST http://192.168.37.128:9200/tests/_bulk?refresh
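
The article does not show the body of the bulk request. A minimal sketch of building one is given below, under these assumptions: the helper name and the sample documents are hypothetical, each document carries an @timestamp field (data streams require it), and every action is create (the only bulk action a data stream accepts for new documents). The body is newline-delimited JSON and is sent with Content-Type: application/x-ndjson.

```python
import json
from typing import Iterable, Mapping


def build_bulk_body(documents: Iterable[Mapping]) -> str:
    """Serialize documents as NDJSON: an action line followed by a source line each."""
    lines = []
    for document in documents:
        if "@timestamp" not in document:
            raise ValueError("data stream documents need an @timestamp field")
        lines.append(json.dumps({"create": {}}))
        lines.append(json.dumps(document))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents; "address" is the keyword field from the component template.
sample_documents = [
    {"@timestamp": "2021-11-11T08:00:00Z", "address": "address-1"},
    {"@timestamp": "2021-11-11T08:05:00Z", "address": "address-2"},
]
bulk_body = build_bulk_body(sample_documents)
```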

Once the data has been added, the index named in the response looks like:
.ds-tests-000001-2021.11.11-000001; the trailing number increases by 1 with each rollover.
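
The backing index name in this example appears to follow the pattern .ds-<data stream name>-<yyyy.MM.dd>-<generation>, with the generation padded to six digits. A small sketch reproducing it, assuming that pattern (the function name is made up for illustration, and other ES versions may name backing indices differently):

```python
from datetime import date


def backing_index_name(data_stream: str, created: date, generation: int) -> str:
    """Compose a backing index name in the form seen in the example above."""
    return f".ds-{data_stream}-{created:%Y.%m.%d}-{generation:06d}"


first_backing_index = backing_index_name("tests-000001", date(2021, 11, 11), 1)
```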

Query the data in the data stream:

GET http://192.168.37.128:9200/tests/_search

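Postscript:
Index names must follow these rules:
Lowercase only; must not contain \, /, *, ?, ", <, >, |, a space, a comma, or #
Must not start with -, _, or +
Must not be . or ..
Must not be longer than 255 bytes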
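
These rules are easy to check before sending a create request. The sketch below is a hypothetical helper, not part of any ES client; the forbidden character set (including the backslash) and the . and .. names follow the list in the Elasticsearch documentation.

```python
from typing import List

# Characters that an index name must not contain.
FORBIDDEN_CHARACTERS = set('\\/*?"<>| ,#')


def index_name_problems(name: str) -> List[str]:
    """Return the list of rules that the given index name violates."""
    problems = []
    if name != name.lower():
        problems.append("must be lowercase")
    if any(character in FORBIDDEN_CHARACTERS for character in name):
        problems.append("contains a forbidden character")
    if name.startswith(("-", "_", "+")):
        problems.append("must not start with -, _ or +")
    if name in (".", ".."):
        problems.append("must not be . or ..")
    # The limit is in bytes, so multi-byte characters count more than once.
    if len(name.encode("utf-8")) > 255:
        problems.append("must not be longer than 255 bytes")
    return problems
```

An empty list means the name passed every check; for example, index_name_problems("times-000001") returns [].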
