ELK Notes 11: Using Snapshots

1 Snapshot Overview

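A snapshot is a backup of a running Elasticsearch cluster. It can cover every index in the cluster or only the indices you specify.
Snapshots are stored in a repository, so a repository must be registered before any snapshot is taken.
Snapshots can be stored in a local repository or in a remote one such as Amazon S3, HDFS, Microsoft Azure, or Google Cloud Storage.
Snapshots are incremental: a new snapshot of an index copies only the data that earlier snapshots in the same repository do not already contain. You can therefore take snapshots repeatedly without worrying about excessive storage use.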
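As a rough illustration of the incremental behaviour described above, the Python sketch below models a repository as a set of file names and copies only the files it has not seen before. This is a toy model, not Elasticsearch's implementation: the function name `files_to_upload` and the segment file names are invented for the example, and matching files by name alone is a simplifying assumption.

```python
def files_to_upload(repository_files, shard_files):
    """Return the files a new snapshot still has to copy.

    Toy model: a file counts as "already backed up" if a file with the
    same name is in the repository (an assumption made for illustration).
    """
    return sorted(set(shard_files) - set(repository_files))


repository = set()

# First snapshot: the repository is empty, so every file is copied.
first = files_to_upload(repository, {"_0.cfs", "_1.cfs"})
repository.update(first)

# Second snapshot: one new segment appeared; only that file is copied.
second = files_to_upload(repository, {"_0.cfs", "_1.cfs", "_2.cfs"})
repository.update(second)

print(first)   # ['_0.cfs', '_1.cfs']
print(second)  # ['_2.cfs']
```

The second call copies a single file even though the index now holds three, which is why repeated snapshots stay cheap in storage.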

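Snapshot workflow:
client request -> coordinating node -> master node -> [data node 1, data node 2 ... data node n]. Three types of node are involved:
Coordinating node: receives the client request and forwards it to the master node.
Master node: puts the snapshot request information into the cluster state and publishes it; data nodes start copying data once they receive it. The master node is also responsible for writing the cluster state data to the repository.
Data nodes: copy Lucene files to the repository and, once copying is complete, clean up files in the repository that are not referenced by any snapshot. Because the data is distributed across the nodes, the copy has to be performed by the data nodes themselves: each data node copies to the repository the primary shards it stores locally that are covered by the snapshot request.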
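The data-node responsibilities above can be sketched in a few lines of Python. This is only a model under stated assumptions: `ShardCopy`, `shards_for_node`, `unreferenced_files`, and the node names are hypothetical and do not correspond to Elasticsearch classes or APIs.

```python
from typing import NamedTuple


class ShardCopy(NamedTuple):
    index: str
    shard: int
    node: str
    primary: bool


def shards_for_node(routing, node, indices):
    """Primary shards held by `node` that belong to the snapshotted indices."""
    return [
        s for s in routing
        if s.node == node and s.primary and s.index in indices
    ]


def unreferenced_files(repository_files, snapshots):
    """Files in the repository that no snapshot refers to any more."""
    referenced = set().union(*snapshots.values())
    return sorted(set(repository_files) - referenced)


routing = [
    ShardCopy("snap001", 0, "data-1", True),
    ShardCopy("snap001", 0, "data-2", False),  # replica: never copied
    ShardCopy("snap001", 1, "data-2", True),
    ShardCopy("other", 0, "data-1", True),     # index not in the request
]

# data-1 copies only its local primary shard of the requested index.
print(shards_for_node(routing, "data-1", {"snap001"}))

# File "c" is referenced by no snapshot, so it is a cleanup candidate.
print(unreferenced_files({"a", "b", "c"}, {"snap_1": {"a"}, "snap_2": {"a", "b"}}))
```

Each node filters the same routing information down to its own primaries, so no shard is copied twice and no node needs another node's data.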

2 Using Snapshots

2.1 NFS as the repository

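  1. Configure and mount the NFS service
  • Start NFS on h01
    Export the NFS directory /mnt/nfs01 in /etc/exports, then restart the NFS server
    /mnt/nfs01 10.120.75.0/24(rw,sync,no_subtree_check)
    /etc/init.d/nfs-kernel-server restart
  • Mount the NFS share on h01, h02, and h03
    Add the NFS mount entry to /etc/fstab, then mount it
    10.120.75.102:/mnt/nfs01 /nfs01 nfs rw,async,vers=3,rsize=524288,wsize=524288,acdirmin=5,acdirmax=8,hard,proto=tcp 0 0
    mount -a
    For NFS installation and usage, see the author's article "shell编程笔记2–nfs挂载"
  2. Configure the repository path
    Add the path.repo setting to elasticsearch.yml so that every master and data node knows about the directory, then restart all nodes
    path.repo: ["/nfs01/my_backup_es6", "/nfs01/my_backup_es7"]
  3. Create the repository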

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/nfs01/my_backup_es6"
  }
}
# Response:
{
  "acknowledged" : true
}

GET _snapshot/my_backup
# Response:
{
  "my_backup" : {
    "type" : "fs",
    "settings" : {
      "location" : "/nfs01/my_backup_es6"
    }
  }
}
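For scripted setups, the same registration request can be built outside the Kibana console. The sketch below uses only the Python standard library; the base URL `http://localhost:9200` and the helper name `build_register_repo_request` are assumptions for illustration, and the call that actually sends the request is left commented out because it needs a reachable cluster.

```python
import json
import urllib.request


def build_register_repo_request(base_url, repo, location):
    """Build the PUT _snapshot/<repo> request for a shared-file-system repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address; replace it with one of your own nodes.
req = build_register_repo_request(
    "http://localhost:9200", "my_backup", "/nfs01/my_backup_es6"
)
print(req.get_method(), req.full_url)

# Sending requires a live cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```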

Note: path.repo must be set first; otherwise the request fails with the following error:
"reason": "[my_backup] location [/nfs01/my_backup_es6] doesn't match any of the locations specified by path.repo because this setting is empty"
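To catch this mistake before calling the API, a script can check whether the intended location lies under one of the configured path.repo entries. The sketch below only approximates that rule with pure path comparison; `allowed_by_path_repo` is a hypothetical helper, not part of Elasticsearch, and it ignores symlinks and relative locations.

```python
from pathlib import PurePosixPath


def allowed_by_path_repo(location, path_repo):
    """True if `location` equals or sits below one of the path.repo entries."""
    loc = PurePosixPath(location)
    for root in path_repo:
        root_path = PurePosixPath(root)
        if loc == root_path or root_path in loc.parents:
            return True
    return False


path_repo = ["/nfs01/my_backup_es6", "/nfs01/my_backup_es7"]

print(allowed_by_path_repo("/nfs01/my_backup_es6", path_repo))  # True
print(allowed_by_path_repo("/nfs01/my_backup_es6", []))         # False
print(allowed_by_path_repo("/tmp/backups", path_repo))          # False
```

The second call reproduces the situation in the error message: with an empty path.repo, no location can match.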

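  4. Create a snapshot
    Create an index and write three documents to it: _doc/1, _doc/2, and _doc/3, with value set to 1, 2, and 3 respectively (only the first write is shown)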

PUT /%3Csnap001-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "snap001_write": {}
  }
}

PUT snap001-2020.08.09-000001/_doc/1
{
  "name": "index1",
  "value": 1
}
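The index name in the first request is the date-math expression `<snap001-{now/d}-000001>` with its special characters percent-encoded. The Python sketch below shows how the encoded form is produced and what concrete name it resolved to on the day used in this walkthrough. The helper names are made up for the example, and `resolve_daily_name` merely imitates the year.month.day pattern visible in the resolved index name rather than calling Elasticsearch.

```python
from datetime import date
from urllib.parse import quote


def encode_index_name(date_math_name):
    """Percent-encode a date-math index name for use in a request path."""
    # safe="" forces "/" to be encoded as well, which the path requires here.
    return quote(date_math_name, safe="")


def resolve_daily_name(prefix, day, suffix):
    """Imitate how {now/d} appears once resolved: prefix-YYYY.MM.DD-suffix."""
    return f"{prefix}-{day:%Y.%m.%d}-{suffix}"


print(encode_index_name("<snap001-{now/d}-000001>"))
# %3Csnap001-%7Bnow%2Fd%7D-000001%3E

print(resolve_daily_name("snap001", date(2020, 8, 9), "000001"))
# snap001-2020.08.09-000001
```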

Create the snapshot:

PUT _snapshot/my_backup/snap001_1?wait_for_completion=true
{
  "indices": "snap001-*",
  "ignore_unavailable": true,
  "include_global_state": true
}
Setting "indices": "*" would snapshot every index instead. The output is as follows:
{
  "snapshot" : {
    "snapshot" : "snap001_1",
    "uuid" : "uFBXg1p8QYiruL5QYOue_g",
    "version_id" : 6080899,
    "version" : "6.8.8",
    "indices" : [
      "snap001-2020.08.09-000001"
    ],
    "include_global_state" : true,
    "state" : "SUCCESS",
    "start_time" : "2020-08-09T06:17:35.039Z",
    "start_time_in_millis" : 1596953855039,
    "end_time" : "2020-08-09T06:17:36.112Z",
    "end_time_in_millis" : 1596953856112,
    "duration_in_millis" : 1073,
    "failures" : [ ],
    "shards" : {
      "total" : 2,
      "failed" : 0,
      "successful" : 2
    }
  }
}
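When snapshots are driven from a script, it helps to check the response rather than read it by eye. The following Python sketch parses a trimmed copy of the response above and verifies the fields that indicate success; `snapshot_succeeded` is a hypothetical helper, and which fields to treat as decisive is a judgment call for this example.

```python
import json

# Trimmed copy of the response shown above.
RESPONSE = """
{"snapshot": {"snapshot": "snap001_1", "state": "SUCCESS",
  "duration_in_millis": 1073, "failures": [],
  "shards": {"total": 2, "failed": 0, "successful": 2}}}
"""


def snapshot_succeeded(response):
    """True if the snapshot finished and every shard was copied."""
    snap = response["snapshot"]
    shards = snap["shards"]
    return (
        snap["state"] == "SUCCESS"
        and not snap["failures"]
        and shards["failed"] == 0
        and shards["successful"] == shards["total"]
    )


parsed = json.loads(RESPONSE)
print(snapshot_succeeded(parsed))                        # True
print(parsed["snapshot"]["duration_in_millis"] / 1000)  # 1.073
```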
After the snapshot succeeds, the shared directory contains the following:
$ ls

  5. Restore from the snapshot
    Here we simply delete the index first and then restore it from the snapshot

DELETE snap001-2020.08.09-000001

POST _snapshot/my_backup/snap001_1/_restore
{
  "indices": "snap001-*"
}
# Response:
{
  "accepted" : true
}

2.2 HDFS as the repository

  1. Install the HDFS plugin
    Download the plugin package that matches your Elasticsearch version and install it with bin/elasticsearch-plugin install plugin-name
    wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-6.8.8.zip

soft/elk6.8.8/elasticsearch-6.8.8/bin/elasticsearch-plugin install file:///home/yourpath/repository-hdfs-6.8.8.zip 

for descriptions of what these permissions allow and the associated risks.
Continue with installation? [y/N]y
->

After restarting, run GET _cat/plugins to confirm that the plugin is installed

  2. Register the repository and create the HDFS directory

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://10.120.75.102:9000/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}
# Response:
{
  "acknowledged" : true
}

Create the HDFS directory and set its permissions to 777:

hadoop fs -mkdir -p /elasticsearch/repositories/my_hdfs_repository

hadoop fs -chmod 777 /elasticsearch/repositories/my_hdfs_repository

Afterwards, GET _snapshot/_all lists the currently registered repositories.

  3. Take a snapshot
    Roll over to a second index and write data to it
    POST snap001_write/_rollover
    Document IDs and values are identical: 5, 6, and 7 (only the first write is shown)

PUT snap001_write/_doc/5
{
  "name": "index2",
  "value": 5
}

Take the snapshot:

PUT _snapshot/my_hdfs_repository/snap001_1?wait_for_completion=true
{
  "indices": "snap001-*",
  "ignore_unavailable": true,
  "include_global_state": true
}
# Response:
{
  "snapshot" : {
    "snapshot" : "snap001_1",
    "uuid" : "kVtq1ZLZSq-BH3erBcmb2w",
    "version_id" : 6080899,
    "version" : "6.8.8",
    "indices" : [
      "snap001-2020.08.09-000001",
      "snap001-2020.08.09-000002"
    ],
    "include_global_state" : true,
    "state" : "SUCCESS",
    "start_time" : "2020-08-10T07:29:54.217Z",
    "start_time_in_millis" : 1597044594217,
    "end_time" : "2020-08-10T07:29:57.829Z",
    "end_time_in_millis" : 1597044597829,
    "duration_in_millis" : 3612,
    "failures" : [ ],
    "shards" : {
      "total" : 4,
      "failed" : 0,
      "successful" : 4
    }
  }
}

After the backup succeeds, GET _snapshot/my_hdfs_repository/snap001_1/_status shows the details of the snapshot.

  4. Restore an index
    First delete the index:
    DELETE snap001-2020.08.09-000002
    Then restore it:

POST _snapshot/my_hdfs_repository/snap001_1/_restore
{
  "indices": "snap001-2020.08.09-000002"
}
# Response:
{
  "accepted" : true
}
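The restore above picks a single index out of a snapshot that contains two. A script can build the same request for any subset of indices, since the "indices" field accepts a comma-separated list. The sketch below is a minimal example using the Python standard library; the base URL and the helper name `build_restore_request` are assumptions, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_restore_request(base_url, repo, snapshot, indices):
    """Build the POST _restore request for a chosen list of indices."""
    body = {"indices": ",".join(indices)}
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo}/{snapshot}/_restore",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed address; replace it with one of your own nodes.
restore = build_restore_request(
    "http://localhost:9200",
    "my_hdfs_repository",
    "snap001_1",
    ["snap001-2020.08.09-000002"],
)
print(restore.get_method(), restore.full_url)
print(restore.data.decode("utf-8"))

# Sending requires a live cluster, and the target index must be deleted
# or closed beforehand:
# with urllib.request.urlopen(restore) as resp:
#     print(json.load(resp))
```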

3 Tips

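  1. Error when creating the HDFS repository
    Creating the HDFS repository fails with the following error:
    "reason": "The short-circuit local reads feature is enabled but dfs.domain.socket.path is not set."
    Cause: the dfs.client.read.shortcircuit parameter is not configured in HDFS
    Fix:
    1) Add the following parameter to the HDFS XML configuration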

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>

4 Notes

Elasticsearch official documentation: snapshot-restore

"Elasticsearch源码解析与优化实战" (book): Snapshot module analysis