Ingest pipelines

Ingest pipelines let you transform data before it is indexed, for example by filtering out fields or converting field values.

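Prerequisites:

At least one node in the cluster must have the ingest role. As noted in the Elasticsearch node roles chapter, if the ingest workload is heavy, dedicated ingest nodes are recommended.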

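If Elasticsearch security features are enabled, you need the manage_pipeline cluster privilege to manage ingest pipelines.

Creating a pipeline

For the supported processors, see the Processor reference.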

PUT _ingest/pipeline/my-pipeline
{
  "description": "My optional pipeline description",
  "processors": [
    {
      "set": {
        "description": "My optional processor description",
        "field": "my-long-field",
        "value": 10
      }
    },
    {
      "set": {
        "description": "Set 'my-boolean-field' to true",
        "field": "my-boolean-field",
        "value": true
      }
    },
    {
      "lowercase": {
        "field": "my-keyword-field"
      }
    }
  ]
}
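To make the effect of the processors concrete, here is a minimal local sketch in Python of how the pipeline above changes a document. It is an illustration only, not Elasticsearch code: the helper names (apply_set, apply_lowercase, run_pipeline) are hypothetical, and the emulation ignores features of the real processors such as template snippets and options like ignore_missing.

```python
import copy

def apply_set(doc, cfg):
    # Emulates `set`: assign the value, overwriting any existing field.
    doc[cfg["field"]] = cfg["value"]

def apply_lowercase(doc, cfg):
    # Emulates `lowercase`: the field must exist (no ignore_missing here).
    field = cfg["field"]
    if field not in doc:
        raise KeyError(f"field [{field}] not present in document")
    doc[field] = doc[field].lower()

# Hypothetical registry mapping processor names to local emulations.
PROCESSORS = {"set": apply_set, "lowercase": apply_lowercase}

def run_pipeline(pipeline, source):
    """Apply each processor in order to a copy of the source document."""
    doc = copy.deepcopy(source)
    for processor in pipeline["processors"]:
        # Each entry is a single-key object: {processor_name: config}.
        (name, cfg), = processor.items()
        PROCESSORS[name](doc, cfg)
    return doc

# Same processor list as my-pipeline above.
MY_PIPELINE = {
    "description": "My optional pipeline description",
    "processors": [
        {"set": {"field": "my-long-field", "value": 10}},
        {"set": {"field": "my-boolean-field", "value": True}},
        {"lowercase": {"field": "my-keyword-field"}},
    ],
}

if __name__ == "__main__":
    print(run_pipeline(MY_PIPELINE, {"my-keyword-field": "FOO"}))
```

Processors run in the order they are listed, so each one sees the document as left by the previous one.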

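Testing a pipeline

Use the simulate API to run sample documents through a stored pipeline (first request), or through a pipeline defined inline in the request body (second request).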

POST _ingest/pipeline/my-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "my-keyword-field": "FOO"
      }
    },
    {
      "_source": {
        "my-keyword-field": "BAR"
      }
    }
  ]
}

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "lowercase": {
          "field": "my-keyword-field"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "my-keyword-field": "FOO"
      }
    },
    {
      "_source": {
        "my-keyword-field": "BAR"
      }
    }
  ]
}
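The two simulate requests differ only in the path and in whether the body carries a pipeline definition. The following sketch builds either form in Python so the difference is easy to see. The function name build_simulate_request is hypothetical, and the sketch only constructs the method, path, and body; it does not send anything to a cluster.

```python
import json

def build_simulate_request(docs, pipeline_id=None, pipeline=None):
    """Return (method, path, body) for a simulate call.

    Pass pipeline_id to test a stored pipeline, or pipeline to test
    an inline definition. Exactly one of the two must be given.
    """
    if (pipeline_id is None) == (pipeline is None):
        raise ValueError("pass exactly one of pipeline_id or pipeline")

    # Each sample document is wrapped in a `_source` object.
    body = {"docs": [{"_source": d} for d in docs]}

    if pipeline_id is not None:
        path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    else:
        path = "/_ingest/pipeline/_simulate"
        body = {"pipeline": pipeline, **body}
    return "POST", path, body

if __name__ == "__main__":
    samples = [{"my-keyword-field": "FOO"}, {"my-keyword-field": "BAR"}]
    inline = {"processors": [{"lowercase": {"field": "my-keyword-field"}}]}
    for req in (
        build_simulate_request(samples, pipeline_id="my-pipeline"),
        build_simulate_request(samples, pipeline=inline),
    ):
        method, path, body = req
        print(method, path)
        print(json.dumps(body, indent=2))
```

The inline form is convenient while a pipeline is still being designed, since nothing has to be stored in the cluster first.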

Using a pipeline

Pass the pipeline query parameter on the request (request bodies omitted here):

POST my-data-stream/_doc?pipeline=my-pipeline
PUT my-data-stream/_bulk?pipeline=my-pipeline
POST my-data-stream/_update_by_query?pipeline=my-pipeline

With reindex, set pipeline in dest:

POST _reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "dest": {
    "index": "my-new-data-stream",
    "op_type": "create",
    "pipeline": "my-pipeline"
  }
}

In the index settings or index template settings, set index.default_pipeline. It applies when an indexing request does not specify a pipeline parameter.

"settings": {
  "index": {
    "default_pipeline": "sw_segment_pipeline"
  }
}

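In the index settings or index template settings, you can also set index.final_pipeline. Unlike default_pipeline, it does not depend on whether another pipeline is configured: it always runs, after the request pipeline or the default pipeline if either applies.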
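The order in which pipelines apply to an indexing request can be sketched as a small Python function. This is an illustrative model, not Elasticsearch code: resolve_pipelines and my-final-pipeline are hypothetical names. It assumes the documented behavior that a pipeline request parameter takes precedence over index.default_pipeline, that the special value _none disables a pipeline, and that index.final_pipeline runs last regardless of the request parameter.

```python
def resolve_pipelines(request_pipeline=None, index_settings=None):
    """Return the pipeline names that would run, in order.

    request_pipeline: value of the `pipeline` query parameter, if any.
    index_settings:   dict that may contain `default_pipeline` and
                      `final_pipeline` (the index.* settings).
    """
    settings = index_settings or {}

    # The request parameter wins over index.default_pipeline.
    if request_pipeline is not None:
        first = request_pipeline
    else:
        first = settings.get("default_pipeline")

    chain = []
    # `_none` means "run no pipeline at this stage".
    if first and first != "_none":
        chain.append(first)

    # The final pipeline always runs last, whatever ran before it.
    final = settings.get("final_pipeline")
    if final and final != "_none":
        chain.append(final)
    return chain

if __name__ == "__main__":
    settings = {
        "default_pipeline": "sw_segment_pipeline",
        "final_pipeline": "my-final-pipeline",  # hypothetical name
    }
    print(resolve_pipelines(None, settings))
    print(resolve_pipelines("my-pipeline", settings))
    print(resolve_pipelines("_none", settings))
```

In this model, passing pipeline=_none skips the default pipeline for that request but still leaves the final pipeline in place.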