Elasticsearch in Practice: Complex SQL-Style Queries and Changing a Field Type


香吧香Blog, 2023-07-23 00:10:52


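© Copyright belongs to the author. This is an original work by 香吧香Blog on the 51CTO blog; contact the author for permission before reprinting, otherwise legal liability will be pursued.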

1. Querying an index's mapping and settings


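A GET issued directly against the index name, for example GET terra-syslog_2023-07-12, returns the aliases, mappings, and settings of that index. The response has the following structure: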


{
  "terra-syslog_2023-07-12" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "host" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "received_at" : {
          "type" : "date"
        },
        "received_from" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_facility" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_facility_code" : {
          "type" : "long"
        },
        "syslog_hostname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_program" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_severity" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "syslog_severity_code" : {
          "type" : "long"
        },
        "syslog_timestamp" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "tags" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "type" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "user" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1689137630855",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "Qew4uoNUQ9q8-JQDPTWVPw",
        "version" : {
          "created" : "7080199"
        },
        "provided_name" : "terra-syslog_2023-07-12"
      }
    }
  }
}
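To read a response like this from a script, the field types can be pulled out of mappings.properties. The following is a minimal Python sketch: RESPONSE is a hand-trimmed copy of the output above (four fields only), and summarize_mapping is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json

# Hand-trimmed copy of the GET response above (only four of the fields).
RESPONSE = json.loads("""
{
  "terra-syslog_2023-07-12": {
    "aliases": {},
    "mappings": {
      "properties": {
        "@timestamp": {"type": "date"},
        "syslog_severity_code": {"type": "long"},
        "syslog_timestamp": {
          "type": "text",
          "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
        },
        "user": {
          "type": "text",
          "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
        }
      }
    },
    "settings": {
      "index": {"number_of_shards": "1", "number_of_replicas": "1"}
    }
  }
}
""")


def summarize_mapping(response):
    """Return {index: {field: (type, [sub-field names])}} for a GET <index> response."""
    summary = {}
    for index_name, body in response.items():
        props = body["mappings"].get("properties", {})
        summary[index_name] = {
            field: (spec.get("type", "object"), sorted(spec.get("fields", {})))
            for field, spec in props.items()
        }
    return summary


if __name__ == "__main__":
    for index_name, fields in summarize_mapping(RESPONSE).items():
        for field, (ftype, subfields) in sorted(fields.items()):
            print(index_name, field, ftype, subfields)
```

A listing like this makes it easy to spot that syslog_timestamp is currently text with a keyword sub-field, which is the field section 3 changes.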


2. Running a query with complex conditions


The DSL is:

GET terra-syslog_2023-07-15/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "syslog_program.keyword": {
              "wildcard": "*SSH_USER_LOGIN*",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "aggregations": {
    "time_agg": {
      "date_histogram": {
        "field": "received_at",
        "format": "EEE",
        "fixed_interval": "1d",
        "offset": 0,
        "order": {
          "_key": "asc"
        },
        "keyed": false,
        "min_doc_count": 0
      },
      "aggregations": {
        "user_agg": {
          "terms": {
            "field": "user.keyword",
            "size": 10,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [
              {
                "_count": "desc"
              },
              {
                "_key": "asc"
              }
            ]
          }
        }
      }
    }
  }
}

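This DSL does the following:

"size": 0: sets the number of returned hits to 0, so only the aggregation results come back, not the matching documents.
query: a bool query whose must clause contains a single wildcard query.

The wildcard query matches documents whose syslog_program.keyword value contains SSH_USER_LOGIN (pattern *SSH_USER_LOGIN*). A pattern that starts with * has to scan many terms, so it can be slow on large indices.
"boost": 1 and "adjust_pure_negative": true are the default values and do not change the result.

aggregations: a date_histogram named time_agg on the received_at field, with a terms sub-aggregation named user_agg inside each bucket.

date_histogram groups the matching documents into one bucket per day ("fixed_interval": "1d"). "format": "EEE" renders each bucket's key_as_string as the abbreviated day of the week (for example Sat), and the buckets are sorted by key in ascending order.
"min_doc_count": 0 on the date_histogram means that days without any matching document between the first and last bucket are returned as empty buckets.
user_agg groups the documents of each day by user.keyword and returns at most 10 users ("size": 10).
"min_doc_count": 1 on the terms aggregation returns only users that have at least one document.
"shard_min_doc_count": 0 places no minimum document count on individual shards.
"show_term_doc_count_error": false leaves the per-term document count error out of the response.
order sorts the users first by document count (_count) in descending order, then by the user.keyword value (_key) in ascending order.

In short, the query counts SSH user login events per day and, for each day, lists the 10 users with the most logins.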
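To make the semantics of the DSL above concrete (wildcard filter on syslog_program.keyword, one bucket per day on received_at, top users per bucket), here is a rough pure-Python emulation. It runs on a handful of made-up documents rather than against Elasticsearch; DOCS and login_counts_per_day are hypothetical names, and the emulation does not fill in empty days the way "min_doc_count": 0 does.

```python
from collections import Counter, defaultdict
from datetime import datetime
from fnmatch import fnmatchcase

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical sample documents; only the three fields the DSL touches.
DOCS = [
    {"syslog_program": "SSH_USER_LOGIN", "received_at": "2023-07-15T01:10:00", "user": "root"},
    {"syslog_program": "SSH_USER_LOGIN", "received_at": "2023-07-15T09:30:00", "user": "admin"},
    {"syslog_program": "SSH_USER_LOGIN", "received_at": "2023-07-15T11:00:00", "user": "root"},
    {"syslog_program": "SSH_USER_LOGOUT", "received_at": "2023-07-15T12:00:00", "user": "root"},
    {"syslog_program": "X_SSH_USER_LOGIN_FAIL", "received_at": "2023-07-16T03:00:00", "user": "bob"},
]


def login_counts_per_day(docs, pattern="*SSH_USER_LOGIN*", top_n=10):
    """Mimic the query: wildcard filter, one bucket per day, top users per bucket."""
    per_day = defaultdict(Counter)
    for doc in docs:
        if not fnmatchcase(doc["syslog_program"], pattern):
            continue  # corresponds to the wildcard clause in bool.must
        day = datetime.fromisoformat(doc["received_at"]).date()  # fixed_interval: 1d
        per_day[day][doc["user"]] += 1
    result = []
    for day in sorted(per_day):  # date_histogram order: _key asc
        # terms order: _count desc, then _key asc; size: top_n
        users = sorted(per_day[day].items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        result.append((DAY_NAMES[day.weekday()], users))  # format: EEE
    return result


if __name__ == "__main__":
    for day_name, users in login_counts_per_day(DOCS):
        print(day_name, users)
```

With the sample data, the Saturday bucket lists root before admin because root has more logins, and the logout event is filtered out by the wildcard pattern.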

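Also note that a request ending in _search executes the DSL, while a GET on the index name without _search returns the index's mapping (its field types) together with settings such as the number of shards and replicas.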
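As a small illustration of the two endpoints, the sketch below builds both request lines and a shorter body for the search. The shorter body leaves out parameters that, to my understanding, only restate Elasticsearch defaults (boost, adjust_pure_negative, offset, keyed, and the terms settings), and uses the documented value key for the wildcard pattern; treat that as an assumption to verify on your version. request_line is a hypothetical helper.

```python
import json


def request_line(index, search=False):
    """Build the console request line: _search runs a query, the bare index returns metadata."""
    return f"GET {index}/_search" if search else f"GET {index}"


# Same query as above with default-valued parameters left out (assumption:
# the omitted values match the defaults of the Elasticsearch version in use).
SEARCH_BODY = {
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {"wildcard": {"syslog_program.keyword": {"value": "*SSH_USER_LOGIN*"}}}
            ]
        }
    },
    "aggs": {
        "time_agg": {
            "date_histogram": {
                "field": "received_at",
                "format": "EEE",
                "fixed_interval": "1d",
                "min_doc_count": 0,
            },
            "aggs": {"user_agg": {"terms": {"field": "user.keyword", "size": 10}}},
        }
    },
}

if __name__ == "__main__":
    print(request_line("terra-syslog_2023-07-15", search=True))
    print(json.dumps(SEARCH_BODY, indent=2))
    print(request_line("terra-syslog_2023-07-15"))
```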

3. Changing a field type in the index mapping

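To change the syslog_timestamp field in the mapping above from text to date, the mapping definition has to be updated and the index recreated, because Elasticsearch does not allow the type of an existing field to be changed in place.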

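Delete the existing index, or create a new index under a different name.
Update the mapping definition so that syslog_timestamp has the type "date". The updated mapping looks like this: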

{
  "mappings": {
    "properties": {
      // other fields...
      "syslog_timestamp": {
        "type": "date"
      }
    }
  }
}

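Create the index with the modified mapping. The put-mapping API cannot change the type of an existing field, so an index that already exists has to be deleted first, or its data copied into a new index. The request can be sent through the Elasticsearch REST API or a tool such as Kibana Console: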

PUT terra-syslog_2023-07-15
{
  "mappings": {
    "properties": {
      // other fields...
      "syslog_timestamp": {
        "type": "date"
      }
    }
  }
}
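The index in section 1 was created on 7.8.1 ("created": "7080199"). On 7.x, a create-index body that nests properties under a type name such as _doc is rejected unless include_type_name=true is passed, and 8.x drops that parameter altogether. If an older, typed mapping has to be reused, something like the following Python sketch can strip the type level. to_typeless is a hypothetical helper, the key list is not exhaustive, and it assumes the mapping contains exactly one type.

```python
import copy

# Top-level mapping keys that are mapping options rather than type names
# (not an exhaustive list).
MAPPING_KEYS = {
    "properties", "dynamic", "dynamic_templates",
    "_source", "_routing", "_meta", "runtime",
}


def to_typeless(body):
    """Return a copy of a create-index body with a single mapping type level removed."""
    body = copy.deepcopy(body)
    mappings = body.get("mappings", {})
    type_names = [k for k in mappings if k not in MAPPING_KEYS]
    if len(type_names) == 1 and isinstance(mappings[type_names[0]], dict):
        body["mappings"] = mappings[type_names[0]]
    return body


typed = {"mappings": {"_doc": {"properties": {"syslog_timestamp": {"type": "date"}}}}}
print(to_typeless(typed))
```

A body that is already typeless passes through unchanged, so the helper can be applied to both kinds of input.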

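With this mapping, syslog_timestamp is stored and indexed as a date and can be used in date range queries and date_histogram aggregations. By default a date field accepts only strict_date_optional_time (ISO 8601) strings or epoch milliseconds; values in any other layout, such as the classic syslog form "Jul 15 10:00:00", are rejected at index time unless a matching "format" is declared in the mapping.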
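Because deleting an index also deletes its documents, a common alternative is to create a new index with the corrected mapping and copy the data across with the _reindex API. The sketch below only assembles the requests as (method, path, body) tuples and sends nothing. The target index name, the helper name reindex_plan, and the date format string are assumptions made for illustration; the format has to match what syslog_timestamp actually contains.

```python
import json


def reindex_plan(source, dest, field, date_format=None):
    """Return the ordered requests for copying `source` into `dest` with `field` mapped as date."""
    field_mapping = {"type": "date"}
    if date_format:
        # Needed when the values are not ISO 8601 strings or epoch milliseconds.
        field_mapping["format"] = date_format
    return [
        ("PUT", f"/{dest}", {"mappings": {"properties": {field: field_mapping}}}),
        ("POST", "/_reindex", {"source": {"index": source}, "dest": {"index": dest}}),
    ]


if __name__ == "__main__":
    plan = reindex_plan(
        "terra-syslog_2023-07-15",
        "terra-syslog_2023-07-15_v2",       # hypothetical target index name
        "syslog_timestamp",
        date_format="yyyy-MM-dd HH:mm:ss",  # hypothetical; adjust to the real data
    )
    for method, path, body in plan:
        print(method, path)
        print(json.dumps(body, indent=2))
```

Documents whose syslog_timestamp value does not parse with the declared format will fail during the reindex, so it is worth trying the plan on a small index first.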