Logstash

Purpose: (diagram in the original post)

Principle: (diagram in the original post)

Configuration in detail

A Logstash configuration has three sections, as shown below:

input {          # input
  stdin { … }    # standard input
}

filter {         # filter: split, extract, and otherwise process the data
}

output {         # output
  stdout { … }   # standard output
}

Input

  • Collects data of all shapes, sizes, and sources. Data often exists in many different forms, scattered or concentrated across many systems.
  • Logstash supports a wide range of inputs and can capture events from many common sources at the same time. It can continuously ingest data in a streaming fashion from your logs, metrics, web applications, data stores, and various AWS services.
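As a sketch of what an input section can look like, a pipeline might tail a file and listen for Beats events at the same time; the path and port below are only illustrative placeholders:

```
input {
  # tail a log file from the beginning (hypothetical path)
  file {
    path => "/var/log/myapp/*.log"
    start_position => "beginning"
  }
  # also accept events shipped by Filebeat (conventional Beats port)
  beats {
    port => "5044"
  }
}
```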


Filter

  • Parses and transforms data in real time.
  • As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for easier, faster analysis and business value.
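As an example of such parsing, a filter section could use grok to split a timestamped line into named fields and then take the parsed timestamp as the event time. The field names here are only illustrative:

```
filter {
  # extract timestamp, level, and message from lines like
  # "2019-11-20 16:22:22 ERROR something happened"
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  # use the parsed timestamp as the event's @timestamp
  date {
    match => ["ts", "yyyy-MM-dd HH:mm:ss"]
  }
}
```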


Output

Logstash offers many output choices: you can send data wherever you need it, and flexibly unlock a wide range of downstream use cases.
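An output section can also write to more than one destination at once; the host below is a placeholder:

```
output {
  # index events into Elasticsearch (placeholder host)
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  # and also print each event to the console for debugging
  stdout {
    codec => rubydebug
  }
}
```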


Reading custom logs

Earlier we read the nginx logs with Filebeat. Logs with a custom structure must be parsed before they can be used, and this is where Logstash comes in: its powerful processing capabilities can handle all kinds of scenarios.

Log structure

2019-11-20 16:22:22|ERROR|读取数据出错|参数:id=1002

As you can see, the fields in the log are separated by "|", so we also need to split the data when processing it.

# write a log line to the file
echo "2019-11-20 16:22:22|ERROR|读取数据出错|参数:id=1002" >> app.log

# resulting event
{
    "@timestamp" => 2019-06-15T08:44:04.749Z,
          "path" => "/itcast/logstash/logs/app.log",
      "@version" => "1",
          "host" => "node01",
       "message" => [
        [0] "2019-06-16 16:16:16",
        [1] "ERROR",
        [2] "读取数据出错",
        [3] "参数:id=1002"
    ]
}

As you can see, the data has been split.

Output to Elasticsearch

input {
  file {
    path => "/itcast/logstash/logs/app.log"
    #type => "system"
    start_position => "beginning"
  }
}

filter {
  mutate {
    split => { "message" => "|" }
  }
}

output {
  elasticsearch {
    hosts => ["192.168.210.xx:9200","192.168.210.xx:9200","192.168.210.xx:9200"]
  }
}

Startup

./bin/logstash -f ./itcast-pipline.conf

# write test data
echo "2019-11-20 16:16:16|ERROR|读取数据报错|参数: id=1003" >> app.log

Test: check Elasticsearch to confirm the event was indexed.
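If nothing shows up in Elasticsearch, it can help to temporarily add a stdout output so each processed event is printed to the console. This is only a debugging aid, not part of the final pipeline:

```
output {
  stdout {
    codec => rubydebug   # pretty-print each event for inspection
  }
}
```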

Collecting multiple log files and keeping them isolated from each other

input {
    file {
       type => "log1"
       path => "/xxx/xxx/*.log"
       discover_interval => 10
       start_position => "beginning"
    }
    file {
      type => "log2"
      path => "/xxx/xxx/*.log"
      discover_interval => 10
      start_position => "beginning"
    }
    file {
      type => "log3"
      path => "/xxx/xxx/*.log"
      discover_interval => 10
      start_position => "beginning"
    }
    #beats {
    #    port => "5045"
    #}
}


filter {
    if [type] == "log1" {
        mutate {
           split => {"message" => "|"}  # split the log on "|"
        }
        mutate {
            add_field => {
               "x1" => "%{[message][0]}"
               "x2" => "%{[message][1]}"
               "x3" => "%{[message][2]}"
            }

        }
        mutate {
            convert => {
               "x1" => "string"
               "x2" => "string"
               "x3" => "string"
            }
        }

        json {
            source => "xxx"
            target => "xxx"
        }
        mutate {
           remove_field => ["xxx","xxx","xxx","xxx"]  # remove unneeded fields
        }
    }
    else if [type] == "log2" {
        mutate {
           split => {"message" => "|"}
        }

        mutate {
            add_field => {
               "x1" => "%{[message][0]}"
               "x2" => "%{[message][1]}"
               "x3" => "%{[message][2]}"

            }
        }
        mutate {
           convert => {
               "x1" => "string"
               "x2" => "string"
               "x3" => "string"
           }
        }
        json {
            source => "xxx"
            target => "xxx"
        }
        mutate {
           remove_field => ["xxx","xxx","xxx","xxx"]
        }
    }
}
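The json filter above uses "xxx" placeholders for its field names. As a concrete sketch, if a field named payload held a JSON string, it could be parsed into a sub-object like this (both field names are hypothetical):

```
filter {
  json {
    source => "payload"   # hypothetical field containing a JSON string
    target => "parsed"    # parsed keys are stored under [parsed]
  }
}
```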

output {
   if [type] == "log1" {
       elasticsearch {
           hosts => ["192.168.210.40:9200","192.168.210.44:9200","192.168.210.45:9200"]
           index => "log1-%{+YYYY-MM-dd}"
      }
   }
   else if [type] == "log2" {
       elasticsearch {
           hosts => ["192.168.210.40:9200","192.168.210.44:9200","192.168.210.45:9200"]
           index => "log2-%{+YYYY-MM-dd}"
      }
   }
}
#output {
#   stdout {codec => rubydebug}
#}

Configuring start on boot with systemd

[Unit]
Description=logstash daemon
After=syslog.target  network.target
Wants=network.target

[Service]
Type=simple
WorkingDirectory=/root/
ExecStart=/root/logstash-7.2.0/bin/logstash -f /root/logstash-7.2.0/logyml/test_json.yml
Restart=always
RestartSec=1min
User=root

[Install]
WantedBy=multi-user.target

Startup may fail with this error:

could not find java; set JAVA_HOME or ensure java is in PATH

When Logstash is started as a service, add JAVA_HOME to its startup environment:

vi /etc/sysconfig/logstash

Add the JAVA_HOME path.
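For example, /etc/sysconfig/logstash could contain a line like the following; the JDK path is only an example, so point it at your actual installation. Note that with a custom unit file like the one above, this file is only read if the [Service] section also contains EnvironmentFile=-/etc/sysconfig/logstash; alternatively, set the variable directly in the unit with an Environment= line.

```
# /etc/sysconfig/logstash  (JAVA_HOME path below is an example only)
JAVA_HOME=/usr/local/jdk1.8.0
```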