Logstash
Purpose: collect data from a variety of sources, transform it, and ship it to a destination such as Elasticsearch.
Principle: events flow through a pipeline of input -> filter -> output plugins.
Configuration details
A Logstash configuration consists of three sections, as follows:
input {        # input
  stdin { … }  # standard input
}
filter {       # filter: split, extract, and otherwise transform the data
  …
}
output {       # output
  stdout { … } # standard output
}
Input
- Ingests data of all shapes, sizes, and sources; data often exists in many different forms, scattered or centralized across many systems.
- Logstash supports a wide variety of inputs and can capture events from many common sources at the same time. It can continuously stream data from your logs, metrics, web applications, data stores, and various AWS services.
Filter
- Parses and transforms data in real time.
- As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for easier, faster analysis and business value.
Output
Logstash offers many output options, so you can route data wherever you choose, with the flexibility to unlock a wide range of downstream use cases.
Reading custom logs
Earlier we read the nginx logs with Filebeat. Logs with a custom structure must be parsed before they can be used, and that is where Logstash comes in: its powerful processing capabilities can handle all kinds of scenarios.
Log structure
As you can see, the fields in the log line are separated by "|", so when processing we also need to split the data accordingly.
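The same "|" split that the mutate filter performs below can be sketched in the shell; the field names (ts, level, msg, params) are illustrative:

```shell
#!/usr/bin/env bash
line="2019-11-20 16:22:22|ERROR|failed to read data|params: id=1002"

# Split on "|" into positional fields, just as mutate/split produces message[0..3]
IFS='|' read -r ts level msg params <<< "$line"

echo "$ts"      # 2019-11-20 16:22:22
echo "$level"   # ERROR
```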
#append a log line to the file
echo "2019-11-20 16:22:22|ERROR|failed to read data|params: id=1002" >> app.log
#resulting event
{
    "@timestamp" => 2019-11-20T08:22:25.749Z,
          "path" => "/itcast/logstash/logs/app.log",
      "@version" => "1",
          "host" => "node01",
       "message" => [
        [0] "2019-11-20 16:22:22",
        [1] "ERROR",
        [2] "failed to read data",
        [3] "params: id=1002"
    ]
}
As you can see, the data has been split into fields.
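The rubydebug output above can be reproduced with a pipeline along these lines (a sketch; the path matches the example above, and stdout is used instead of Elasticsearch so the split result is visible in the console):

```
input {
  file {
    path => "/itcast/logstash/logs/app.log"
    start_position => "beginning"
  }
}
filter {
  mutate {
    split => { "message" => "|" }
  }
}
output {
  stdout { codec => rubydebug }
}
```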
Output to Elasticsearch
input {
  file {
    path => "/itcast/logstash/logs/app.log"
    #type => "system"
    start_position => "beginning"
  }
}
filter {
  mutate {
    split => { "message" => "|" }
  }
}
output {
  elasticsearch {
    hosts => ["192.168.210.xx:9200","192.168.210.xx:9200","192.168.210.xx:9200"]
  }
}
Startup
./bin/logstash -f ./itcast-pipline.conf
The -f flag points Logstash at the pipeline config file; appending --config.test_and_exit checks the syntax without starting the pipeline.
#write a test line
echo "2019-11-20 16:16:16|ERROR|failed to read data|params: id=1003" >> app.log
Test:
Collect two log files and route each to its own index, keeping them isolated from each other
input {
file {
type => "log1"
path => "/xxx/xxx/*.log"
discover_interval => 10
start_position => "beginning"
}
file {
type => "log2"
path => "/xxx/xxx/*.log"
discover_interval => 10
start_position => "beginning"
}
file {
type => "log3"
path => "/xxx/xxx/*.log"
discover_interval => 10
start_position => "beginning"
}
#beats{
# port => "5045"
# }
}
filter {
if [type] == "log1" {
mutate {
split => {"message" => "|"} # split the log line on "|"
}
mutate {
add_field => {
"x1" => "%{[message][0]}"
"x2" => "%{[message][1]}"
"x3" => "%{[message][2]}"
}
}
mutate {
convert => {
"x1" => "string"
"x2" => "string"
"x3" => "string"
}
}
json {
source => "xxx"
target => "xxx"
}
mutate {
remove_field => ["xxx","xxx","xxx","xxx"] # drop unneeded fields
}
}
else if [type] == "log2" {
mutate {
split => {"message" => "|"}
}
mutate {
add_field => {
"x1" => "%{[message][0]}"
"x2" => "%{[message][1]}"
"x3" => "%{[message][2]}"
}
}
mutate {
convert => {
"x1" => "string"
"x2" => "string"
"x3" => "string"
}
}
json {
source => "xxx"
target => "xxx"
}
mutate {
remove_field => ["xxx","xxx","xxx","xxx"]
}
}
}
output {
if [type] == "log1" {
elasticsearch {
hosts => ["192.168.210.40:9200","192.168.210.44:9200","192.168.210.45:9200"]
index => "log1-%{+YYYY-MM-dd}"
}
}
else if [type] == "log2" {
elasticsearch {
hosts => ["192.168.210.40:9200","192.168.210.44:9200","192.168.210.45:9200"]
index => "log2-%{+YYYY-MM-dd}"
}
}
}
#output {
# stdout {codec => rubydebug}
#}
Start on boot via systemd (systemctl)
[Unit]
Description=logstash daemon
After=syslog.target network.target
Wants=network.target
[Service]
Type=simple
WorkingDirectory=/root/
ExecStart=/root/logstash-7.2.0/bin/logstash -f /root/logstash-7.2.0/logyml/test_json.yml
Restart=always
RestartSec=1min
User=root
[Install]
WantedBy=multi-user.target
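After saving the unit file (the path /etc/systemd/system/logstash.service is an assumed, conventional location), reload systemd and enable the service:

```shell
systemctl daemon-reload        # re-read unit files
systemctl enable logstash      # start on boot
systemctl start logstash       # start now
systemctl status logstash      # check that it is running
```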
Startup may fail with the error:
could not find java; set JAVA_HOME or ensure java is in PATH
When Logstash is started as a service, set JAVA_HOME in its environment:
vi /etc/sysconfig/logstash
and add the JAVA_HOME path.
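A sketch of the environment file, assuming the JDK lives under /usr/local/jdk1.8.0_211 (adjust the path to your installation):

```
# /etc/sysconfig/logstash -- example value, adjust to your JDK install
JAVA_HOME=/usr/local/jdk1.8.0_211
```

Note that the hand-written unit above does not read this file automatically; add EnvironmentFile=-/etc/sysconfig/logstash (or an explicit Environment="JAVA_HOME=..." line) to its [Service] section, then run systemctl daemon-reload so systemd picks up the change.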