References:
Filebeat regular expression support: https://www.elastic.co/guide/en/beats/filebeat/8.2/regexp-support.html
Regex tester: https://www.lddgo.net/string/golangregex
Processors: https://www.elastic.co/guide/en/beats/filebeat/8.2/defining-processors.html
dissect tokenizer: https://www.elastic.co/guide/en/beats/filebeat/8.2/dissect.html
Collection architecture:
Filebeat --> Kafka <-- Logstash --> ES
Collection and field extraction: Filebeat
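Extracted field design:

Field         Purpose
mailid        Mail identifier. This is the internal identifier Postfix assigns to a specific message and uses to track it through processing; querying by it shows which recipients the sender's message was sent to.
senderip      IP address of the sending client
receiver      Recipient address
relay         Address of the downstream mail relay
relayip       Resolved IP address of the downstream mail relay
relayport     Port of the downstream mail relay
status        Delivery status of the message; see the appendix of this document for details
sender_mail   Sender address
nrcpt         Number of recipients
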

Filebeat version:
$ rpm -q filebeat
filebeat-8.2.3-1.x86_64
Collection examples:
Sender information, before processing:
Dec 30 20:19:36 zabbixtest160130 postfix/smtpd[9184]: F19F810062B4: client=unknown[10.1.16.88]
After processing:
{"@timestamp":"2023-12-30T12:07:42.854Z","@metadata":{"beat":"filebeat","type":"_doc","version":"8.2.3"},"log":{"file":{"path":"/var/log/maillog"},"offset":30736},"mailid":"3368910062B4","host":{"name":"zabbixtest160130"},"message":"Dec 30 20:07:37 zabbixtest160130 postfix/smtpd[8494]: 3368910062B4: client=unknown[10.1.16.88]","senderip":"10.1.16.88","env":"test","type":"smtp","host_ip":"10.1.160.130","idc":"gds"}

Recipient information, before processing:
Dec 30 20:19:36 zabbixtest160130 postfix/smtp[9188]: 3EAF110062B4: to=<yexinlei@xinye.com>, relay=mail-gateway.xinye.com[101.230.255.51]:25, delay=0.22, delays=0/0/0.04/0.17, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 52080DD5)
After processing:
{"@timestamp":"2023-12-30T12:07:42.854Z","@metadata":{"beat":"filebeat","type":"_doc","version":"8.2.3"},"host":{"name":"zabbixtest160130"},"log":{"offset":31490,"file":{"path":"/var/log/maillog"}},"type":"smtp","idc":"gds","relayip":"101.230.255.52","message":"Dec 30 20:07:37 zabbixtest160130 postfix/smtp[8499]: 3368910062B4: to=<yexinlei@xinye.com>, relay=mail-gateway.xinye.com[101.230.255.52]:25, delay=0.17, delays=0.01/0.01/0.02/0.12, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 43700DCD)","mailid":"3368910062B4","receiver":"yexinlei@xinye.com","port":"25","host_ip":"10.1.160.130","status":"sent","env":"test","relay":"mail-gateway.xinye.com"}
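A dissect tokenizer can be sanity-checked against sample log lines without starting Filebeat. The Python sketch below converts a tokenizer into a regular expression and applies it to the recipient log line shown above, using the to= tokenizer from the Filebeat configuration in this document. It is an approximation, not Filebeat's implementation: the helper names dissect_to_regex and dissect are hypothetical, only plain %{key} fields are handled (no dissect modifiers), and non-greedy regex matching may differ from dissect in edge cases.

```python
import re


def dissect_to_regex(tokenizer):
    """Build a regex that approximates a dissect tokenizer (plain keys only)."""
    # Splitting on a capture group alternates literal text and key names:
    # [literal, key, literal, key, ..., literal]
    parts = re.split(r"%\{(\w+)\}", tokenizer)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)          # delimiter text, matched literally
        else:
            pattern += "(?P<%s>.*?)" % part     # field: shortest text up to next delimiter
    return re.compile(pattern)


def dissect(tokenizer, line):
    """Return the extracted fields as a dict, or None if the line does not fit."""
    match = dissect_to_regex(tokenizer).fullmatch(line)
    return match.groupdict() if match else None


# Tokenizer copied from the "to=" branch of the processors section.
TO_TOKENIZER = ("%{date_hostname} postfix/%{process}[%{processid}]: %{mailid}: "
                "to=<%{receiver}>, relay=%{relay}[%{relayip}]:%{relayport}, "
                "%{sendinfo1}, status=%{status} %{sendinfo2}")

# Sample recipient line from this document.
TO_LINE = ("Dec 30 20:19:36 zabbixtest160130 postfix/smtp[9188]: 3EAF110062B4: "
           "to=<yexinlei@xinye.com>, relay=mail-gateway.xinye.com[101.230.255.51]:25, "
           "delay=0.22, delays=0/0/0.04/0.17, dsn=2.0.0, "
           "status=sent (250 2.0.0 Ok: queued as 52080DD5)")

fields = dissect(TO_TOKENIZER, TO_LINE)
for name in ("mailid", "receiver", "relay", "relayip", "relayport", "status"):
    print(name, "=", fields[name])
```

The same helper can be pointed at the client= and from= tokenizers to confirm that each one fits its own log line before the configuration is deployed.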
Filebeat configuration:
max_procs: 1
queue.mem.events: 2048
queue.mem.flush.min_events: 1536
queue.mem.flush.timeout: 1s
filebeat.inputs:
- type: filestream
  enabled: true
  message_max_bytes: 20480
  id: smtp-filestream-id
  paths:
    - /var/log/maillog
  fields:
    host_ip: 10.1.160.130    ## IP address of this host
    idc: gds                 ## data center code
    env: test                ## environment
    type: smtp               ## log type
  fields_under_root: true


processors:
  - if:
      regexp:
        message: "(.*to=.*)"
    then:
      - drop_fields:
          fields: ["agent", "ecs", "input"]
      - dissect:
          tokenizer: "%{date_hostname} postfix/%{process}[%{processid}]: %{mailid}: to=<%{receiver}>, relay=%{relay}[%{relayip}]:%{relayport}, %{sendinfo1}, status=%{status} %{sendinfo2}"
          field: "message"
          target_prefix: ""
      - drop_fields:
          fields: ["date_hostname", "process", "processid", "sendinfo1", "sendinfo2"]
    else:
      - if:
          regexp:
            message: "(.*client=unknown.*)"
        then:
          - drop_fields:
              fields: ["agent", "ecs", "input"]
          - dissect:
              tokenizer: "%{date_hostname} postfix/%{process}[%{processid}]: %{mailid}: client=unknown[%{senderip}]"
              field: "message"
              target_prefix: ""
          - drop_fields:
              fields: ["date_hostname", "process", "processid"]
        else:
          - if:
              regexp:
                message: "(.*from=.*)"
            then:
              - drop_fields:
                  fields: ["agent", "ecs", "input"]
              - dissect:
                  tokenizer: "%{date_hostname} postfix/%{process}[%{processid}]: %{mailid}: from=<%{sender_mail}>, %{sendinfo1}, nrcpt=%{nrcpt} %{sendinfo2}"
                  field: "message"
                  target_prefix: ""
              - drop_fields:
                  fields: ["date_hostname", "process", "processid", "sendinfo1", "sendinfo2"]
            else:
              - drop_fields:
                  fields: ["agent", "ecs", "input"]


output.kafka:
  enabled: true
  hosts: ["gdssyslogkfk01.xxx.com:9092", "gdssyslogkfk02.xxx.com:9092", "gdssyslogkfk03.xxx.com:9092"]
  topic: filelog
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000

#output.file:
#  path: "/tmp/filebeat"
#  filename: filebeat
Configuration check:
$ filebeat test config -c /etc/filebeat/filebeat.yml
Config OK
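
The nested if/else chain in the processors section tests its conditions in a fixed order: to=, then client=unknown, then from=. The Python sketch below mirrors that order so a log line can be checked offline to see which branch it would take. It assumes the regexp condition behaves as an unanchored match and that these simple patterns behave the same in Python's re module as in Go's regexp package. The function name pick_branch and the from= sample line are hypothetical and only for illustration; the other two sample lines come from this document.

```python
import re

# Conditions in the order the if/else chain evaluates them.
BRANCHES = [
    ("to", re.compile(r"(.*to=.*)")),
    ("client", re.compile(r"(.*client=unknown.*)")),
    ("from", re.compile(r"(.*from=.*)")),
]


def pick_branch(message):
    """Return the name of the first branch whose condition matches the message."""
    for name, pattern in BRANCHES:
        if pattern.search(message):
            return name
    return "other"   # final else: only agent/ecs/input are dropped


CLIENT_LINE = ("Dec 30 20:19:36 zabbixtest160130 postfix/smtpd[9184]: "
               "F19F810062B4: client=unknown[10.1.16.88]")
TO_LINE = ("Dec 30 20:19:36 zabbixtest160130 postfix/smtp[9188]: 3EAF110062B4: "
           "to=<yexinlei@xinye.com>, relay=mail-gateway.xinye.com[101.230.255.51]:25, "
           "delay=0.22, delays=0/0/0.04/0.17, dsn=2.0.0, "
           "status=sent (250 2.0.0 Ok: queued as 52080DD5)")
# Hypothetical from= line, written for illustration only.
FROM_LINE = ("Dec 30 20:19:36 zabbixtest160130 postfix/qmgr[9100]: 3EAF110062B4: "
             "from=<sender@example.com>, size=1024, nrcpt=1 (queue active)")

for line in (CLIENT_LINE, TO_LINE, FROM_LINE, "unrelated message"):
    print(pick_branch(line))
```

Because each condition is effectively a substring test, any message that contains to= anywhere takes the first branch, even if it also contains from=, so the order of the branches matters.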


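Appendix 1: SMTP delivery status values

In the delivery log lines written by the Postfix mail server, the status field records the outcome of a delivery attempt. The values commonly seen and their general meaning are:

  1. status=sent (250 X.X.X Ok)
  • 250 is the most common SMTP success reply. It means the receiving server accepted the message and placed it in its mail queue (or, in some setups, delivered it directly to the user). X.X.X is the DSN (Delivery Status Notification) status code; "2.0.0" normally denotes success.
  2. status=deferred (4.X.X)
  • Codes starting with 4 indicate a temporary failure, such as a network problem or a mail server that is temporarily unreachable. The message stays in the queue and delivery is retried later.
  3. status=bounced (550 X.X.X User unknown)
  • Codes starting with 5 indicate a permanent failure, such as a nonexistent recipient mailbox (unknown user), an invalid address, a rejected message, or an exceeded storage quota. The message is not delivered, and a non-delivery notification is normally sent to the sender.
  4. status=expired
  • The message stayed in the queue past the configured maximum lifetime without being delivered and is returned to the sender.
  5. Related terms that are not status values:
  • "queued as ..." is part of the remote server's reply text, and "queue active" appears on the from= line when a message enters the active queue. Both describe queue state, not the result of a delivery attempt.

The code and text in parentheses after each status give more detailed information. In the sample log in this document, status=sent (250 2.0.0 Ok: queued as 52080DD5) means the remote mail server accepted the message and assigned it a new queue ID on the remote side for subsequent delivery.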
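
The relationship between the status value and the first digit of the DSN code can be sketched in Python as follows. This is a minimal illustration that assumes the dsn=..., status=... layout of the sample log line in this document; the names STATUS_RE, DSN_CLASSES, and classify are hypothetical.

```python
import re

# Matches "dsn=2.0.0, status=sent (reply text)"; the reply text is optional.
STATUS_RE = re.compile(
    r"dsn=(?P<dsn>\d\.\d+\.\d+), status=(?P<status>\w+)(?: \((?P<reply>.*)\))?"
)

# The first digit of the DSN code gives the class of the result.
DSN_CLASSES = {"2": "success", "4": "temporary failure", "5": "permanent failure"}


def classify(message):
    """Extract status, DSN code, result class, and reply text from a log line."""
    m = STATUS_RE.search(message)
    if m is None:
        return None
    return {
        "status": m["status"],
        "dsn": m["dsn"],
        "class": DSN_CLASSES.get(m["dsn"][0], "unknown"),
        "reply": m["reply"],
    }


# Sample recipient line from this document.
SAMPLE = ("Dec 30 20:19:36 zabbixtest160130 postfix/smtp[9188]: 3EAF110062B4: "
          "to=<yexinlei@xinye.com>, relay=mail-gateway.xinye.com[101.230.255.51]:25, "
          "delay=0.22, delays=0/0/0.04/0.17, dsn=2.0.0, "
          "status=sent (250 2.0.0 Ok: queued as 52080DD5)")

result = classify(SAMPLE)
print(result)
```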