Introduction

The database log collection and analysis system is composed of collection, filtering/cleansing, storage, and analysis modules. It collects slow-query logs, error logs, and runtime logs from databases such as MySQL, PostgreSQL, MongoDB, and Redis, cleans and filters the unstructured logs in real time, and stores them in ES (Elasticsearch), enabling real-time alerting, reporting, and optimization of slow SQL. It supports both self-hosted databases in IDC data centers and cloud-provider databases.

Architecture

(Figure: architecture diagram of the database log collection and analysis system)

How It Works

Log Collection

IDC Data Centers (Self-Hosted)

When a database instance is deployed, Filebeat is deployed alongside it, and a Filebeat configuration file is generated automatically from the instance's log settings, so log collection requires no manual work. The script below reads each MySQL configuration file, extracts the slow-query and error-log paths, and writes the corresponding Filebeat prospectors.

# -*- coding: utf-8 -*-
# Python 2 deployment script: generate filebeat.yml from the MySQL config files found on this host.
# r_list (MySQL config file paths), f_list, ip, loghost, FilebeatYml, FilebeatYmlBak,
# get_title() and UseStyle() are defined elsewhere in the deployment tooling.
import os
import traceback

if r_list:
    for r in r_list:
        print UseStyle('Configuring filebeat automatically for %s.........' % (r), back='black')
        mysql_instance = {}
        try:
            with open(r, 'r+') as rf:
                for f in rf.readlines():
                    # Slow-query log path; the port is encoded in the datadir path, e.g. .../mysql3306/...
                    if f.find("slow-query-log-file") > -1:
                        mysql_instance['slow_log_file'] = f.split("=")[1].strip()
                        mysql_instance['port'] = int(f.split("=")[1].strip().split("/")[-4].split("mysql")[1])
                    # Error log path
                    if f.find("log-error") > -1:
                        mysql_instance['error_log_file'] = f.split("=")[1].strip()
                mysql_instance['db_title'] = get_title(db_instance="%s:%s" % (ip, mysql_instance['port']))
                f_list.append(mysql_instance)
        except:
            print UseStyle("Failed to read mysql config %s: %s" % (r, str(traceback.format_exc())), back='red')
else:
    print UseStyle("No mysql config files found", back='green')

# De-duplicate the collected instances
f_list = [dict(t) for t in set([tuple(d.items()) for d in f_list])]
print f_list

# Back up any existing filebeat.yml, then write one prospector pair (slow log + error log) per instance
if os.path.isfile(FilebeatYml):
    os.rename(FilebeatYml, FilebeatYmlBak)
with open(FilebeatYml, "a+") as fd:
    fd.write("filebeat.prospectors:")
    for f in f_list:
        line = """
-
  input_type: log
  document_type: mysqlslowlog
  paths:
    - %s
  fields:
    db_title: %s
    port: %s
  fields_under_root: true
  multiline.pattern: "^# User@Host: "
  multiline.negate: true
  multiline.match: after
-
  input_type: log
  document_type: mysqlerrorlog
  paths:
    - %s
  fields:
    db_title: %s
    port: %s
  fields_under_root: true
  multiline.pattern: "^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}"
  multiline.negate: true
  multiline.match: after
""" % (f['slow_log_file'], f['db_title'], f['port'], f['error_log_file'], f['db_title'], f['port'])
        fd.write(line)
    endline = """
#-
#  input_type: log
#  document_type: syslog
#  paths:
#    - /var/log/messages
#  include_lines: ["err", "warn"]
output.logstash:
  # The Logstash hosts
  hosts: ["%s:5044"]
""" % (loghost)
    fd.write(endline)

Cloud Providers

For cloud databases, the relevant logs are pulled from the cloud provider's API on a schedule. To ensure the same logs are not pulled twice, a Redis lock is written each time before logs are fetched; a minimal sketch of this pattern is shown below.
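A minimal sketch of that locking step, assuming the redis Python client; pull_logs_from_cloud_api() and ship_to_logstash() are hypothetical placeholders for the provider API wrapper and the forwarding step, and the key name and expiry are illustrative, not the system's actual values:

import redis

r = redis.Redis(host="127.0.0.1", port=6379)  # assumed Redis address

def pull_slow_logs(instance_id, window_start):
    # SET ... NX EX: only the first worker to set the key for this instance/time
    # window pulls the logs; later attempts see the key and skip, so the same
    # window is never fetched twice. The expiry keeps a crashed worker from
    # holding the lock forever.
    lock_key = "loglock:%s:%s" % (instance_id, window_start)
    if not r.set(lock_key, "1", nx=True, ex=3600):
        return  # another worker already pulled this window
    logs = pull_logs_from_cloud_api(instance_id, window_start)  # hypothetical provider wrapper
    ship_to_logstash(logs)                                      # hypothetical: hand off to the same pipeline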

Log Filtering and Cleansing

Logstash Filtering

Logs from the different database types are shipped to Logstash, which filters them and drops entries that are not useful (such as SELECT SLEEP probes, as in the configuration below).

if [type] == "mysqlslowlog" {
    grok {
        match => { "message" => "SELECT SLEEP" }
        add_tag => [ "sleep_drop" ]
        tag_on_failure => [] # prevent default _grokparsefailure tag on real records
    }
    if "sleep_drop" in [tags] {
        drop {}
    }
    grok {
        match => { "message" => "(?m)^#\s+User@Host:\s+%{USER:user}\[[^\]]+\]\s+@\s+(?:(?<clienthost>\S*) )?\[(?:%{IPV4:clientip})?\]\s+Id:\s+%{NUMBER:row_id:int}\n#\s+Query_time:\s+%{NUMBER:query_time:float}\s+Lock_time:\s+%{NUMBER:lock_time:float}\s+Rows_sent:\s+%{NUMBER:rows_sent:int}\s+Rows_examined:\s+%{NUMBER:rows_examined:int}\n\s*(?:use %{DATA:database};\s*\n)?SET\s+timestamp=%{NUMBER:timestamp};\n\s*(?<sql>(?<action>\w+)\b.*;)\s*(?:\n#\s+Time)?.*$" }
    }
    date {
        match => [ "timestamp", "UNIX", "YYYY-MM-dd HH:mm:ss" ]
        target => "exec_time"
        add_field => [ "sql_finger", "%{sql}" ]
    }
    mutate {
        remove_field => [ "message", "beat", "@version", "source", "input_log" ]
    }
}

Cleansing

After filtering, the logs are posted over HTTP to a log gateway, which has the following main functions:

  • Cleansing

SQL statements are fingerprinted (normalized), replacing literal values with placeholders so that structurally identical queries can be grouped together; a sketch of this step follows the list below.

Original SQL

select max(exec_start_time) as exec_start_time from test_history where _id=32 and r_id=8487556

Fingerprinted SQL

select max(exec_start_time) as exec_start_time from _history where p_id=? and r_id=?

  • Data storage

After cleansing, the processed records are pushed into Kafka, and a Kafka backend module consumes them and completes their storage in ES.
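A minimal sketch of the gateway's cleansing and hand-off steps, assuming a simple regex-based fingerprint (a production fingerprinter handles many more cases, e.g. IN lists and quoting edge cases) and the kafka-python client; the broker address and topic name are illustrative:

import json
import re
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="kafka:9092",  # assumed broker address
                         value_serializer=lambda v: json.dumps(v).encode("utf-8"))

def fingerprint(sql):
    # Replace quoted strings and numeric literals with "?" so that
    # structurally identical queries collapse into one fingerprint.
    s = re.sub(r"'[^']*'", "?", sql)
    s = re.sub(r'"[^"]*"', "?", s)
    s = re.sub(r"\b\d+\b", "?", s)
    return s

def handle_record(record):
    # record is one parsed slow-log document received from Logstash over HTTP
    record["sql_finger"] = fingerprint(record["sql"])
    producer.send("mysql_slow_log", value=record)  # illustrative topic name

Applied to the example above, fingerprint() rewrites "r_id=8487556" into "r_id=?".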

Log Analysis

The collected logs currently power the following features.

Alerting

On a per-user basis, data that has reached the alert thresholds is pulled from ES on a schedule; the main alert items are execution time and error type, and alerts are reported by email. A sketch of the underlying query is shown below.
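A minimal sketch of such a threshold query, assuming the elasticsearch-py client; the index name, threshold, and time window are illustrative, while the field names (user, query_time, exec_time) mirror what the Logstash filter above produces:

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-host:9200"])  # assumed ES address

def slow_queries_over_threshold(user, threshold_seconds=5, minutes=10):
    # Pull slow-log documents for one user whose query_time exceeds the
    # alert threshold within the last scheduling window.
    body = {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user": user}},
                    {"range": {"query_time": {"gte": threshold_seconds}}},
                    {"range": {"exec_time": {"gte": "now-%dm" % minutes}}},
                ]
            }
        },
        "sort": [{"query_time": {"order": "desc"}}],
    }
    return es.search(index="mysql-slow-*", body=body)["hits"]["hits"]

The returned hits would then be formatted and mailed out; the emailing step is omitted here.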

Reporting

  • Per-department

Every day, the slow-query logs belonging to each department are pulled by account and sent out by email.

  • Report display

Relevant log data is pulled from ES and displayed as a report of daily SQL totals, slow-query counts, and slow-query time, with individual queries broken down by dimensions such as execution time, execution count, lock time, and rows examined. A sketch of the underlying aggregation is shown after the screenshot below.

(Figure: report screenshot of slow-query SQL analysis)
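A minimal sketch of the daily roll-up behind that report, again assuming the elasticsearch-py client; the index name and the assumption that db_title is mapped as a keyword field are illustrative:

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-host:9200"])  # assumed ES address

def daily_slow_query_report():
    # Per-database roll-up for yesterday: slow-query count, average query time,
    # total lock time, and rows examined, which the report page renders
    # alongside the overall daily SQL totals.
    body = {
        "size": 0,
        "query": {"range": {"exec_time": {"gte": "now-1d/d", "lt": "now/d"}}},
        "aggs": {
            "per_db": {
                "terms": {"field": "db_title", "size": 50},  # assumes a keyword mapping
                "aggs": {
                    "slow_count": {"value_count": {"field": "query_time"}},
                    "avg_query_time": {"avg": {"field": "query_time"}},
                    "total_lock_time": {"sum": {"field": "lock_time"}},
                    "rows_examined": {"sum": {"field": "rows_examined"}},
                },
            }
        },
    }
    return es.search(index="mysql-slow-*", body=body)["aggregations"]["per_db"]["buckets"]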