Status quo: the previous logging system ran on two ELK machines, handling 120-160 GB of logs per day, with machine load around 18% and a log-processing delay of roughly 30 minutes. After the rework, the same two machines run at about 10% load at peak, logs are processed with no backlog, and alerting is supported. The structure of the reworked system is described below (the original post included an architecture diagram and a diagram of the Graylog stream internals).
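The original diagrams are not reproduced here. As a rough sketch, based only on the components configured in the parts below, the data flows like this:

app / nginx logs
  -> logstash (file input)
  -> redis lists (two master/slave pairs acting as buffer queues)
  -> logstash (redis input, gelf output)
  -> graylog cluster (streams, alerting)
       -> elasticsearch (log storage and search)
       -> mongodb replica set (graylog configuration only)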

Package versions: redis 3.2.8, mongodb 3.4.4, logstash 5.2.2, graylog 2.3.2, elasticsearch 5.6.3. JDK 1.8.0_144 or later is required.

Part 1: redis installation

1. Hosts
Group A: 192.168.1.205 (master) / 192.168.1.204
Group B: 192.168.1.168 (master) / 192.168.1.167

2. The master/slave setup itself is omitted here; there are plenty of examples online. Just give redis a generous memory limit and disable the eviction policy (a minimal sketch appears after the elasticsearch configuration below).

Part 2: mongodb

1. Hosts
192.168.1.207 (3000)
192.168.1.206 (3000)
192.168.1.205 (3000)

2. Replica-set setup is straightforward (search online; a sketch also appears after the elasticsearch configuration below). Graylog only stores some configuration data in it, so do not give mongodb a large cache.

Part 3: elasticsearch installation (graylog-2.3.2 supports 5.6.x)

1. Machines are limited, so only two elasticsearch nodes are configured here; at least three are recommended:
192.168.1.205:9200
192.168.1.168:9200

Elasticsearch configuration on host 192.168.1.205 (for reference):

vim /application/elasticsearch/config/elasticsearch.yml

cluster.name: log
node.name: master-205
node.master: true
node.data: true
#node.tag: elk-master-205
path.conf: /application/elasticsearch
path.data: /data/elasticsearch/elasticsearch_data
#path.work: /data/elasticsearch/elasticsearch_tmp
path.logs: /data/elasticsearch/elasticsearch_log
node.max_local_storage_nodes: 1
#index.number_of_shards: 3
#index.number_of_replicas: 1

#bootstrap.mlockall: true
network.bind_host: 192.168.1.205
network.host: ['192.168.1.205', '192.168.1.168']
network.publish_host: 192.168.1.205
http.port: 9200

gateway.recover_after_nodes: 1
bootstrap.system_call_filter: false
gateway.recover_after_time: 10m
#gateway.expected_nodes: 2
#discovery.zen.minimum_master_nodes: 1
#discovery.zen.ping.timeout: 30s
discovery.zen.fd.ping_timeout: 30s
discovery.zen.fd.ping_interval: 60s
discovery.zen.fd.ping_retries: 6
discovery.zen.ping.unicast.hosts: ['192.168.1.205', '192.168.1.168']

#index.merge.scheduler.max_thread_count: 2
#index.translog.durability: async

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 1s
#index.search.slowlog.threshold.query.trace: 20ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
#index.search.slowlog.threshold.fetch.trace: 20ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 1s
#index.indexing.slowlog.threshold.index.trace: 50ms

#monitor.jvm.gc.young.warn: 800ms
#monitor.jvm.gc.young.info: 500ms
#monitor.jvm.gc.young.debug: 20ms

#monitor.jvm.gc.old.warn: 5s
#monitor.jvm.gc.old.info: 2s
#monitor.jvm.gc.old.debug: 1s

http.cors.enabled: true
http.cors.allow-origin: "*"
#http.cors.allow-headers: Authorization
http.max_content_length: 1024mb

#xpack.security.enabled: false
#xpack.monitoring.enabled: false
#xpack.graph.enabled: false
#xpack.watcher.enabled: false

thread_pool.bulk.size: 9
thread_pool.bulk.queue_size: 1000
thread_pool.index.size: 9
thread_pool.index.queue_size: 1000

vim /application/elasticsearch/bin/elasticsearch

Add the following (the heap must never exceed 32g):

ES_JAVA_OPTS="-Xms18g -Xmx18g"
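Parts 1 and 2 above leave the redis and mongodb setup to the reader, so here are two minimal sketches, not the exact configuration used in this deployment. For the redis buffer role, the relevant redis.conf lines on a master would look roughly like this (the 8gb limit is an assumption; size it to your hardware):

bind 192.168.1.205
port 6379
maxmemory 8gb
# never silently evict queued log entries; writers should block/fail instead
maxmemory-policy noeviction
# on the slave (192.168.1.204) add:
# slaveof 192.168.1.205 6379

And the mongodb replica set can be initialized once from the mongo shell, assuming each mongod was started with the same replSetName (the name graylog-rs is a placeholder):

// mongo --host 192.168.1.205 --port 3000
rs.initiate({
  _id: "graylog-rs",            // must match replication.replSetName in mongod.conf
  members: [
    { _id: 0, host: "192.168.1.205:3000" },
    { _id: 1, host: "192.168.1.206:3000" },
    { _id: 2, host: "192.168.1.207:3000" }
  ]
})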

Elasticsearch configuration on host 192.168.1.168 (for reference): identical to the 192.168.1.205 configuration above, except for the node identity and network settings:

node.name: slave-168
network.bind_host: 192.168.1.168
network.publish_host: 192.168.1.168

All other settings (paths, gateway, discovery, CORS, thread pools) are the same, and the same heap rule applies:

vim /application/elasticsearch/bin/elasticsearch

ES_JAVA_OPTS="-Xms18g -Xmx18g"
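Before moving on to Graylog, it is worth confirming that the two nodes actually formed one cluster. These are standard elasticsearch APIs, not part of the original article; both should list two nodes, with cluster status green or yellow:

curl -s 'http://192.168.1.205:9200/_cluster/health?pretty'
curl -s 'http://192.168.1.205:9200/_cat/nodes?v'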

Part 4: Graylog installation

1. Download the tar package from the official site and unpack it into the target directory. Machines are limited, so only two nodes are configured here.

2. On 192.168.1.205, configure:

is_master = true
node_id_file = /etc/graylog/server/node-id
# generate with: pwgen -N 1 -s 96  (yum install pwgen)
password_secret = E8yBY19BBOts0rN9Djy6NhGfFarXNbsjlsHnZQZS3rDalV8OpRs4gyWkl2MQsRj2ctGOGZi2G6s1c2y2V8TNyeyZH4eiv2B3
root_username = admin
# this sets the login password; generate with: echo -n yourpassword | shasum -a 256  (or sha256sum)
root_password_sha2 = 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
root_timezone = Asia/Shanghai
rest_listen_uri = http://192.168.1.205:19000/api/
web_listen_uri = http://192.168.1.205:29000/
elasticsearch_hosts = http://192.168.1.205:9200,http://192.168.1.168:9200
elasticsearch_index_prefix = graylog
stale_master_timeout = 100000
mongodb_uri = mongodb://192.168.1.205:3000,192.168.1.206:3000,192.168.1.207:3000/graylog

3. On 192.168.1.168, the configuration is the same except for:

is_master = false
rest_listen_uri = http://192.168.1.168:19000/api/
web_listen_uri = http://192.168.1.168:29000/

Adjust all of this to your own environment. Then set the JVM options:

vim /application/graylog-2.3.2/bin/graylogctl

# take variables from environment if set
GRAYLOGCTL_DIR=${GRAYLOGCTL_DIR:=$(dirname "$GRAYLOGCTL")}
GRAYLOG_SERVER_JAR=${GRAYLOG_SERVER_JAR:=graylog.jar}
GRAYLOG_CONF=${GRAYLOG_CONF:=/etc/graylog/server/server.conf}
GRAYLOG_PID=${GRAYLOG_PID:=/tmp/graylog.pid}
LOG_FILE=${LOG_FILE:=log/graylog-server.log}
LOG4J=${LOG4J:=}
DEFAULT_JAVA_OPTS="-Djava.library.path=${GRAYLOGCTL_DIR}/../lib/sigar -Xms3g -Xmx3g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"
JAVA_OPTS="${JAVA_OPTS:="$DEFAULT_JAVA_OPTS"}"

4. Start:
/application/graylog-2.3.2/bin/graylogctl start

5. Once it starts successfully, open http://192.168.1.205:29000 to check node status; every node you configured should be listed there.

Part 5: Configuration examples

1. Ship logs into redis (start with: logstash -f /application/logstash/config/ads.conf)

vim /application/logstash/config/ads.conf

input {
  file {
    path => "/home/ads/logs/product/common-error.log"
    type => "ads-error"
    codec => multiline {
      # lines that do not start with an ISO8601 timestamp are appended to the
      # previous event, so multi-line stack traces become a single event
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
      charset => "GBK"
    }
  }
  file {
    path => "/home/ad/logs////accesslog_.log"
    type => "ads-access"
  }
}

filter {
  mutate {
    add_field => { server => "192.168.1.121" }
  }
}

output {

  if [type] == "ads-error" {
    redis {
      host => ["192.168.1.205", "192.168.1.168"]
      data_type => "list"
      shuffle_hosts => true
      key => "common-error"
    }
  } else if [type] == "ads-access" {
    redis {
      host => ["192.168.1.205", "192.168.1.168"]
      data_type => "list"
      shuffle_hosts => true
      key => "ads-access"
    }
  }
}

2. Pull logs out of redis and into graylog.

Install the gelf output plugin:
/application/logstash/bin/logstash-plugin install logstash-output-gelf

Download: GeoLite2-City.mmdb

vim /application/logstash/config/ads.conf

input {
  redis {
    host => "192.168.1.205"
    data_type => "list"
    key => "ads-access"
    threads => 2
  }
  redis {
    host => "192.168.1.168"
    data_type => "list"
    key => "ads-access"
    threads => 2
  }
}

filter {
  mutate {
    # escape \x sequences ( \x -> \\x ) so they survive later parsing
    gsub => ["message", "\x", "\\x"]
  }
  grok {
    # COMMONNGINX is a custom pattern, not one shipped with logstash; a
    # possible definition is sketched at the end of this section.
    match => { "message" => "%{COMMONNGINX}" }
  }

  if [x_forwarded_for] =~ ',' {
    split {
      field => "x_forwarded_for"
      terminator => [","]
    }
  }

  if [x_forwarded_for] =~ ' ' {
    mutate {
      gsub => ["x_forwarded_for", " ", ""]
    }
  }
  if [x_forwarded_for] {
    # keep the first IP for the geo lookup
    grok {
      match => ["x_forwarded_for", "%{IP:GEOIP}"]
    }
  }
  # drop health-check requests and events grok could not parse
  if [request] =~ 'checkstatus.jsp' {
    drop {}
  }
  if [tags] =~ 'grokparsefailure' {
    drop {}
  }

  grok {
    match => { "referer" => ["%{URIPROTO}://%{URIHOST:referer_domain}"] }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  mutate {
    convert => { "bytes" => "integer" }
  }
  mutate {
    convert => { "response_time" => "integer" }
  }
  mutate {
    convert => { "response_body_length" => "integer" }
  }
  if [GEOIP] and [GEOIP] != "127.0.0.1" {
    geoip {
      source => "GEOIP"
      target => "geoip"
      database => "/application/logstash/data/GeoLite2-City.mmdb"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float" ]
    }
  }
}

output {
  if [GEOIP] {
    gelf {
      # if the graylog nodes sit behind a load balancer, use the LB address here
      host => "192.168.1.205"
      port => 12201

    }
  }
}

3. Graylog configuration: create the matching GELF input in the Graylog web UI (the original screenshots are not reproduced here) and the logs will start coming in.

Part 6: Email alerting

vim /etc/graylog/server/server.conf

transport_email_enabled = true
transport_email_hostname = smtp.xxxx.com
transport_email_port = 25
transport_email_use_auth = true
transport_email_use_tls = false
transport_email_use_ssl = false
transport_email_auth_username = xxxxxx
transport_email_auth_password = xxxxxx
transport_email_subject_prefix = [graylog]
transport_email_from_email = xxxxx@xxxxx.com
transport_email_web_interface_url = http://192.168.1.205:29000

Part 7: Configure streams to route matching data into dedicated indices

By default all collected data is stored in elasticsearch under the graylog index set (shown in the original screenshot, omitted here). Create an index set per project, then create a stream, then create the stream rules (screenshots omitted).
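Two additions the article leaves implicit. First, COMMONNGINX is a custom grok pattern whose definition is not included in the original; as a sketch only, a pattern for a fairly standard nginx access-log format could live in a patterns file (the path and every field name here are assumptions chosen to line up with the filters above; adapt it to your actual log_format):

# /application/logstash/patterns/nginx — load with patterns_dir => ["/application/logstash/patterns"]
COMMONNGINX %{IPORHOST:clientip} - %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?" %{NUMBER:response} %{NUMBER:bytes} "%{NOTSPACE:referer}" %{QS:agent} "%{GREEDYDATA:x_forwarded_for}"

Second, the whole pipeline can be smoke-tested by sending a hand-written GELF message straight to the input (plain uncompressed GELF JSON over UDP; adjust the host and port to wherever your input listens):

echo -n '{"version":"1.1","host":"smoke-test","short_message":"pipeline check","level":6}' | nc -w1 -u 192.168.1.205 12201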