Architecture Diagram

  • Considering the scalability of the logging system and the resources currently available (some components are reused), the overall ELK architecture is as follows:

(figure: ELK cluster architecture diagram)

Reading the architecture (left to right, five layers in total):

Layer 1: Data collection

On the far left is the business server cluster. filebeat is installed on each server to collect logs and ship them to the two logstash services (17.162 and 17.163).

Layer 2: Data processing and buffering

The logstash services format the logs they receive and hand them off to the local kafka broker + zookeeper cluster.

Layer 3: Data forwarding

A dedicated Logstash node (17.161) continuously pulls data from the kafka broker cluster and forwards it to the ES DataNodes.

Layer 4: Persistent storage

The ES DataNodes write the incoming data to disk and build the indices.

Layer 5: Search and presentation

ES Master + Kibana coordinate the ES cluster, handle search requests, and present the data.

Server Resources and Software Versions
- OS: CentOS 6.5, virtual machines

  • Server role assignment

| Host IP | Deployed services | Server spec |
| --- | --- | --- |
| 10.200.17.161 | elastic, kafka, logstash, zookeeper | 8 cores, 16 GB RAM, 1 TB |
| 10.200.17.162 | elastic, kafka, logstash, zookeeper | 4 cores, 8 GB RAM, 1 TB |
| 10.200.17.163 | elastic, kafka, logstash, zookeeper | 4 cores, 8 GB RAM, 1 TB |

Software versions:
jdk1.8.0_45
elasticsearch-5.2.2
kafka_2.10-0.10.2.0
kafka-manager-1.3.3.5
kibana-5.2.2-linux-x86_64
logstash-5.2.2
zookeeper-3.4.9

Installation and Deployment

  • System tuning
    cat /etc/sysctl.conf
    net.ipv4.tcp_max_syn_backlog = 4096
    net.core.netdev_max_backlog = 2048
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_syncookies = 1
    vm.max_map_count = 262144 # critical for the ES setup later
    vm.swappiness = 1
    cat /etc/security/limits.conf
    * soft nproc 65535
    * hard nproc 65535
    * soft nofile 65536
    * hard nofile 65536
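Before starting Elasticsearch later, it is worth verifying that these settings are actually in place; `vm.max_map_count` in particular must be at least 262144 or ES 5.x will refuse to start in production mode. A small Python sketch (the parser is illustrative, not a standard tool) that checks a sysctl.conf-style text:

```python
# Sanity-check that the ES-critical kernel settings are present in a
# sysctl.conf-style text. Illustrative helper, not an official tool.
def parse_sysctl(text):
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

conf = """
net.ipv4.tcp_fin_timeout = 15
vm.max_map_count = 262144
vm.swappiness = 1
"""
s = parse_sysctl(conf)
# Elasticsearch 5.x requires at least 262144 mmap areas.
assert int(s["vm.max_map_count"]) >= 262144
print("vm.max_map_count OK:", s["vm.max_map_count"])
```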
  • Configure the Java environment
    cd /apps/svr
    tar zxvf jdk1.8.0_45.tar.gz
    ln -s jdk1.8.0_45 jdk
    vi /etc/profile # append the following
    export JAVA_HOME=/apps/svr/jdk
    export PATH=$JAVA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    source /etc/profile
  • User setup
    For simplicity, all applications here run under the apps account:
    useradd apps && echo apps | passwd --stdin apps
  • Upgrade Python and install supervisor
    cat update_python.sh
#!/bin/bash
# created by xiaojs
if [ "$(whoami)" != "root" ]
then
    exit 1
fi

if [[ "$(python -c 'import platform; print platform.python_version()')" == 2.7.* ]]
then
    echo 'you need not do anything'
    exit 0
else
    echo '============================'
    echo '=======start update========='
fi

# fetch the tarballs
cd /usr/local/src
wget http://xxxxx/python/Python-2.7.8.tgz
wget http://xxxxx/python/pyinotify.tar.gz
wget http://xxxxx/python/MySQL-python-1.2.4.zip
yum -y install git gcc mysql mysql-devel

# install
tar zxvf Python-2.7.8.tgz
cd Python-2.7.8
./configure --prefix=/usr/local/python2.7.8
make && make install

mv /usr/bin/python /usr/bin/python_old
ln -s /usr/local/python2.7.8/bin/python /usr/bin/
sed -i 's/python/python_old/1' /usr/bin/yum

# install the plugins
cd ..
tar zxvf pyinotify.tar.gz
cd pyinotify
python setup.py install
cd ..
unzip MySQL-python-1.2.4.zip
cd MySQL-python-1.2.4
python setup.py install

#### install supervisor
cd /usr/local/src
wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py -O - | sudo python
wget http://pypi.python.org/packages/source/d/distribute/distribute-0.6.10.tar.gz
tar xf distribute-0.6.10.tar.gz
cd distribute-0.6.10
python setup.py install
easy_install supervisor

cd /usr/local/python2.7.8/bin/
cp supervisord supervisorctl echo_supervisord_conf /usr/bin/
mkdir /etc/supervisor && cd /etc/supervisor
wget http://ops.bubugao-inc.com/python/supervisord.conf
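With supervisor installed, each background service can later be managed through a `[program:...]` stanza in /etc/supervisor/supervisord.conf. A minimal sketch for logstash; the program name, paths, and log locations are illustrative assumptions, not the original configuration:

```ini
[program:logstash]
command=/apps/svr/logstash/bin/logstash -f /apps/conf/logstash/logstash-kafka.conf
user=apps
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/apps/logs/supervisor/logstash.log
```

After editing, `supervisorctl reread && supervisorctl update` picks up the new program.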
  • Install elasticsearch
    cd /apps/svr/
    tar zxvf elasticsearch-5.2.2.tar.gz
    ln -s elasticsearch-5.2.2 elasticsearch
    [root@17161 elasticsearch]# sed -n '/^[^#]/p' config/elasticsearch.yml
    cluster.name: SuperApp
    node.name: Battlestar01
    network.host: 0.0.0.0
    http.port: 9200
    discovery.zen.ping.unicast.hosts: ["10.200.17.161:9300","10.200.17.162:9300","10.200.17.163:9300"]
    discovery.zen.minimum_master_nodes: 2 # quorum for 3 master-eligible nodes: (3/2)+1 = 2
    bootstrap.system_call_filter: false
    bootstrap.memory_lock: false
    http.cors.enabled: true
    http.cors.allow-origin: "*"

The other two nodes are configured similarly. x-pack will be installed later, so the older head and bigdesk plugins are not needed.
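Once all three nodes are started, the cluster can be verified with `curl http://10.200.17.161:9200/_cluster/health?pretty`. A small Python sketch of the same check; the `status` field comes from the standard `_cluster/health` API response:

```python
import json

# Decide whether a cluster health response is acceptable.
# "green" = all shards allocated, "yellow" = replicas unassigned,
# "red" = primary shards missing.
def cluster_ok(health_json):
    health = json.loads(health_json)
    return health.get("status") in ("green", "yellow")

# Abbreviated example body as returned by GET /_cluster/health
sample = '{"cluster_name": "SuperApp", "status": "green", "number_of_nodes": 3}'
assert cluster_ok(sample)
print("cluster status acceptable")
```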

  • zookeeper + kafka cluster deployment

#zookeeper
cd /apps/svr
tar zxvf zookeeper-3.4.9.tar.gz
ln -s zookeeper-3.4.9 zookeeper
mkdir -p /apps/dbdat/zookeeper
[root@17163 zookeeper]# sed -n '/^[^#]/p' conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/dbdat/zookeeper
clientPort=2181
server.1=10.200.17.161:12888:13888
server.2=10.200.17.162:12888:13888
server.3=10.200.17.163:12888:13888

# assign each server its id (one of these per server)
echo 1 > /apps/dbdat/zookeeper/myid   # on 10.200.17.161
echo 2 > /apps/dbdat/zookeeper/myid   # on 10.200.17.162
echo 3 > /apps/dbdat/zookeeper/myid   # on 10.200.17.163

# start and check the status
/apps/svr/zookeeper/bin/zkServer.sh start
/apps/svr/zookeeper/bin/zkServer.sh status
[root@17163 zookeeper]# /apps/svr/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /apps/svr/zookeeper/bin/../conf/zoo.cfg
Mode: follower
# output like the above means everything is fine
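A healthy three-node ensemble shows exactly one leader and two followers across the servers. A Python sketch (illustrative helper) that parses the `Mode:` line from `zkServer.sh status` output:

```python
# Extract the role from `zkServer.sh status` output and check that a
# three-node ensemble looks healthy: exactly one leader, rest followers.
def zk_mode(status_output):
    for line in status_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None

# Sample outputs as collected from the three nodes
outputs = [
    "ZooKeeper JMX enabled by default\nMode: follower",
    "ZooKeeper JMX enabled by default\nMode: leader",
    "ZooKeeper JMX enabled by default\nMode: follower",
]
modes = [zk_mode(o) for o in outputs]
assert modes.count("leader") == 1 and modes.count("follower") == 2
print("ensemble healthy:", modes)
```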
#kafka cluster
cd /apps/svr
tar zxvf kafka_2.10-0.10.2.0.tgz
ln -s kafka_2.10-0.10.2.0 kafka
[root@17161 src]# sed -n '/^[^#]/p' /apps/svr/kafka/config/server.properties
broker.id=1
delete.topic.enable=true
listeners=PLAINTEXT://10.200.17.161:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/apps/logs/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=10.200.17.161:2181,10.200.17.162:2181,10.200.17.163:2181
zookeeper.connection.timeout.ms=6000

# On the other nodes, adjust broker.id and the listeners IP accordingly
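The per-node differences can also be derived mechanically from the host list; a Python sketch (the helper is illustrative, assuming broker.id 1-3 map to the three hosts in order):

```python
# Each Kafka broker needs a unique broker.id and its own listener address;
# the rest of server.properties is shared across the three nodes.
HOSTS = ["10.200.17.161", "10.200.17.162", "10.200.17.163"]

def broker_overrides(host):
    idx = HOSTS.index(host) + 1  # broker.id starts at 1 in this setup
    return {
        "broker.id": str(idx),
        "listeners": "PLAINTEXT://%s:9092" % host,
    }

for h in HOSTS:
    print(broker_overrides(h))
```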

# start and check that the broker comes up
/apps/svr/kafka/bin/kafka-server-start.sh /apps/svr/kafka/config/server.properties
# some useful commands
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test  # create a topic

bin/kafka-topics.sh --list --zookeeper localhost:2181   # list existing topics

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test  # show topic details

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test # produce messages; type a few lines after hitting enter

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test # consume messages; run on another kafka node to see the producer's messages arrive

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic test --partitions 6  # add partitions to a topic

bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic test1  # delete a topic; requires delete.topic.enable=true

If the topic still won't delete, remove it directly in zookeeper:
cd /apps/svr/zookeeper
bin/zkCli.sh
ls /brokers/topics            # list topics
rmr /brokers/topics/test1     # delete the topic znode
  • logstash deployment and configuration
    cd /apps/svr
    tar zxvf logstash-5.2.2.tar.gz
    ln -s logstash-5.2.2/ logstash
    # The installation is identical everywhere; what differs is the two config files: one writes into kafka, the other pulls from kafka and writes into elasticsearch. They are as follows:
    [root@17162 ~]# cat /apps/conf/logstash/logstash-in-kafka.conf
input {
    beats {
        port => 5044
    }
}

output {
    if [type] == "nginx-accesslog" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "nginx-accesslog"
        }
    }

    if [type] == "tomcat-log" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "tomcat-log"
        }
    }

    if [type] == "sys-messages" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "sys-messages"
        }
    }

    if [type] == "siebel-middle" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "siebel-middle"
        }
    }

    if [type] == "siebel-eai" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "siebel-eai"
        }
    }

    if [type] == "siebel-application" {
        kafka {
            bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
            topic_id => "siebel-application"
        }
    }
}
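The six output branches differ only in the type/topic name, so when a new log type is added the file can be regenerated instead of hand-edited. A Python sketch of such a generator (illustrative convenience, not part of the original setup):

```python
# Generate the repetitive per-type kafka output branches of
# logstash-in-kafka.conf from the list of log types.
BROKERS = "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
TYPES = ["nginx-accesslog", "tomcat-log", "sys-messages",
         "siebel-middle", "siebel-eai", "siebel-application"]

TEMPLATE = '''    if [type] == "%(t)s" {
        kafka {
            bootstrap_servers => "%(b)s"
            topic_id => "%(t)s"
        }
    }
'''

def render_output():
    body = "\n".join(TEMPLATE % {"t": t, "b": BROKERS} for t in TYPES)
    return "output {\n%s}\n" % body

print(render_output())
```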
[apps@17161 ~]$ cat /apps/conf/logstash/logstash-kafka.conf
input {
    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "nginx-accesslog"
        consumer_threads => 50
        decorate_events => true
        type => "nginx-accesslog"
    }

    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "sys-messages"
        consumer_threads => 50
        decorate_events => true
        type => "sys-messages"
    }

    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "tomcat-log"
        consumer_threads => 50
        decorate_events => true
        type => "tomcat-log"
    }

    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "siebel-middle"
        consumer_threads => 50
        decorate_events => true
        type => "siebel-middle"
    }

    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "siebel-eai"
        consumer_threads => 50
        decorate_events => true
        type => "siebel-eai"
    }

    kafka {
        bootstrap_servers => "10.200.17.161:9092,10.200.17.162:9092,10.200.17.163:9092"
        topics => "siebel-application"
        consumer_threads => 50
        decorate_events => true
        type => "siebel-application"
    }
}

filter {
    if [type] == "nginx-accesslog" {
            grok {
                    match => ["message","%{IPORHOST:client_ip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version})?|%{DATA:rawrequest})\" (?:%{URIHOST:domain}|-) %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:x_forword} %{QS:upstream_host} %{QS:upstream_response} (%{WORD:upstream_cache_status}|-) %{QS:upstream_content_type} %{QS:upstream_response_time} > (%{BASE16FLOAT:request_time}) \"(%{NGINXUID:uid}|-)\""]
            }
            date {
                    locale => "en_US"
                    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
                    remove_field => [ "timestamp" ]
            }

}

if [type] == "tomcat-log" {
           grok {
        match => {"message" =>  "((app=(?<app>[^,]*)\,?))(\s*)((app0=(?<app0>[^,]*)\,?)?)(\s*)((app1=(?<app1>[^,]*)\,?)?)(.*\, host)(=(?<host>[^,]*)\,)(\s*)(pid=(?<pid>[^,]*)\,)(\s*)((t0=(?<t0>[^,]*)\,)?)(\s*)(trackId=(?<trackId>[a-zA-Z0-9]+)\})(\s*)(\[(?<time>[^]]*)\])(\s*)(\[(?<loglevel>DEBUG|INFO|WARN|ERROR)\])((.*\"time\":(?<apitime>\d+)\,\"code\":(?<apicode>\"[^\"]*\")\,\"msg\":(?<apimsg>\"[^\"]*)\"\})?)(.*\[Cost)?((\s+(?<Cost>\d+)ms\])?)"}
    }
}

if [type] == "siebel-middle" {
    grok {
        match => {"message" => "(\[(?<time>[^]]*)\])(\s*)(\[(?<loglevel>DEBUG|INFO|WARN|ERROR)\])(\s*)(\[(?<thread>[^]]*)\])(\s*)\--(\s*)(?<APItype>(\S*))(.*cost)?((\s+(?<Cost>\d+)(\s*)ms)?)"}
    }
}

if [type] == "siebel-eai" {
    grok {
        match =>{"message" => "(?<time>\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2})(\s*)(?<msg>.*)"}
    }
}

    if [type] == "siebel-application" {
            grok {
                    match =>{"message" => "(?<time>\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2})(\s*)(?<msg>.*)"}
            }
    }

mutate {
    #convert => {"Cost" => "integer"}
    convert => ["Cost","integer","request_time","integer","response","integer","upstream_response","integer"]
}
}
output {
    elasticsearch {
        hosts => ["10.200.17.161:9200","10.200.17.162:9200","10.200.17.163:9200"]
        user => "elastic"
        password => "changeme"
        index => "logstash-%{type}-%{+YYYY.MM.dd}"
        manage_template => true
        flush_size => 50000
        idle_flush_time => 10
    }
    # stdout {
    #     codec => rubydebug
    # }
}
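The simpler grok patterns can be sanity-checked outside Logstash. Here is the siebel-eai/siebel-application timestamp pattern from the filter above, translated to Python regex syntax (Python writes named groups as `(?P<name>)` where the inline grok syntax uses `(?<name>)`); the sample log line is invented for illustration:

```python
import re

# The siebel-eai/siebel-application pattern from the filter section,
# rewritten with Python named-group syntax.
PATTERN = re.compile(
    r"(?P<time>\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2})(\s*)(?P<msg>.*)")

line = "2017-05-10 12:34:56 EAI adapter started"
m = PATTERN.match(line)
assert m is not None
print(m.group("time"), "|", m.group("msg"))
```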
  • filebeat configuration on the application servers
    cd /apps/svr
    tar zxvf filebeat-5.2.2-linux-x86_64.tar.gz
    ln -s filebeat-5.2.2-linux-x86_64 filebeat
    [root@java1732 svr]# sed -n '/^[^#]/p' filebeat/filebeat.yml
    filebeat.prospectors:
    - input_type: log
      paths:
        - /apps/svr/server/*/logs/xxxxxx-app/*.log
      exclude_files: ["xx.log"]
      document_type: tomcat-log
      #multiline.pattern: ^\s
      multiline.pattern: ^[^{app]
      multiline.match: after
    - input_type: log
      paths:
        - /var/log/messages
      document_type: sys-messages
    output.logstash:
      # The Logstash hosts
      hosts: ["10.200.17.162:5044","10.200.17.163:5044"]

#debug command: ./filebeat -e -c filebeat.yml -d "production"

  • kibana page configuration
    cd /apps/svr
    tar zxvf kibana-5.2.2-linux-x86_64.tar.gz
    ln -s kibana-5.2.2-linux-x86_64 kibana
    [root@17161 kibana]# sed -n '/^[^#]/p' config/kibana.yml
    server.port: 5601
    server.host: "10.200.17.161"
    elasticsearch.url: "http://10.200.17.161:9200"
    kibana.index: ".kibana"

#The corresponding nginx configuration is as follows

upstream kibana {
    keepalive      400;
    server  10.200.17.161:5601 max_fails=3  fail_timeout=30s;
}
server  {
    listen          80;
    server_name     10.200.17.161;

    if (-d $request_filename) {
        rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
    }

    location / {
        proxy_pass              http://kibana;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header        X-Real-IP  $remote_addr;
        proxy_set_header        Host             $host;
        proxy_set_header        X-Forwarded-For  $proxy_add_x_forwarded_for;
    }
    error_log           logs/kibana5.error.log;
    access_log          logs/kibana5.access.log log_access;
}


#At this point the whole pipeline is in place. Create a kafka topic as a test, then check whether the corresponding elasticsearch index is created, or simply watch the results in the Kibana UI.

  • Plugins and miscellanea
    1. Most of the applications above run in the background, so a dead process can easily go unnoticed. Monitoring every process individually is cumbersome and restarts are inconvenient, so supervisor is used for unified management. The installation was covered above; the specific configuration is not shown here.
    2. Installing x-pack (it must be installed on both elasticsearch and kibana):
    /apps/svr/elasticsearch/bin/elasticsearch-plugin install x-pack
    /apps/svr/kibana/bin/kibana-plugin install x-pack