Kafka Overview
Kafka is similar to a message system.
Message middleware: producers and consumers
Mom: the producer
You: the consumer
Buns (mantou): the data stream / the messages
Normal case: one bun produced, one bun eaten
Other cases:
Mom keeps producing; you get stuck on some bun (machine failure), and the remaining buns are lost
Mom keeps producing; she makes buns faster than you can eat them, and buns are lost
Fix: get a bowl/basket; finished buns go into the basket first, and you take one out whenever you want to eat
The basket: Kafka
What if the basket is full and no more buns fit?
Prepare a few more baskets === scaling Kafka out
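The basket analogy above can be sketched in plain shell, with a temporary file standing in for the broker's buffer (a hypothetical illustration only, not Kafka itself):

```shell
#!/bin/sh
# Hypothetical sketch: a plain file stands in for the "basket" (broker).
# The producer appends at its own pace, the consumer reads at its own
# pace later, and no message is lost in between.
basket=$(mktemp)

# Producer ("mom"): makes three buns quickly.
for i in 1 2 3; do
  echo "bun-$i" >> "$basket"
done

# Consumer ("you"): eats them later, one at a time, in order.
eaten=$(while read -r msg; do echo "ate $msg"; done < "$basket")
echo "$eaten"

rm -f "$basket"
```

The point of the buffer is exactly what the notes describe: the producer never blocks on the consumer, and a slow consumer catches up later without losing data.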
Kafka Architecture
producer: the producer, who makes the buns (mom)
consumer: the consumer, who eats the buns (you)
broker: the basket
topic: a label attached to each bun; buns on topica are for you to eat, buns on topicb are for your younger brother
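The topic idea can be sketched the same way (again a hypothetical shell illustration): each message carries a label, and a consumer only ever reads messages with its own label:

```shell
#!/bin/sh
# Hypothetical sketch: one file per topic inside a temp directory.
# The label on each message decides which "basket" it lands in,
# so a consumer subscribed to topica never sees topicb's messages.
dir=$(mktemp -d)

produce() {  # usage: produce <topic> <message>
  echo "$2" >> "$dir/$1"
}

produce topica "bun for you"
produce topicb "bun for your brother"
produce topica "another bun for you"

# Consumer of topica: reads only its own topic's file.
topica_msgs=$(cat "$dir/topica")
echo "$topica_msgs"

rm -rf "$dir"
```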
Single-node, single-broker deployment and usage

Note: download ZooKeeper first, then configure the environment variables and zookeeper/conf/zoo_sample.cfg
tar -zxvf zookeeper-3.4.5-cdh5.7.0.tar.gz -C ~/app/
cd ~/app
pwd
vi ~/.bash_profile
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0
export PATH=$ZK_HOME/bin:$PATH
source ~/.bash_profile
cd conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
Change the path in dataDir: the default is under /tmp, and /tmp is cleared every time the VM restarts
cd ~/app
mkdir tmp ---> cd tmp ---> mkdir zk
dataDir=/home/hadoop/app/tmp/zk
cd bin ---> ./zkServer.sh start (start ZooKeeper)
Run jps to check whether ZooKeeper started successfully:
a QuorumPeerMain process means it is running
You can also connect with the ZooKeeper client:
./zkCli.sh
ls /zookeeper/quota ---> prints [ ]
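After these edits the relevant lines of zoo.cfg would look roughly like this (dataDir per the notes above; tickTime and clientPort are the zoo_sample.cfg defaults):

```properties
# zoo.cfg (excerpt)
tickTime=2000
clientPort=2181
dataDir=/home/hadoop/app/tmp/zk
```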
-------------------------------
Use wget to download version 0.9.0.0, i.e. kafka_2.11-0.9.0.0 (Kafka is written in Scala, so the package name carries the Scala version, here 2.11)
Extract it into the app directory: -C ~/app/
Add the Kafka path to the environment variables:
vim ~/.bash_profile
export KAFKA_HOME=/home/hadoop/app/kafka_2.11-0.9.0.0
export PATH=$KAFKA_HOME/bin:$PATH
source ~/.bash_profile
--------------------------------------------
Edit $KAFKA_HOME/config/server.properties:
broker.id=0
listeners
host.name
log.dirs (do not leave it under /tmp, or the logs are wiped on restart; create a kafka-logs directory under app/tmp and set it to /home/hadoop/app/tmp/kafka-logs)
zookeeper.connect (the ZooKeeper address)
Configuration done!
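Put together, the edited keys of server.properties for this single-broker setup might look like the following (hostname and paths follow these notes; adjust for your machine):

```properties
# server.properties (edited keys)
broker.id=0
listeners=PLAINTEXT://:9092
host.name=hadoop000
log.dirs=/home/hadoop/app/tmp/kafka-logs
zookeeper.connect=hadoop000:2181
```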
-----------------------------------------------
Start Kafka:
kafka-server-start.sh
USAGE: /home/hadoop/app/kafka_2.11-0.9.0.0/bin/kafka-server-start.sh [-daemon] server.properties [--override property=value]*
kafka-server-start.sh $KAFKA_HOME/config/server.properties
-- Check: jps should show a Kafka process; jps -m also shows which config file each process was started with
Create a topic (talks to ZooKeeper):
kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 1 --partitions 1 --topic hello_topic
--- Key parameters: --replication-factor 1 means one replica, --partitions 1 means one partition, --topic is the topic name
List all topics (to verify the topic was created):
kafka-topics.sh --list --zookeeper hadoop000:2181
Produce messages (talks to the broker):
kafka-console-producer.sh --broker-list hadoop000:9092 --topic hello_topic
Consume messages (talks to ZooKeeper):
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic hello_topic --from-beginning
Using --from-beginning
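What --from-beginning changes can be sketched with a plain file standing in for the partition log (a hypothetical illustration, not the Kafka protocol): by default a new console consumer only sees messages produced after it attaches, while --from-beginning replays the whole log.

```shell
#!/bin/sh
# A plain file as the partition log: each line is a message,
# its line number is the offset.
log=$(mktemp)

printf 'm1\nm2\n' >> "$log"    # produced before the consumer attaches
start=$(wc -l < "$log")        # the tail position when the consumer arrives
printf 'm3\n' >> "$log"        # produced after it attaches

from_latest=$(tail -n +$((start + 1)) "$log")   # default: only new messages
from_beginning=$(cat "$log")                    # --from-beginning: everything

echo "$from_latest"
rm -f "$log"
```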
Describe all topics: kafka-topics.sh --describe --zookeeper hadoop000:2181
Describe one topic: kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic hello_topic
Single-node, multi-broker deployment
server-1.properties
log.dirs=/home/hadoop/app/tmp/kafka-logs-1
listeners=PLAINTEXT://:9093
broker.id=1
server-2.properties
log.dirs=/home/hadoop/app/tmp/kafka-logs-2
listeners=PLAINTEXT://:9094
broker.id=2
server-3.properties
log.dirs=/home/hadoop/app/tmp/kafka-logs-3
listeners=PLAINTEXT://:9095
broker.id=3
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-1.properties &
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-2.properties &
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-3.properties &
kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Note: the replication factor cannot exceed the number of running brokers (here 3 brokers, so at most 3 replicas)
kafka-console-producer.sh --broker-list hadoop000:9093,hadoop000:9094,hadoop000:9095 --topic my-replicated-topic
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic my-replicated-topic
kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic my-replicated-topic
Integrating Flume with Kafka
avro-memory-kafka.conf
avro-memory-kafka.sources = avro-source
avro-memory-kafka.sinks = kafka-sink
avro-memory-kafka.channels = memory-channel
avro-memory-kafka.sources.avro-source.type = avro
avro-memory-kafka.sources.avro-source.bind = hadoop000
avro-memory-kafka.sources.avro-source.port = 44444
avro-memory-kafka.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
avro-memory-kafka.sinks.kafka-sink.brokerList = hadoop000:9092
avro-memory-kafka.sinks.kafka-sink.topic = hello_topic
avro-memory-kafka.sinks.kafka-sink.batchSize = 5
avro-memory-kafka.sinks.kafka-sink.requiredAcks = 1
avro-memory-kafka.channels.memory-channel.type = memory
avro-memory-kafka.sources.avro-source.channels = memory-channel
avro-memory-kafka.sinks.kafka-sink.channel = memory-channel
flume-ng agent \
--name avro-memory-kafka \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/avro-memory-kafka.conf \
-Dflume.root.logger=INFO,console
flume-ng agent \
--name exec-memory-avro \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
-Dflume.root.logger=INFO,console
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic hello_topic