Single data source, multiple sinks
Flow diagram
Case requirement: Flume1 monitors a file for changes and replicates every event to both Flume2 and Flume3; Flume2 stores the received data on HDFS, while Flume3 stores it on the local file system
Data-flow diagram
Flume1 configuration
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# Replicate each event to both channels (replicating is also the default selector type)
a1.sources.r1.selector.type = replicating
# Exec source: tail the monitored file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/data/flume/log
# Avro sinks: downstream host and ports for Flume2 and Flume3
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = d
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = d
a1.sinks.k2.port = 4142
# Channel configuration
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100
# Wire the source and sinks to the channels
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
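Before starting anything, make sure the monitored file exists so the exec source has something to tail. A minimal setup sketch, using the path from the configuration above:
mkdir -p /home/data/flume
touch /home/data/flume/log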
Flume2 configuration
a2.sources = r1
a2.sinks = k1
a2.channels = c1
# Avro source: receive events from Flume1
a2.sources.r1.type = avro
a2.sources.r1.bind = d
a2.sources.r1.port = 4141
# HDFS sink: write the received data to HDFS
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://d:9000/flume/%Y-%m-%d/%H
a2.sinks.k1.hdfs.filePrefix = event-
a2.sinks.k1.hdfs.fileSuffix = -log
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.batchSize = 100
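# Round event timestamps down to the hour, so events are bucketed into hourly directories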
a2.sinks.k1.hdfs.round = true
a2.sinks.k1.hdfs.roundValue = 1
a2.sinks.k1.hdfs.roundUnit = hour
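# Roll to a new file every 600 s or at roughly 128 MB (just under one HDFS block), never by event count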
a2.sinks.k1.hdfs.rollInterval = 600
a2.sinks.k1.hdfs.rollSize = 134217700
a2.sinks.k1.hdfs.rollCount = 0
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
Flume3 configuration
a3.sources = r1
a3.sinks = k1
a3.channels = c1
a3.sources.r1.type = avro
a3.sources.r1.bind = d
a3.sources.r1.port = 4142
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100
a3.sinks.k1.type = file_roll
a3.sinks.k1.sink.directory = /home/data/flume/logs
a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
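The file_roll sink expects its output directory to exist already, so create it before starting Flume3:
mkdir -p /home/data/flume/logs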
Testing
# Start all three Flume agents; be sure to start Flume1 last, so the Avro sources on Flume2 and Flume3 are already listening when Flume1's Avro sinks try to connect
/home/flume/bin/flume-ng agent --conf conf --name a1 --conf-file /home/flume/job/select/flume1 -Dflume.root.logger=INFO,console
/home/flume/bin/flume-ng agent --conf conf --name a2 --conf-file /home/flume/job/select/flume2 -Dflume.root.logger=INFO,console
/home/flume/bin/flume-ng agent --conf conf --name a3 --conf-file /home/flume/job/select/flume3 -Dflume.root.logger=INFO,console
Append a line to the file monitored by Flume1: echo '123' >> /home/data/flume/log
Check the HDFS directory and the local directory to confirm that both copies arrived.
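For example (the date and hour directories under /flume follow the path pattern configured in Flume2, so the exact names depend on when the events arrived):
hdfs dfs -ls -R hdfs://d:9000/flume
ls -l /home/data/flume/logs
cat /home/data/flume/logs/*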