Single Data Source, Multiple Sinks

Flow diagram
[Figure: Flume single-source, multi-sink (replicating selector) topology]

Requirement: use Flume1 to monitor a file for changes. Flume1 sends the new data to both Flume2 and Flume3; Flume2 writes what it receives to HDFS, and Flume3 writes what it receives to the local file system.
Data flow diagram
[Figure: data flow from the exec source through two channels to the HDFS and local sinks]
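In outline, the data flows as follows (host d, ports, and paths match the configurations below):

exec source (tail -F /home/data/flume/log)
              |
     replicating selector
        /           \
  channel c1      channel c2
       |               |
 avro sink :4141  avro sink :4142
       |               |
    Flume2          Flume3
       |               |
     HDFS      local file system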

Flume1 configuration

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Replicate every event to both channels
a1.sources.r1.selector.type = replicating

# exec source tails the monitored file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/data/flume/log

# avro sinks: downstream host and ports
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = d
a1.sinks.k1.port = 4141

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = d
a1.sinks.k2.port = 4142

# channel configuration (in-memory)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# wire the source and sinks to the channels
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
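The exec source simply runs tail -F, so the monitored file should exist before the agent starts. A minimal preparation sketch using the path from the configuration above:

# Create the file that Flume1's exec source tails
mkdir -p /home/data/flume
touch /home/data/flume/log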

Flume2 configuration

a2.sources = r1
a2.sinks = k1
a2.channels = c1

# receive events over avro from Flume1
a2.sources.r1.type = avro
a2.sources.r1.bind = d
a2.sources.r1.port = 4141

# write events to HDFS
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://d:9000/flume/%Y-%m-%d/%H
a2.sinks.k1.hdfs.filePrefix = event-
a2.sinks.k1.hdfs.fileSuffix = -log
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.batchSize = 100
a2.sinks.k1.hdfs.round = true
a2.sinks.k1.hdfs.roundValue = 1
a2.sinks.k1.hdfs.roundUnit = hour
a2.sinks.k1.hdfs.rollInterval = 600
a2.sinks.k1.hdfs.rollSize = 134217700
a2.sinks.k1.hdfs.rollCount = 0

a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
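A note on the roll settings: with useLocalTimeStamp = true the %Y-%m-%d/%H escapes are resolved from the local clock, files roll every 600 seconds or at 134217700 bytes (just under the default 128 MB HDFS block size), and rollCount = 0 disables count-based rolling. Once data flows, the current bucket can be inspected from the shell (assuming the hdfs client is on the PATH):

# List this hour's output directory
hdfs dfs -ls /flume/$(date +%Y-%m-%d)/$(date +%H)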

Flume3 configuration

a3.sources = r1
a3.sinks = k1
a3.channels = c1

a3.sources.r1.type = avro
a3.sources.r1.bind = d
a3.sources.r1.port = 4142

a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

a3.sinks.k1.type = file_roll
a3.sinks.k1.sink.directory = /home/data/flume/logs

a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
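Unlike the HDFS sink, file_roll does not create its output directory; if /home/data/flume/logs is missing, the sink will error out. By default it also rolls to a new file every 30 seconds (sink.rollInterval). Preparation:

# Create the output directory for Flume3's file_roll sink
mkdir -p /home/data/flume/logs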

Testing

# Start all three agents; be sure to start Flume1 last, since its avro sinks need the avro sources of Flume2 and Flume3 to be listening
/home/flume/bin/flume-ng agent --conf conf --name a1 --conf-file /home/flume/job/select/flume1 -Dflume.root.logger=INFO,console
/home/flume/bin/flume-ng agent --conf conf --name a2 --conf-file /home/flume/job/select/flume2 -Dflume.root.logger=INFO,console
/home/flume/bin/flume-ng agent --conf conf --name a3 --conf-file /home/flume/job/select/flume3 -Dflume.root.logger=INFO,console
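Before launching Flume1, it helps to confirm that the avro sources of Flume2 and Flume3 are actually listening; otherwise Flume1's avro sinks will repeatedly log connection failures. One quick check (assuming ss is available on the host):

# Confirm ports 4141 and 4142 are listening
ss -ltn | grep -E '4141|4142'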

Write data into the file that Flume1 monitors:
echo '123' >> /home/data/flume/log
Check the HDFS directory and the local directory to confirm the data arrived in both places.
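The same check from the command line, using the paths configured above:

# Events replicated to HDFS by Flume2
hdfs dfs -ls -R /flume
# Events rolled to the local file system by Flume3
ls -l /home/data/flume/logs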