flowable error when Kafka is not configured

Reposted

laokugonggao 2024-09-14 09:46:52


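Earlier posts covered integrating SparkStreaming with Flume and with Kafka separately. In real projects, though, SparkStreaming usually has to work with Kafka and Flume together.

Here is how to set that up.

First, the overall architecture:

External software produces data in real time. Flume collects that data as it arrives and uses a Kafka sink to push it into Kafka, which acts as a buffer. The Kafka message queue then serves as the data source for SparkStreaming, which performs the business computation, and the results are finally stored or visualized.

Next, the overall implementation approach: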

First, simulate an app server that produces data in real time.
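A minimal sketch of such a generator is shown here. It assumes log4j 1.x is on the classpath (the version the Flume Log4j appender used later works with); the object name LoggerGenerator, the message text, and the one-second interval are illustrative choices rather than the original program.

```scala
import org.apache.log4j.Logger

// Hypothetical stand-in for the app server: writes one INFO line per second.
object LoggerGenerator {
  private val logger = Logger.getLogger(getClass.getName)

  // Builds the text of the n-th log message (format chosen for illustration).
  def message(index: Int): String = s"value : $index"

  def main(args: Array[String]): Unit = {
    var index = 0
    while (true) {
      Thread.sleep(1000)           // throttle to one message per second
      logger.info(message(index))  // routed to whatever appenders log4j.properties defines
      index += 1
    }
  }
}
```

Because the generator only talks to the log4j API, the destination of the messages (console, Flume, or both) is decided entirely by log4j.properties.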
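Then configure the log output: the level is INFO, messages are written to the console through System.out, and the line format is defined by a PatternLayout conversion pattern.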
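A log4j.properties along these lines would match that description. The appender name stdout and the exact conversion pattern are assumptions, since the original listing is not available; the class and property names are standard log4j 1.x.

```properties
# Root logger: INFO level, one console appender (the name "stdout" is an assumption)
log4j.rootLogger=INFO,stdout

# Console appender writing to System.out
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
# Illustrative pattern: timestamp, thread, logger name, level, message
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} [%t] [%c] [%p] - %m%n
```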

Running the program prints the generated log lines to the console.

Flume log collection

streaming.conf

# Name the components on this agent
agent1.sources = avro-source
agent1.channels = logger-channel
agent1.sinks = log-sink

# Describe/configure the source
agent1.sources.avro-source.type = avro
agent1.sources.avro-source.bind = 0.0.0.0
agent1.sources.avro-source.port = 41414

# Describe the channel
agent1.channels.logger-channel.type = memory
agent1.channels.logger-channel.capacity = 1000
agent1.channels.logger-channel.transactionCapacity = 100

# Describe the sink
agent1.sinks.log-sink.type = logger

# Bind the source and sink to the channel
agent1.sources.avro-source.channels = logger-channel
agent1.sinks.log-sink.channel = logger-channel

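The next step is to connect the generated log messages to Flume. The official Flume documentation describes a Log4j appender for exactly this purpose.

Following the official documentation, add the Flume appender to log4j.properties.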
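As a sketch, the additions could look like this. The appender class and the Hostname, Port, and UnsafeMode properties are the ones documented for the Flume Log4j appender; the host name hadoop1 is an assumption taken from the shell prompts in this post, and the port matches the avro source in streaming.conf.

```properties
# Send every log event to the console appender described earlier and to Flume
log4j.rootLogger=INFO,stdout,flume

# Flume Log4j appender; it delivers events to an avro source
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
# Assumed host running the Flume agent
log4j.appender.flume.Hostname=hadoop1
# Must match agent1.sources.avro-source.port
log4j.appender.flume.Port=41414
# With UnsafeMode=true the application keeps running even if Flume is unreachable
log4j.appender.flume.UnsafeMode=true
```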

Add the required dependency to the pom:

<dependency>
    <groupId>org.apache.flume.flume-ng-clients</groupId>
    <artifactId>flume-ng-log4jappender</artifactId>
    <version>1.7.0</version>
</dependency>

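Start Flume:

[1@hadoop1 conf]$ flume-ng agent \
> --name agent1 \
> --conf $FLUME_HOME/conf \
> --conf-file $FLUME_HOME/conf/streaming.conf \
> -Dflume.root.logger=INFO,console

Here --name must match the agent name used in the configuration file (agent1), --conf points to the Flume configuration directory, --conf-file is the agent configuration file, and -Dflume.root.logger=INFO,console prints Flume's log to the console.

At this point the generated logs reach Flume. The next step is to forward the data that Flume collects to Kafka.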

Connecting Flume to Kafka

First, start the Kafka broker in the background:

kafka-server-start.sh \
-daemon /home/hadoop1/modules/kafka_2.11-0.11.0.2/config/server.properties

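Next, create a topic for testing:

[1@hadoop1 kafka_2.11-0.11.0.2]$ kafka-topics.sh \
> --create \
> --zookeeper hadoop1:2181 \
> --replication-factor 1 \
> --partitions 1 \
> --topic streaming_topic

Now create a new flume.conf that forwards the collected data to Kafka. The Kafka sink section of the official Flume documentation describes the available properties.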


The flume.conf configuration is as follows:

# Name the components on this agent
agent1.sources = avro-source
agent1.channels = logger-channel
agent1.sinks = kafka-sink

# Describe/configure the source
agent1.sources.avro-source.type = avro
agent1.sources.avro-source.bind = 0.0.0.0
agent1.sources.avro-source.port = 41414

# Describe the channel
agent1.channels.logger-channel.type = memory
agent1.channels.logger-channel.capacity = 1000
agent1.channels.logger-channel.transactionCapacity = 100

# Describe the sink
agent1.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka-sink.kafka.topic = streaming_topic
agent1.sinks.kafka-sink.kafka.bootstrap.servers = 192.168.2.161:9092
agent1.sinks.kafka-sink.kafka.flumeBatchSize = 20
agent1.sinks.kafka-sink.kafka.producer.acks = 1
agent1.sinks.kafka-sink.kafka.producer.linger.ms = 1

# Bind the source and sink to the channel
agent1.sources.avro-source.channels = logger-channel
agent1.sinks.kafka-sink.channel = logger-channel

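Start Flume again, this time with flume.conf as the --conf-file, then start a Kafka console consumer:

[1@hadoop1 kafka_2.11-0.11.0.2]$ kafka-console-consumer.sh \
> --bootstrap-server hadoop1:9092 \
> --topic streaming_topic

Then run the log-generating program; its messages should show up in the consumer.

Finally, SparkStreaming processes the data in Kafka and performs the business computation: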

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaStreamingApp {
  def main(args: Array[String]): Unit = {

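    // Expect exactly four arguments; exit early so that the pattern
    // match below does not fail with a MatchError
    if (args.length != 4) {
      System.err.println("Usage: KafkaStreamingApp <zkQuorum> <group> <topics> <numThreads>")
      System.exit(1)
    }

    val Array(zkQuorum, group, topics, numThreads) = args

    val sparkConf = new SparkConf().setAppName("KafkaStreamingApp").setMaster("local[*]")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Map each topic name to the number of receiver threads that consume it
    val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap

    // Receiver-based stream of (key, message) pairs, coordinated through ZooKeeper
    val messages = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)

    // Count the messages received in each 2-second batch and print the count
    messages.map(_._2).count().print()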

    ssc.start()
    ssc.awaitTermination()
  }
}

Finally, set the program arguments in the run configuration.
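The program expects four arguments, in this order: the ZooKeeper quorum, the consumer group, a comma-separated topic list, and the number of receiver threads per topic. The sketch below shows how such arguments map onto the values the streaming job uses. The helper ArgsSketch is hypothetical and not part of the original program; hadoop1:2181 and streaming_topic come from the earlier steps, while the group name test and the thread count 1 are placeholders.

```scala
import scala.util.Try

// Hypothetical helper: validates the four arguments and builds the topic map.
object ArgsSketch {
  final case class KafkaArgs(zkQuorum: String, group: String, topicMap: Map[String, Int])

  def parse(args: Array[String]): Either[String, KafkaArgs] = args match {
    case Array(zkQuorum, group, topics, numThreads) =>
      Try(numThreads.toInt).toOption match {
        case Some(n) if n > 0 =>
          // Every listed topic is consumed with the same number of threads
          val topicMap = topics.split(",").map(_.trim).filter(_.nonEmpty).map(_ -> n).toMap
          Right(KafkaArgs(zkQuorum, group, topicMap))
        case _ =>
          Left(s"numThreads must be a positive integer, got: $numThreads")
      }
    case _ =>
      Left("Usage: KafkaStreamingApp <zkQuorum> <group> <topics> <numThreads>")
  }

  def main(args: Array[String]): Unit = {
    // Example argument list; the group name and thread count are placeholders
    println(parse(Array("hadoop1:2181", "test", "streaming_topic", "1")))
  }
}
```

In an IDE run configuration the same values would be entered as a single space-separated argument string, for example: hadoop1:2181 test streaming_topic 1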
