Kafka cluster

Modify the server.properties file in the Kafka installation.

# The configuration below is the same on every node in the cluster
# Broker id, must be unique within the cluster
broker.id=1
# Host address
host.name=127.0.0.1
# Port
port=9092
# Directory for the message logs
log.dirs=/opt/kafka/log
# ZooKeeper addresses, comma separated
zookeeper.connect=localhost1:2181,localhost2:2181,localhost3:2181

# Start (daemon mode)
bin/kafka-server-start.sh -daemon config/server.properties
# Start in the background with nohup
nohup bin/kafka-server-start.sh config/server.properties &
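
Once every broker is up, the cluster can be sanity-checked from code as well as from the CLI. Below is a minimal sketch (not part of the original setup) that uses Kafka's AdminClient to list the brokers that joined and to create a test topic; the host names are the placeholders used above.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class ClusterCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // same placeholder hosts as in the config above
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "localhost1:9092,localhost2:9092,localhost3:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // list the broker ids that joined the cluster
            admin.describeCluster().nodes().get()
                 .forEach(node -> System.out.println("broker " + node.id() + " @ " + node.host()));
            // create a test topic with 3 partitions and replication factor 2
            admin.createTopics(Collections.singleton(new NewTopic("test", 3, (short) 2)))
                 .all().get();
        }
    }
}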

ZooKeeper cluster

Modify the zoo.cfg file on each node in the cluster (obtained by renaming zoo_sample.cfg).

# The configuration is the same on every node in the cluster
# Data directory
dataDir=/opt/zookeeper/data
# Log directory
dataLogDir=/opt/zookeeper/log
# Client port
clientPort=2181
# Cluster configuration
# The x in server.x corresponds to the content of each node's myid file (add these lines to every zoo.cfg)
# 2287 is the peer communication port, 3387 the leader election port
server.1=localhost1:2287:3387
server.2=localhost2:2287:3387
server.3=localhost3:2287:3387

# Start
./bin/zkServer.sh start
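
The ensemble can also be verified from Java with the plain ZooKeeper client. This is only a sketch (not from the original notes); it assumes the zookeeper client jar is on the classpath, and the connect string matches the zoo.cfg above.

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class EnsembleCheck {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // connect string matches the server.x entries in zoo.cfg
        ZooKeeper zk = new ZooKeeper("localhost1:2181,localhost2:2181,localhost3:2181", 15000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();  // wait until the session is established
        System.out.println("session state: " + zk.getState());
        System.out.println("children of /: " + zk.getChildren("/", false));
        zk.close();
    }
}

Remember that each node also needs a myid file under dataDir whose content matches the x in its server.x line.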

Producer API

1. YML configuration

server:
  port: 8001
  servlet:
    context-path: /producer
spring:
  kafka:
    bootstrap-servers: 192.168.11.51:9092,192.168.11.51:9091
    producer:
      # The most important producer option:
      # acks=0  : the producer does not wait for any response from the server before considering the write successful.
      # acks=1  : the producer gets a success response as soon as the partition leader has received the message.
      # acks=-1 : the partition leader waits until the message has been written to all in-sync replicas (ISR)
      #           before the request is considered successful. Strongest durability guarantee, lowest theoretical throughput.
      acks: 1
      # Batch size for batched sends
      batch-size: 16384
      # Size of the producer's memory buffer (32 MB)
      buffer-memory: 33554432
      # Number of retries when a send fails
      retries: 0
      # Key serializer
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      # Value serializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
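
The same producer settings can also be expressed as a Java configuration class. The sketch below is an assumed equivalent of the YAML above (the class and bean names are mine), which is useful when more than one KafkaTemplate is needed in the same application.

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

@Configuration
public class KafkaProducerConfig {

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.11.51:9092,192.168.11.51:9091");
        props.put(ProducerConfig.ACKS_CONFIG, "1");                 // wait for the leader only
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432L);  // 32 MB buffer
        props.put(ProducerConfig.RETRIES_CONFIG, 0);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(props);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}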

2. Sending

@Resource
private KafkaTemplate<String, String> kafkaTemplate;

public void sendMessage(String topic, String object) {
    ListenableFuture<SendResult<String, String>> future = kafkaTemplate.send(topic, object);
    future.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
        @Override
        public void onSuccess(SendResult<String, String> result) {
            log.info("Message sent successfully: " + result.toString());
        }

        @Override
        public void onFailure(Throwable throwable) {
            log.error("Failed to send message: " + throwable.getMessage());
        }
    });
}
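
If the caller needs to know the result before returning, a blocking variant can wait on the same future instead of registering a callback. This is a sketch meant for the same class (it assumes the kafkaTemplate and log fields shown above):

public void sendMessageSync(String topic, String data) throws Exception {
    // block for at most 10 seconds waiting for the broker acknowledgement
    SendResult<String, String> result =
            kafkaTemplate.send(topic, data).get(10, java.util.concurrent.TimeUnit.SECONDS);
    log.info("sent to partition {} at offset {}",
            result.getRecordMetadata().partition(),
            result.getRecordMetadata().offset());
}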

Consumer API

1. YML configuration

server:
  port: 8002
  servlet:
    context-path: /consumer
spring:
  kafka:
    bootstrap-servers: 192.168.11.51:9092,192.168.11.51:9091
    consumer:
      # What to do when there is no committed offset for the partition, or the offset is invalid:
      # latest (default): start reading from the newest records (those produced after the consumer started)
      # earliest: start reading from the beginning of the partition
      auto-offset-reset: earliest
      # Acknowledgement mechanism: manual acknowledgement
      enable-auto-commit: false
      # Key deserializer
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      # Value deserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    listener:
      # The listener is responsible for acking; in manual mode the offset is committed when acknowledge() is called.
      # This can be configured per listener in code.
      ack-mode: manual
      # Number of threads running in the listener container
      concurrency: 5

2. Receiving

@KafkaListener(groupId = "group02", topics = "topic02")
public void onMessage(ConsumerRecord<String, Object> record, Acknowledgment acknowledgment, Consumer<?, ?> consumer) {
    log.info("Consumer received message: {}", record.value());
    // manual acknowledgement
    acknowledgment.acknowledge();
}

3. Switching the ack-mode to match different requirements

// Specify the containerFactory on the listener to pick up the customised configuration
// (a full listener example follows after this block):
// @KafkaListener(containerFactory = "recordListenerContainerFactory", topics = "test")

// Boot's auto-configured connection properties, injected for reuse below
@Resource
private KafkaProperties kafkaProperties;

/**
 * Customised consumer configuration
 * @return
 */
@Bean
public ConsumerFactory<Object, Object> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProperties.getBootstrapServers());
//  props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
//  props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enableAutoCommit);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(props);
}

@Bean("recordListenerContainerFactory")
public ConcurrentKafkaListenerContainerFactory<?, ?> kafkaListenerContainerFactory(
        ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
        ConsumerFactory<Object, Object> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    // customise how messages are received
    factory.setConsumerFactory(consumerFactory);
    // enable batch consumption
    factory.setBatchListener(true);
    // do not start automatically
    factory.setAutoStartup(false);
    factory.getContainerProperties().setPollTimeout(1500);
    // Manual offset commit:
    // MANUAL           after each poll() batch has been processed by the listener (ListenerConsumer),
    //                  commit once Acknowledgment.acknowledge() has been called
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    // COUNT            after each poll() batch has been processed, commit once the number of processed
    //                  records reaches AckCount; used together with setAckCount
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.COUNT);
    // factory.getContainerProperties().setAckCount(5);
    // TIME             after each poll() batch has been processed, commit once the time since the last
    //                  commit exceeds the configured AckTime
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.TIME);
    // COUNT_TIME       commit when either the TIME or the COUNT condition is met
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.COUNT_TIME);
    // BATCH            commit after each poll() batch has been processed by the listener
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.BATCH);
    // RECORD           commit after each individual record has been processed by the listener
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);
    // MANUAL_IMMEDIATE commit immediately when Acknowledgment.acknowledge() is called
    // factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
    configurer.configure(factory, consumerFactory);
    return factory;
}
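
A listener bound to this factory receives the whole poll() batch because setBatchListener(true) is enabled, and with AckMode.MANUAL the offsets are committed only when acknowledge() is called. The sketch below uses assumed topic and group names; note that the factory also sets setAutoStartup(false), so the container has to be started through the KafkaListenerEndpointRegistry (see approach 2 further down).

@KafkaListener(id = "batch-demo", topics = "test", groupId = "group-test",
               containerFactory = "recordListenerContainerFactory")
public void onBatch(List<ConsumerRecord<Object, Object>> records, Acknowledgment ack) {
    // the whole batch returned by one poll()
    records.forEach(record -> log.info("got {} @ offset {}", record.value(), record.offset()));
    // commit the batch as a whole
    ack.acknowledge();
}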

Kafka delay queue API

1. Approach 1: two listeners, A and B. A handles the normal queue and forwards any message that needs to be delayed to B; B sleeps until the scheduled time and then sends the message back to A, where it is consumed as an ordinary message. Code as follows:

@KafkaListener(topics = "myJob")
@SendTo("myJob-delay")
public String onMessage(ConsumerRecord<?, ?> cr, Acknowledgment ack) {
    // incoming payload
    String json = (String) cr.value();
    JSONObject data = JSON.parseObject(json);
    long msToDelay = data.getLong("msToDelay");
    if (msToDelay > 0) {
        // acknowledge
        ack.acknowledge();
        // forward to the @SendTo topic
        data.put("until", System.currentTimeMillis() + msToDelay);
        return data.toString();
    }
    // normal processing
    // do real work
    // acknowledge
    ack.acknowledge();
    return null;
}

@KafkaListener(topics = "myJob-delay")
@SendTo("myJob")
public String delayMessage(ConsumerRecord<?, ?> cr, Acknowledgment ack) throws InterruptedException {
    // incoming payload
    String json = (String) cr.value();
    JSONObject data = JSON.parseObject(json);
    Long until = data.getLong("until");
    // block until the scheduled time
    while (System.currentTimeMillis() < until) {
        Thread.sleep(Math.max(0, until - System.currentTimeMillis()));
    }
    // acknowledge
    ack.acknowledge();
    // forward back to the @SendTo topic
    return json;
}

2. Approach 2: scheduled tasks that start and stop the message listener.

/**
 * Kafka listener container factory that does not start automatically
 * @param configurer
 * @return
 */
@Bean("batchFactory")
public ConcurrentKafkaListenerContainerFactory<?, ?> kafkaListenerContainerFactory(
        ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
        ConsumerFactory<Object, Object> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    // customise how messages are received
    factory.setConsumerFactory(consumerFactory);
    // enable batch consumption
    factory.setBatchListener(true);
    // do not start automatically
    factory.setAutoStartup(false);
    configurer.configure(factory, consumerFactory);
    return factory;
}

// Registry used to look up and control listener containers by id
@Resource
private KafkaListenerEndpointRegistry registry;

/**
 * Listener started and stopped on a schedule.
 * containerFactory refers to the bean name defined above.
 * @param recordList
 * @param acknowledgment
 */
@KafkaListener(id = "test-task", topics = {"test-task"}, groupId = "test-topic", containerFactory = "batchFactory")
public void listenFailEmail(List<ConsumerRecord<?, ?>> recordList, Acknowledgment acknowledgment) {
    for (ConsumerRecord<?, ?> record : recordList) {
        log.info("fail email - message: [{}].", record.toString());
    }
    acknowledgment.acknowledge();
}

@Scheduled(cron = "0 53 20 * * ?")
public void startListener() {
    log.info("Starting listener");
    MessageListenerContainer container = registry.getListenerContainer("test-task");
    if (!container.isRunning()) {
        container.start();
    }
    // resume
    container.resume();
}

@Scheduled(cron = "0 54 20 * * ?")
public void shutdownListener() {
    log.info("Stopping listener");
    // pause
    MessageListenerContainer container = registry.getListenerContainer("test-task");
    container.pause();
}

3. Approach 3: use a DelayQueue.

@Resource
private KafkaTemplate<String, String> kafkaTemplate;

// in-memory delay queue
private static DelayQueue<MyDelayQueue> delayQueue = new DelayQueue<>();

/**
 * Listener
 * @param json
 * @return
 * @throws Throwable
 */
@KafkaListener(topics = {KafkaConstants.KAFKA_TOPIC_MESSAGE_DELAY}, containerFactory = "kafkaContainerFactory")
public boolean onMessage(String json) throws Throwable {
    try {
        DelayMessage delayMessage = JSON.parseObject(json, DelayMessage.class);
        if (!isDelay(delayMessage)) {
            // the message is already due when it arrives, send it straight to the actual topic
            sendActualTopic(delayMessage);
        } else {
            // store it locally until it is due
            localStorage(delayMessage);
        }
    } catch (Throwable e) {
        log.error("consumer kafka delay message[{}] error!", json, e);
        throw e;
    }
    return true;
}

/**
 * Whether the message needs to be delayed
 * @param delayMessage
 * @return
 */
private boolean isDelay(DelayMessage delayMessage) {
    if (delayMessage.getTime().compareTo(0L) == 0) {
        return false;
    }
    return true;
}

/**
 * Send the message to its actual topic
 * @param delayMessage
 */
private void sendActualTopic(DelayMessage delayMessage) {
    kafkaTemplate.send(delayMessage.getActualTopic(), JSON.toJSONString(delayMessage));
}

/**
 * Add the message to the delay queue
 * @param delayMessage
 */
@SneakyThrows
private void localStorage(DelayMessage delayMessage) {
    delayQueue.add(new MyDelayQueue(delayMessage));
}

/**
 * Drain the delay queue. The loop runs on a daemon thread: a while(true) directly
 * inside a @PostConstruct method would block application startup.
 */
@PostConstruct
private void handleDelayQueue() {
    Thread worker = new Thread(() -> {
        while (true) {
            try {
                // take() blocks until the head of the queue is due (it never returns null)
                MyDelayQueue take = delayQueue.take();
                // forward the due message to its actual topic
                sendActualTopic(take.getDelayMessage());
            } catch (InterruptedException e) {
                log.error("handle kafka delay message error!", e);
                Thread.currentThread().interrupt();
                return;
            }
        }
    }, "kafka-delay-queue-worker");
    worker.setDaemon(true);
    worker.start();
}
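
The listener above relies on two helper classes that the original code does not show: DelayMessage, the payload, and MyDelayQueue, a wrapper that implements java.util.concurrent.Delayed so it can sit in the DelayQueue until its time is up. The sketch below is an assumed implementation; the field names are guesses based on the getters used above.

import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import lombok.Data;

@Data
class DelayMessage {
    private String actualTopic;   // topic to deliver to once the delay has elapsed
    private Long time;            // delay in milliseconds; 0 means "send immediately"
    private String payload;       // assumed business payload
}

class MyDelayQueue implements Delayed {
    private final DelayMessage delayMessage;
    private final long fireTime;  // absolute time at which the message becomes due

    MyDelayQueue(DelayMessage delayMessage) {
        this.delayMessage = delayMessage;
        this.fireTime = System.currentTimeMillis() + delayMessage.getTime();
    }

    DelayMessage getDelayMessage() {
        return delayMessage;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // remaining delay; DelayQueue.take() returns the element once this drops to zero or below
        return unit.convert(fireTime - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        // order elements by remaining delay so the soonest message is at the head
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}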

server.properties

# Unique ID of this broker within the cluster; must be a non-negative integer.
# Changing the IP address without changing broker.id does not affect consumers.
broker.id=0

# Switch to enable topic deletion or not, default value is false.
# If false, delete commands only mark topics for deletion.
delete.topic.enable=true

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092

# Port exposed to clients
port=9092
host.name=192.168.1.128

# The number of threads handling network requests
# (maximum number of threads the broker uses to process messages; usually no need to change)
num.network.threads=3

# The number of threads doing disk I/O
# (should be larger than the number of disks)
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server; socket tuning parameter
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server; socket tuning parameter
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM).
# message.max.bytes must be smaller than socket.request.max.bytes; can be overridden per topic at creation time.
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files,
# e.g. /data/kafka-logs-1,/data/kafka-logs-2
log.dirs=/tmp/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers. Overridden by the partition count given at topic creation time.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
# (Segment files are kept for 7 days by default and cleaned up afterwards; these are the threads
# that perform that recovery and cleanup of the data directories.)
num.recovery.threads.per.data.dir=1

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
# (segments older than 7 days / 168 hours are deleted)
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
# (default 1 GB per segment)
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies (milliseconds). With 1 GB segments, something has to check
# periodically whether a segment has hit that size; this is that check interval.
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
#zookeeper.connect=localhost:2181

# Consumers and brokers locate each other through ZooKeeper; these are the ZooKeeper server addresses.
zookeeper.connect=master:2181,worker1:2181,worker2:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

Project repository

https://gitee.com/hzy100java/hzy.git