After serialization and partition assignment, the KafkaProducer main thread calls RecordAccumulator's append method to add the message to the cache, and then wakes the sender thread to process it. For the overall flow, see the earlier posts:
"Source analysis of metadata updates in the send method" and "A brief walkthrough of how KafkaProducer sends messages"
Here sender.wakeup() ultimately calls the underlying nioSelector's wakeup method: the selector blocks while waiting for channel events, and wakeup unblocks it. For the sender thread itself, see the 【sender thread analysis post】.
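As a side note on the mechanism: this is the standard Java NIO contract, where a thread blocked in Selector.select() can be unblocked from another thread via Selector.wakeup(). A minimal, self-contained sketch of that primitive (plain NIO, not Kafka code):

import java.nio.channels.Selector;

public class WakeupDemo {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();

        Thread poller = new Thread(() -> {
            try {
                // Blocks until a channel is ready or wakeup() is called
                selector.select();
                System.out.println("woken up");
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        poller.start();

        Thread.sleep(100);
        // Another thread (here playing the KafkaProducer main thread) unblocks the poller
        selector.wakeup();
        poller.join();
    }
}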
1. A brief introduction to RecordAccumulator
RecordAccumulator is the producer's cache. Kafka uses it to send messages in batches, which improves network throughput. Concretely, RecordAccumulator maintains a ConcurrentMap<TopicPartition, Deque<ProducerBatch>> named batches: messages bound for the same topic partition are placed in one double-ended queue, forming a group, so the sender thread can send the whole group as one batch.
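To make that layout concrete, here is a minimal sketch of the mapping (ConcurrentHashMap and a generic Batch placeholder are used for illustration only; the real RecordAccumulator uses Kafka's own CopyOnWriteMap and the internal ProducerBatch type):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.apache.kafka.common.TopicPartition;

// Sketch only: one deque of batches per partition, keyed by TopicPartition.
// Batch is a stand-in for Kafka's internal ProducerBatch.
class AccumulatorSketch<Batch> {
    private final ConcurrentMap<TopicPartition, Deque<Batch>> batches = new ConcurrentHashMap<>();

    // All appends for the same partition land in the same deque, so the sender
    // thread can drain a whole batch and ship it in one produce request.
    Deque<Batch> getOrCreateDeque(TopicPartition tp) {
        return batches.computeIfAbsent(tp, k -> new ArrayDeque<>());
    }
}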
So the question is: how does the producer put a message into the RecordAccumulator cache, and what processing does the message go through along the way? Let's look at the source.
2. When RecordAccumulator is initialized
Before analyzing RecordAccumulator's API, we need to see where it is instantiated.
It is created in KafkaProducer's core constructor; in other words, instantiating a KafkaProducer creates a dedicated RecordAccumulator for that producer:
this.accumulator = new RecordAccumulator(logContext,
        config.getInt(ProducerConfig.BATCH_SIZE_CONFIG),
        this.totalMemorySize,
        this.compressionType,
        config.getLong(ProducerConfig.LINGER_MS_CONFIG),
        retryBackoffMs,
        metrics,
        time,
        apiVersions,
        transactionManager);
From this initialization we can see that RecordAccumulator reads the following configuration into itself:
batch.size, buffer.memory, compression.type, linger.ms, retry.backoff.ms
It also registers its metrics with Metrics. The individual settings are analyzed in detail in the dedicated configuration post.
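For reference, these settings come in through the producer configuration; a quick example of setting them via the ProducerConfig constants (the values are illustrative, not recommendations):

Properties props = new Properties();
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);        // batch.size: max bytes per ProducerBatch
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432L); // buffer.memory: total accumulator memory
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // compression.type
props.put(ProducerConfig.LINGER_MS_CONFIG, 5);             // linger.ms: wait to fill batches
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100L);   // retry.backoff.ms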
3. Source analysis of the append method
public RecordAppendResult append(TopicPartition tp, long timestamp, byte[] key, byte[] value, Header[] headers,
                                 Callback callback, long maxTimeToBlock) throws InterruptedException {
    // We keep track of the number of appending threads to make sure we do not miss batches in
    // abortIncompleteBatches().
    appendsInProgress.incrementAndGet();
    ByteBuffer buffer = null;
    if (headers == null)
        headers = Record.EMPTY_HEADERS;
    try {
        // check if we have an in-progress batch
        // Get the deque for this record's target partition; create it if it does not exist yet
        Deque<ProducerBatch> dq = getOrCreateDeque(tp);
        synchronized (dq) {
            if (closed)
                throw new KafkaException("Producer closed while send in progress");
            // Try to append to an existing batch in the deque
            RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
            if (appendResult != null)
                // Append succeeded; return immediately
                return appendResult;
        }
        // we don't have an in-progress record batch try to allocate a new batch
        byte maxUsableMagic = apiVersions.maxUsableProduceMagic();
        int size = Math.max(this.batchSize,
                AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers));
        log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(),
                tp.partition());
        // The append failed, so allocate a new buffer, sized at batch.size or the
        // record's estimated upper-bound size, whichever is larger (may block)
        buffer = free.allocate(size, maxTimeToBlock);
        synchronized (dq) {
            // Need to check if producer is closed again after grabbing the dequeue lock.
            if (closed)
                throw new KafkaException("Producer closed while send in progress");
            RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
            if (appendResult != null) {
                // Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen
                // often...
                return appendResult;
            }
            MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic);
            // Create a new ProducerBatch backed by the freshly allocated buffer
            ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds());
            FutureRecordMetadata future =
                    Utils.notNull(batch.tryAppend(timestamp, key, value, headers, callback, time.milliseconds()));
            // Append the new batch to the tail of the deque
            dq.addLast(batch);
            // Track the batch in the incomplete (not yet acknowledged) batch set
            incomplete.add(batch);
            // Don't deallocate this buffer in the finally block as it's being used in the record batch
            buffer = null;
            return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true);
        }
    } finally {
        if (buffer != null)
            free.deallocate(buffer);
        appendsInProgress.decrementAndGet();
    }
}
3.1 The process of putting a message into the cache:
1. Look up the deque for the message's partition in the map, creating a new ArrayDeque if none exists. Because ArrayDeque is not a thread-safe collection, the deque is locked (synchronized) while the message is being appended (see the locking sketch after this list).
2. Try to append the message to the deque; if that succeeds, return.
3. If it fails, allocate a new buffer, create a new ProducerBatch, call its tryAppend method to wrap the message into the batch, and add the batch to the tail of the deque.
4. Add the batch to the incomplete set (IncompleteBatches), which tracks the ProducerBatches that have not yet finished sending.
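One detail worth calling out is the shape of append: try under the lock, allocate outside the lock (free.allocate may block until memory is available, so it must not hold the deque lock), then re-check under the lock, since another thread may have created a usable batch in the meantime. A generic, runnable sketch of that pattern with hypothetical types (StringBuilder standing in for ProducerBatch):

import java.util.ArrayDeque;
import java.util.Deque;

// Generic illustration of the pattern used by append() (hypothetical types):
// 1) fast path under the lock, 2) potentially blocking allocation OUTSIDE
// the lock, 3) re-check under the lock before committing the new batch.
class Batcher {
    private final Deque<StringBuilder> dq = new ArrayDeque<>();
    private final int batchSize = 16;

    private boolean tryAppend(String msg) { // callers must hold the dq lock
        StringBuilder last = dq.peekLast();
        if (last != null && last.length() + msg.length() <= batchSize) {
            last.append(msg);
            return true;
        }
        return false;
    }

    void append(String msg) {
        synchronized (dq) {
            if (tryAppend(msg)) return;          // fast path
        }
        StringBuilder buffer = new StringBuilder(batchSize); // "allocate" outside the lock
        synchronized (dq) {
            if (tryAppend(msg)) return;          // another thread may have made room
            buffer.append(msg);                  // otherwise start a new batch
            dq.addLast(buffer);
        }
    }
}

In the real append, the buffer allocated by the losing thread is returned to the BufferPool in the finally block, as the source above shows.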
3.2 To dig deeper into the caching process, we still need to look at the source behind these questions:
1. What information does the returned RecordAppendResult carry, and what is returned under which conditions?
2. What exactly happens in step 2, when we try to append the message to the deque?
3. How does ProducerBatch encapsulate the message, and through which steps?
4. From the source, the Deque is just a container for ProducerBatch objects, so a plain queue would seem sufficient; why a double-ended queue? The answer is retries: when a batch fails to send, it is put back at the head of the deque so it gets resent first (see the sketch below).
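For question 4, the relevant code path is RecordAccumulator's reenqueue, which the sender calls for batches that need to be retried. A simplified sketch (the real method also records retry statistics and, when a transaction manager is present, inserts in sequence order):

// Simplified sketch of RecordAccumulator#reenqueue: a failed batch goes back
// to the HEAD of its partition's deque, which is exactly why a Deque is used
// instead of a plain FIFO queue.
public void reenqueue(ProducerBatch batch, long now) {
    Deque<ProducerBatch> deque = getOrCreateDeque(batch.topicPartition);
    synchronized (deque) {
        deque.addFirst(batch); // the retried batch jumps ahead of newer batches
    }
}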
Question 1: RecordAppendResult
RecordAppendResult is an inner class of RecordAccumulator that bundles the following information.
future is the future returned by KafkaProducer's send method; from it the caller can obtain the resulting RecordMetadata.
That is as far as we will take RecordAppendResult for now; FutureRecordMetadata will be analyzed in a later post.
public final static class RecordAppendResult {
    // the future for the appended record
    public final FutureRecordMetadata future;
    // whether the batch is now full
    public final boolean batchIsFull;
    // whether a new batch was created for this append
    public final boolean newBatchCreated;
    // (constructor omitted)
}
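The two booleans are what the caller uses to decide whether to wake the sender thread. In KafkaProducer's doSend the logic is essentially the following (abridged):

// Abridged from KafkaProducer#doSend: wake the sender as soon as a batch is
// full or a new batch had to be created, so it can start draining the cache.
RecordAccumulator.RecordAppendResult result = accumulator.append(
        tp, timestamp, serializedKey, serializedValue, headers, interceptCallback, remainingWaitMs);
if (result.batchIsFull || result.newBatchCreated) {
    this.sender.wakeup();
}
return result.future;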
Question 2: analysis of tryAppend
private RecordAppendResult tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers,
                                     Callback callback, Deque<ProducerBatch> deque) {
    // Peek at the ProducerBatch at the tail of the deque (without removing it)
    ProducerBatch last = deque.peekLast();
    if (last != null) {
        // Delegate to ProducerBatch's own tryAppend
        FutureRecordMetadata future = last.tryAppend(timestamp, key, value, headers, callback, time.milliseconds());
        if (future == null)
            last.closeForRecordAppends();
        else
            return new RecordAppendResult(future, deque.size() > 1 || last.isFull(), false);
    }
    return null;
}
This tries to append the message to the deque: it peeks at the ProducerBatch at the tail (peekLast reads but does not remove it) and appends the message to that batch. If the batch is already full, it is closed for further appends and null is returned, signalling the caller to allocate a new batch.
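Note the choice of peekLast over pollLast: the batch must stay in the deque so that later records can still be appended to it and the sender can find it. A quick illustration of the Deque API difference:

import java.util.ArrayDeque;
import java.util.Deque;

public class PeekVsPoll {
    public static void main(String[] args) {
        Deque<String> dq = new ArrayDeque<>();
        dq.addLast("batch-1");
        System.out.println(dq.peekLast()); // batch-1 -- still in the deque
        System.out.println(dq.size());     // 1
        System.out.println(dq.pollLast()); // batch-1 -- removed this time
        System.out.println(dq.size());     // 0
    }
}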
Question 3: analysis of ProducerBatch.tryAppend
public FutureRecordMetadata tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers, Callback callback,
                                      long now) {
    // Check whether there is still enough room to append this record
    if (!recordsBuilder.hasRoomFor(timestamp, key, value, headers)) {
        return null;
    } else {
        // Append the record to the underlying MemoryRecordsBuilder (returns the record's checksum)
        Long checksum = this.recordsBuilder.append(timestamp, key, value, headers);
        this.maxRecordSize = Math.max(this.maxRecordSize, AbstractRecords.estimateSizeInBytesUpperBound(magic(),
                recordsBuilder.compressionType(), key, value, headers));
        // Record the append time
        this.lastAppendTime = now;
        // Wrap the result in a FutureRecordMetadata
        FutureRecordMetadata future = new FutureRecordMetadata(this.produceFuture, this.recordCount, timestamp,
                checksum, key == null ? -1 : key.length, value == null ? -1 : value.length);
        // we have to keep every future returned to the users in case the batch needs to be
        // split to several new batches and resent.
        thunks.add(new Thunk(callback, future));
        this.recordCount++;
        return future;
    }
}
The call chain from here on gets more involved:
this.recordsBuilder.append(timestamp, key, value, headers)
// wrapNullable(key) wraps the byte[] payload in a ByteBuffer (a null array stays null)
---> MemoryRecordsBuilder.append()
  ---> appendWithOffset()   // fills in the next sequential (absolute) offset
    ---> appendWithOffset() // the overload that does the actual work
      ---> appendDefaultRecord()
        ---> DefaultRecord.writeTo()  // writes the record fields from the buffer into the output stream
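The offset bookkeeping along this chain is simple arithmetic: each appended record gets the next sequential absolute offset, while only the delta relative to the batch's base offset is encoded into a v2 record. A simplified sketch of that logic (field and method names follow MemoryRecordsBuilder, but this is not the full implementation):

// Simplified sketch of the offset arithmetic in MemoryRecordsBuilder
class OffsetArithmeticSketch {
    private final long baseOffset = 100L; // first offset of the batch
    private Long lastOffset = null;       // offset of the last appended record
                                          // (updating it after each append is omitted here)

    // the first record gets baseOffset; every later one gets lastOffset + 1
    long nextSequentialOffset() {
        return lastOffset == null ? baseOffset : lastOffset + 1;
    }

    // only the delta relative to baseOffset is written into the record,
    // which keeps individual records inside a batch compact
    int offsetDelta(long offset) {
        return (int) (offset - baseOffset);
    }
}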
Questions to think about:
1. The main thread keeps putting messages into the cache while the sender thread drains it; how do the ProducerBatches evolve when the two run at different rates?
2. How do the ProducerBatches behave when messages are sent synchronously?