Kafka Same-City Disaster Recovery Plan

Reposted

技术博客达人 2024-10-19 18:34:12


Background:

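While migrating a service to the cloud we hit a performance loss. We took many detours, checking network bandwidth, the JDK version, public-network latency, CPU, and memory in turn, before finally tracing the problem to kafka-producer. Part of the reason was that, during the investigation, we did not take the few milliseconds of latency between data centers seriously.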

Problem:

Load-testing the service in the local data center and on Alibaba Cloud gave the following results:

Local data center	Alibaba Cloud

TPS: 150K	TPS: 3K

The visible problem is that TPS on Alibaba Cloud is about 50 times lower than on the local machines.

Resolution:

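We unified the JDK version, confirmed that public bandwidth was well above the service's historical peak, checked public-network latency, compared 4-core and 8-core CPUs (not the bottleneck: with the same thread count, CPU utilization never climbed), and compared 8 GB and 16 GB of memory (we were worried about off-heap memory, but it turned out to occupy only a few MB, and there was no Full GC).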

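After a great deal of time spent on all of the above, we looked at one step in the business code: a push-status callback that sends a message to Kafka. We had been monitoring kafka-consumer all along (the consumer pulls in batches at a low frequency, so all of its metrics looked normal) but had overlooked the kafka-producer metrics. Per-method timing statistics showed that the performance loss occurred in the kafka-producer status callback. The rest of this post digs into how kafka-producer works and assesses the performance impact of running across two data centers.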
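The per-method timing that located the hot spot can be approximated with a very small wrapper. The sketch below is only an illustration: `CallTimer` is a hypothetical helper, not the tool actually used here, and the `Thread.sleep` call stands in for the suspect operation (for example, a producer send).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: accumulates wall-clock time spent inside wrapped calls. */
class CallTimer {
    private final LongAdder totalNanos = new LongAdder();
    private final LongAdder calls = new LongAdder();

    /** Runs the task, records how long it took, and returns its result. */
    <T> T time(Callable<T> task) throws Exception {
        long start = System.nanoTime();
        try {
            return task.call();
        } finally {
            // Recorded even if the task throws, so failures are not invisible.
            totalNanos.add(System.nanoTime() - start);
            calls.increment();
        }
    }

    long callCount() {
        return calls.sum();
    }

    /** Average time per call in milliseconds; 0 if nothing was recorded. */
    double averageMillis() {
        long n = calls.sum();
        return n == 0 ? 0.0 : totalNanos.sum() / 1_000_000.0 / n;
    }

    public static void main(String[] args) throws Exception {
        CallTimer timer = new CallTimer();
        for (int i = 0; i < 5; i++) {
            // Stand-in for the call under suspicion.
            timer.time(() -> { Thread.sleep(2); return null; });
        }
        System.out.printf("calls=%d avg=%.2f ms%n", timer.callCount(), timer.averageMillis());
    }
}
```

Wrapping each stage of a request in its own timer makes it easy to see which stage dominates, without having to guess from CPU or memory graphs.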

1. The path of a single message: send phase → batching phase → await-send phase → inflight phase → retry phase

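max.block.ms: bounds how long KafkaProducer.send() and KafkaProducer.partitionsFor() may block. These calls block when the send buffer is full (messages are produced faster than the producer can deliver them to the server) or when metadata is unavailable; if the block lasts longer than max.block.ms, a TimeoutException is thrown.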

batch.size: default 16 KB; a value that is too small reduces throughput.

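linger.ms: default 0 ms (no delay). It is normally raised to reduce the number of requests; set sensibly, it behaves like Nagle's algorithm in TCP. batch.size takes priority: a batch that reaches batch.size is sent without waiting for linger.ms to expire.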
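As a rough illustration of where these three settings go, the sketch below builds a `Properties` object using the standard producer configuration keys. The class name `ProducerTuning`, the broker address `broker.example:9092`, and the chosen values are all assumptions for the example, not recommendations from the tests in this post.

```java
import java.util.Properties;

/** Hypothetical helper that collects the producer settings discussed above. */
class ProducerTuning {

    /** Builds producer settings; the values are illustrative only. */
    static Properties tunedProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Upper bound on how long send()/partitionsFor() may block.
        props.put("max.block.ms", "5000");
        // Maximum batch size in bytes (64 KB here, default is 16 KB).
        props.put("batch.size", String.valueOf(64 * 1024));
        // Wait up to 5 ms for more records before sending a non-full batch.
        props.put("linger.ms", "5");
        return props;
    }

    public static void main(String[] args) {
        // "broker.example:9092" is a placeholder address.
        Properties props = tunedProps("broker.example:9092");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With the kafka-clients library on the classpath, the resulting object would typically be passed to `new KafkaProducer<>(props)`.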

2. Performance comparison under load testing

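(Note: by this step the problem had already been narrowed down to latency between the data centers. This section mainly compares the impact of that latency and looks at how to optimize.)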

batch-size	linger-ms	request-count	Alibaba Cloud latency (ms)	星光 latency (ms)
default (16K)	default (0ms)	100	327	231
default (16K)	default (0ms)	1000	3516	779
default (16K)	default (0ms)	10000	37102	7474
32K	0ms	100	515	248
32K	0ms	1000	3934	914
32K	0ms	10000	40719	7526
64K	0ms	100	380	118
64K	0ms	1000	3577	695
64K	0ms	10000	37753	6665
64K	5ms	100	468	132
64K	5ms	1000	4014	654
64K	5ms	10000	38457	6524
64K	10ms	100	388	199
64K	10ms	1000	3967	1018
64K	10ms	10000	39671	6338
160K	100ms	100	461	184
160K	100ms	1000	4187	1032
160K	100ms	10000	40235	7253

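Do not blindly increase these two parameters. Increasing batch-size gives the producer some performance improvement, but the effect of linger-ms does not match what theory predicts (the data from this experiment may not be conclusive).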
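One way to read the table is to divide each total by its request count. The sketch below does this for the default-settings rows at 10000 requests, assuming the latency columns are the total elapsed time for the whole run (the table does not say so explicitly). Under that assumption the cost is about 3.7 ms per request on Alibaba Cloud versus about 0.75 ms on 星光, a gap of roughly 3 ms per request, which would be consistent with each request paying a cross-data-center round trip, although the table alone does not prove it. `PerRequestLatency` is a hypothetical name.

```java
/** Hypothetical helper: converts a total run time into a per-request figure. */
class PerRequestLatency {

    /** Average milliseconds per request for a run of requestCount requests. */
    static double perRequestMs(long totalMs, int requestCount) {
        if (requestCount <= 0) {
            throw new IllegalArgumentException("requestCount must be positive");
        }
        return (double) totalMs / requestCount;
    }

    public static void main(String[] args) {
        // Values copied from the default (16K, 0ms) rows at 10000 requests.
        double cloud = perRequestMs(37102, 10000);
        double local = perRequestMs(7474, 10000);
        System.out.printf("Alibaba Cloud: %.2f ms/request%n", cloud);
        System.out.printf("Local:         %.2f ms/request%n", local);
        System.out.printf("Extra cost:    %.2f ms/request%n", cloud - local);
    }
}
```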

3. Q: But the producer is asynchronous, so why does increasing batch-size still have so little effect?

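A: Before a producer Record enters the Accumulator, send() first calls waitOnMetadata to look up the latest topic-partition information. When the cached metadata does not contain the topic (or the requested partition is out of range), the producing thread blocks until a MetadataRequest to the server completes; when the cache already has it, the method returns immediately with a wait time of 0. Each blocking lookup costs at least one round trip, so if it happens on many sends the delay grows with the number of messages. (At this point I wondered how to control the validity period of the metadata so that it is not fetched from the server every time.)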

The KafkaProducer.ClusterAndWaitTime waitOnMetadata method:

private KafkaProducer.ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long maxWaitMs) throws InterruptedException {
        this.metadata.add(topic);
        Cluster cluster = this.metadata.fetch();
        Integer partitionsCount = cluster.partitionCountForTopic(topic);
        if (partitionsCount == null || partition != null && partition >= partitionsCount) {
            long begin = this.time.milliseconds();
            long remainingWaitMs = maxWaitMs;

            long elapsed;
            do {
                this.log.trace("Requesting metadata update for topic {}.", topic);
                this.metadata.add(topic);
                int version = this.metadata.requestUpdate();
                this.sender.wakeup();

                try {
                    this.metadata.awaitUpdate(version, remainingWaitMs);
                } catch (TimeoutException var15) {
                    throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
                }

                cluster = this.metadata.fetch();
                elapsed = this.time.milliseconds() - begin;
                if (elapsed >= maxWaitMs) {
                    throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
                }

                if (cluster.unauthorizedTopics().contains(topic)) {
                    throw new TopicAuthorizationException(topic);
                }

                remainingWaitMs = maxWaitMs - elapsed;
                partitionsCount = cluster.partitionCountForTopic(topic);
            } while(partitionsCount == null);

            if (partition != null && partition >= partitionsCount) {
                throw new KafkaException(String.format("Invalid partition given with record: %d is not in the range [0...%d).", partition, partitionsCount));
            } else {
                return new KafkaProducer.ClusterAndWaitTime(cluster, elapsed);
            }
        } else {
            return new KafkaProducer.ClusterAndWaitTime(cluster, 0L);
        }
    }

~~metadata.max.age.ms: this is the parameter that controls how long metadata stays valid; just increase it~~ (Wrong: I misunderstood what this parameter means.)

Inside this one function there is the following call sequence:

1. Set needUpdate to true
2. Wake up the sender
3. Block in awaitUpdate

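In other words, version is incremented only after the Sender has successfully updated the metadata; until then the caller waits for up to maxWaitMs. It is enough to make you weep: whenever this path is taken, the thread must obtain metadata from the server before it is allowed to move on to the next step.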

The awaitUpdate method of Metadata shattered my hopes:

public synchronized void awaitUpdate(int lastVersion, long maxWaitMs) throws InterruptedException {
        if (maxWaitMs < 0L) {
            throw new IllegalArgumentException("Max time to wait for metadata updates should not be < 0 milliseconds");
        } else {
            long begin = System.currentTimeMillis();

            long elapsed;
            for(long remainingWaitMs = maxWaitMs; this.version <= lastVersion; remainingWaitMs = maxWaitMs - elapsed) {
                AuthenticationException ex = this.getAndClearAuthenticationException();
                if (ex != null) {
                    throw ex;
                }

                if (remainingWaitMs != 0L) {
                    this.wait(remainingWaitMs);
                }

                elapsed = System.currentTimeMillis() - begin;
                if (elapsed >= maxWaitMs) {
                    throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
                }
            }

        }
    }
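The three-step handshake described earlier (set needUpdate, wake the sender, block in awaitUpdate) can be modeled with plain `wait`/`notifyAll`. The sketch below is a simplified stand-in, not Kafka's implementation: `MetadataSketch` is hypothetical, a background thread that sleeps for `rttMs` plays the role of the Sender fetching metadata across data centers, and it throws `java.util.concurrent.TimeoutException` rather than Kafka's own exception type.

```java
import java.util.concurrent.TimeoutException;

/** Simplified, hypothetical model of the versioned metadata handshake. */
class MetadataSketch {
    private int version = 0;
    private boolean needUpdate = false;

    /** Step 1: flag that a refresh is wanted; returns the version the caller saw. */
    synchronized int requestUpdate() {
        needUpdate = true;
        return version;
    }

    /** Called by the sender-like thread once fresh metadata has arrived. */
    synchronized void update() {
        needUpdate = false;
        version++;
        notifyAll(); // releases every thread blocked in awaitUpdate
    }

    /** Step 3: block until version moves past lastVersion or maxWaitMs elapses. */
    synchronized void awaitUpdate(int lastVersion, long maxWaitMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (version <= lastVersion) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
            }
            wait(remaining);
        }
    }

    /** Returns how long the caller was blocked, in ms, when a refresh takes rttMs. */
    static long blockedMillis(long rttMs, long maxWaitMs) throws Exception {
        MetadataSketch md = new MetadataSketch();
        long start = System.nanoTime();
        int seen = md.requestUpdate();
        Thread sender = new Thread(() -> {
            try {
                Thread.sleep(rttMs); // simulated network round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            md.update();
        });
        sender.setDaemon(true);
        sender.start(); // Step 2: stands in for sender.wakeup()
        md.awaitUpdate(seen, maxWaitMs);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocked for ~" + blockedMillis(1, 1000) + " ms with a 1 ms round trip");
        System.out.println("blocked for ~" + blockedMillis(20, 1000) + " ms with a 20 ms round trip");
    }
}
```

In this model the caller's blocking time tracks the simulated round trip, which is the sense in which a few milliseconds between data centers become visible on the producing thread whenever the blocking path is taken.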