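Three Kafka brokers had just been installed and started on the cluster. After the code was packaged, uploaded to the cluster, and run, it never consumed any data.
Debugging locally in IDEA showed that the program was blocked in the method below, stuck in an endless loop.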
/**
 * Block until the coordinator for this group is known and is ready to receive requests.
 * That is, wait until we have a connection to the server-side GroupCoordinator.
 */
public void ensureCoordinatorReady() {
    while (coordinatorUnknown()) { // the GroupCoordinator is not known yet
        RequestFuture<Void> future = sendGroupCoordinatorRequest(); // send the lookup request
        client.poll(future); // wait synchronously for the result of the asynchronous call
        if (future.failed()) {
            if (future.isRetriable())
                client.awaitMetadataUpdate();
            else
                throw future.exception();
        } else if (coordinator != null && client.connectionFailed(coordinator)) {
            // we found the coordinator, but the connection has failed, so mark
            // it dead and backoff before retrying discovery
            coordinatorDead();
            time.sleep(retryBackoffMs); // wait a while, then retry
        }
    }
}
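The loop above only exits once a coordinator is found, which is why the application hangs silently instead of failing. As a minimal sketch of the alternative (this is not Kafka's code; `BoundedWait` and `awaitReady` are hypothetical names introduced for illustration), a readiness check can be wrapped in a deadline so that the caller gets a clear "not ready" result:

```java
import java.util.function.BooleanSupplier;

final class BoundedWait {
    /**
     * Polls `ready` until it returns true or `timeoutMs` has elapsed,
     * sleeping `backoffMs` between attempts.
     * Returns true if the condition became ready, false on timeout or interrupt.
     */
    static boolean awaitReady(BooleanSupplier ready, long timeoutMs, long backoffMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!ready.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of looping forever
            }
            try {
                Thread.sleep(backoffMs); // back off before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

With a wrapper like this, a caller could log "coordinator not available after N ms" and point the operator at the broker side, rather than leaving a thread spinning.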
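Roughly, the flow is:
- The consumer first finds out which broker in the cluster acts as the coordinator for its group
- The consumers in the group then send join requests to the coordinator, and one of them is to become the leader of the consumer group
- In the end exactly one consumer becomes the consumer leader, and the other consumers become followers
- The consumer leader computes the partition assignment and syncs it to the coordinator
- The consumer followers fetch the partition assignment from the coordinator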
The problem is in the first step: the consumer cannot establish a connection to the server-side GroupCoordinator, so the program waits indefinitely.
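To see why the `__consumer_offsets` topic is the place to look: as far as I know, the broker side maps a group to one partition of `__consumer_offsets` by hashing the group id, and the leader of that partition serves as the group's coordinator. The sketch below reproduces that mapping under this assumption; the class name and the group id `my-group` are hypothetical, and the partition count of 50 is the value shown in the topic description in this post.

```java
final class CoordinatorPartition {
    // Math.abs(Integer.MIN_VALUE) is still negative, so map that one case to 0.
    static int abs(int n) {
        return (n == Integer.MIN_VALUE) ? 0 : Math.abs(n);
    }

    // Partition of __consumer_offsets that a group id is assumed to map to.
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        return abs(groupId.hashCode()) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        // "my-group" is a placeholder; pass the real group.id as the first argument.
        String groupId = args.length > 0 ? args[0] : "my-group";
        System.out.println(groupId + " -> __consumer_offsets-" + partitionFor(groupId, 50));
    }
}
```

If the leader of the resulting partition is a broker that no longer exists, the coordinator lookup for that group cannot succeed.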
I checked the state of the __consumer_offsets topic: all 50 partitions were on the broker with broker id 152.
bin/kafka-topics.sh --describe --zookeeper localhost:2182 --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 152 Replicas: 152 Isr:152
Topic: __consumer_offsets Partition: 1 Leader: 152 Replicas: 152 Isr:152
Topic: __consumer_offsets Partition: 2 Leader: 152 Replicas: 152 Isr:152
Topic: __consumer_offsets Partition: 3 Leader: 152
......
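However, the cluster has no node with broker id 152. Kafka nodes had been added to and removed from this cluster in the past, so my initial conclusion was that 152 was a former Kafka node: it was removed and new nodes were added later, but the data in ZooKeeper was never updated.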
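This check can be automated. The following is a minimal sketch (the class and method names are hypothetical) that scans the lines printed by `kafka-topics.sh --describe` and reports partitions whose leader is not among the live broker ids. It assumes the output format shown in this post and that the live ids are collected separately, for example from the `/brokers/ids` node in ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StaleLeaderCheck {
    // Matches e.g. "Partition: 0 Leader: 152"; the summary line ("PartitionCount:50") does not match.
    private static final Pattern LINE =
            Pattern.compile("Partition:\\s*(\\d+)\\s+Leader:\\s*(-?\\d+)");

    /** Returns the partition numbers whose leader is not in liveBrokerIds. */
    static List<Integer> partitionsWithDeadLeader(List<String> describeLines,
                                                  Set<Integer> liveBrokerIds) {
        List<Integer> stale = new ArrayList<>();
        for (String line : describeLines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                int partition = Integer.parseInt(m.group(1));
                int leader = Integer.parseInt(m.group(2));
                if (!liveBrokerIds.contains(leader)) {
                    stale.add(partition);
                }
            }
        }
        return stale;
    }
}
```

Fed the first describe output in this post and the live ids 420, 421, 422, every partition line that names leader 152 would be reported.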
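So I shut down the brokers, opened the ZooKeeper client, deleted the __consumer_offsets node under the topics node of the brokers node, and then restarted the brokers. Note that at this point __consumer_offsets has not yet been recreated in ZooKeeper; it is only created once a consumer is started.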
Checking __consumer_offsets again afterwards, the partitions are now evenly distributed across the three brokers:
bin/kafka-topics.sh --zookeeper localhost:2182 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 1 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 2 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 3 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 4 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 5 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 6 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 7 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 8 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 9 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 10 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 11 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 12 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 13 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 14 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 15 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 16 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 17 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 18 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 19 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 20 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 21 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 22 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 23 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 24 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 25 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 26 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 27 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 28 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 29 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 30 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 31 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 32 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 33 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 34 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 35 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 36 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 37 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 38 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 39 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 40 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 41 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 42 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 43 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 44 Leader: 422 Replicas: 422,420,421 Isr: 422,420,421
Topic: __consumer_offsets Partition: 45 Leader: 420 Replicas: 420,422,421 Isr: 420,422,421
Topic: __consumer_offsets Partition: 46 Leader: 421 Replicas: 421,420,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 47 Leader: 422 Replicas: 422,421,420 Isr: 422,420,421
Topic: __consumer_offsets Partition: 48 Leader: 420 Replicas: 420,421,422 Isr: 420,422,421
Topic: __consumer_offsets Partition: 49 Leader: 421 Replicas: 421,422,420 Isr: 422,420,421
After restarting the program at this point, it consumed data normally. Problem solved.
References: