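Contents

  • I. Reading __consumer_offsets in Kafka
  • 1. Create the topic "test":
  • 2. Produce messages with the kafka-console-producer.sh script:
  • 3. Verify that the messages were produced:
  • 4. Create a console consumer group:
  • 5. Get the group id of that consumer group:
  • 6. Query the full contents of the __consumer_offsets topic:
  • 7. Work out which __consumer_offsets partition holds a given consumer group:
  • 8. Fetch the offset information of a given consumer group:
  • II. AdminClient, the client-side admin tool in Kafka 0.11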


I. Reading __consumer_offsets in Kafka

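Note: this experiment depends on the Kafka version. It worked for me on kafka_2.11-0.9.0.1 and kafka_2.10-0.10.1.0, but not on the older kafka_2.10-0.8.2.0 (which never created __consumer_offsets at all) or on the latest kafka_2.11-0.11.0.0 (where step 6 fails with Exception in thread "main" java.lang.ClassNotFoundException: kafka.coordinator.GroupMetadataManager$OffsetsMessageFormatter; in 0.11 the formatter class appears to have moved to the kafka.coordinator.group package, so the class name used here no longer resolves).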

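  As is well known, ZooKeeper is not suited to large volumes of frequent writes, so newer Kafka versions (after 0.8) recommend storing consumer offsets in an internal Kafka topic, __consumer_offsets, and ship the kafka-consumer-groups.sh script for inspecting consumer information.

  Even so, many users still want to know what exactly is stored inside the __consumer_offsets topic, and in particular how the offsets of a given consumer group are kept there. This article walks through an example of querying that internal topic with the kafka-simple-consumer-shell.sh script.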

1. Create the topic "test":
[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-topics.sh --create --zookeeper h153:2181 --replication-factor 1 --partitions 2 --topic test
2. Produce messages with the kafka-console-producer.sh script:

Four messages were produced in this example.

[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-console-producer.sh --broker-list h153:9092 --topic test
3. Verify that the messages were produced:
[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list h153:9092 --topic test --time -1
test:1:2
test:0:2

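Parameter notes:
--time -1 requests the latest offset of each partition (-2 would request the earliest)
Each output line has the fields topic, partition and untilOffset, separated by colons
The output above shows that 4 messages were produced in total (2 in each partition)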
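
As a rough illustration of how the figure of 4 is obtained, the sketch below parses the topic:partition:offset lines printed by GetOffsetShell and sums the offsets. The class name OffsetSum and the method totalMessages are made up for this example, and it assumes every partition starts at offset 0 (nothing removed by retention yet), which holds for a freshly created topic.

```java
import java.util.Arrays;
import java.util.List;

public class OffsetSum {

    // Sums the last field of each "topic:partition:offset" line.
    // Assumption: every partition starts at offset 0, so the latest
    // offset equals the number of messages in that partition.
    public static long totalMessages(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            String[] fields = line.trim().split(":");
            if (fields.length != 3) {
                continue; // skip anything that is not topic:partition:offset
            }
            total += Long.parseLong(fields[2]);
        }
        return total;
    }

    public static void main(String[] args) {
        // The two lines printed by GetOffsetShell in this walkthrough
        List<String> output = Arrays.asList("test:1:2", "test:0:2");
        System.out.println("total messages: " + totalMessages(output));
    }
}
```

On a topic where old segments have already been deleted, the earliest offsets (--time -2) would have to be subtracted to get the real count.
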

4. Create a console consumer group:
[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-console-consumer.sh --bootstrap-server h153:9092 --topic test --from-beginning --new-consumer

In the terminal where the Kafka broker was started, you will see output like this:

[2017-09-26 21:49:54,454] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,32] (kafka.coordinator.GroupMetadataManager)
[2017-09-26 21:49:54,457] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,32] in 3 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-09-26 21:49:54,457] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,35] (kafka.coordinator.GroupMetadataManager)
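Note: by default __consumer_offsets has 50 partitions (broker setting offsets.topic.num.partitions).

Running bin/kafka-topics.sh --list --zookeeper h153:2181 will now show that __consumer_offsets has been created.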

5. Get the group id of that consumer group:

The id is needed later to look up the group's offset information.

[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-consumer-groups.sh --bootstrap-server h153:9092 --list --new-consumer
Output: console-consumer-88985  (remember this id!)
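6. Query the full contents of the __consumer_offsets topic:

Note: before running the command below, set exclude.internal.topics=false in consumer.properties. Otherwise the command hangs after it starts and cannot be stopped even with Ctrl+C.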

[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-console-consumer.sh --topic __consumer_offsets --zookeeper h153:2181 --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter" --consumer.config config/consumer.properties --from-beginning
[console-consumer-88985,test,1]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800225,ExpirationTime 1506520200225]
[console-consumer-88985,test,0]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800225,ExpirationTime 1506520200225]
[console-consumer-88985,test,1]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800326,ExpirationTime 1506520200326]
[console-consumer-88985,test,0]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800326,ExpirationTime 1506520200326]
Note: when running this command a second time, add --delete-consumer-offsets.
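
To make the fields in these records easier to read, here is a small sketch that parses one output line and derives the gap between CommitTime and ExpirationTime. The regular expression and the class name OffsetRecord are assumptions based only on the output format shown above, not on any Kafka API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetRecord {

    // Matches lines of the form:
    // [group,topic,partition]::[OffsetMetadata[offset,metadata],CommitTime t1,ExpirationTime t2]
    private static final Pattern LINE = Pattern.compile(
            "\\[(.+),([^,]+),(\\d+)\\]::\\[OffsetMetadata\\[(\\d+),(.*)\\],"
            + "CommitTime (\\d+),ExpirationTime (\\d+)\\]");

    public final String group;
    public final String topic;
    public final int partition;
    public final long offset;
    public final Instant commitTime;
    public final Instant expirationTime;

    private OffsetRecord(Matcher m) {
        group = m.group(1);
        topic = m.group(2);
        partition = Integer.parseInt(m.group(3));
        offset = Long.parseLong(m.group(4));
        // Both timestamps are milliseconds since the Unix epoch
        commitTime = Instant.ofEpochMilli(Long.parseLong(m.group(6)));
        expirationTime = Instant.ofEpochMilli(Long.parseLong(m.group(7)));
    }

    public static OffsetRecord parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized line: " + line);
        }
        return new OffsetRecord(m);
    }

    // Time between commit and expiration, i.e. how long the offset is kept.
    public Duration retention() {
        return Duration.between(commitTime, expirationTime);
    }

    public static void main(String[] args) {
        OffsetRecord r = parse("[console-consumer-88985,test,1]::[OffsetMetadata[2,NO_METADATA],"
                + "CommitTime 1506433800225,ExpirationTime 1506520200225]");
        System.out.println(r.group + " " + r.topic + "-" + r.partition + " offset=" + r.offset);
        System.out.println("committed at " + r.commitTime
                + ", retained for " + r.retention().toHours() + "h");
    }
}
```

For the records above the gap is 86400000 ms, i.e. 24 hours, which is consistent with a broker-side offsets.retention.minutes of 1440 in these versions.
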
7. Work out which __consumer_offsets partition holds a given consumer group:

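  This is where the group.id obtained in step 5 comes in (console-consumer-88985 in this example). Kafka uses the following formula to decide which partition of __consumer_offsets stores the group's offsets: Math.abs(groupID.hashCode()) % numPartitions

  In this example the partition is therefore Math.abs("console-consumer-88985".hashCode()) % 50 = 39, i.e. partition 39 of __consumer_offsets holds this consumer group's offsets. Let's verify that below. (A one-line Java program, System.out.println(Math.abs("console-consumer-88985".hashCode()) % 50);, prints the result.)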
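
The calculation can be wrapped in a small runnable class. This is a minimal sketch of the formula quoted above rather than Kafka's own code; the class and method names are made up, and the guard for Integer.MIN_VALUE is a defensive choice of this example, because Math.abs returns a negative number for that one input.

```java
public class OffsetsPartition {

    // Applies abs(groupId.hashCode()) % numPartitions.
    // Math.abs(Integer.MIN_VALUE) is still negative, so that single case
    // is mapped to 0 here (an assumption made for this sketch).
    public static int partitionFor(String groupId, int numPartitions) {
        int hash = groupId.hashCode();
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % numPartitions;
    }

    public static void main(String[] args) {
        // 50 is the default partition count of __consumer_offsets
        System.out.println(partitionFor("console-consumer-88985", 50)); // prints 39
    }
}
```

If the broker was configured with a different number of partitions for __consumer_offsets, pass that number instead of 50.
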

8. Fetch the offset information of a given consumer group:
[hadoop@h153 kafka_2.10-0.10.1.0]$ bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets --partition 39 --broker-list h153:9092 --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter"
[console-consumer-88985,test,1]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800225,ExpirationTime 1506520200225]
[console-consumer-88985,test,0]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800225,ExpirationTime 1506520200225]
[console-consumer-88985,test,1]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800326,ExpirationTime 1506520200326]
[console-consumer-88985,test,0]::[OffsetMetadata[2,NO_METADATA],CommitTime 1506433800326,ExpirationTime 1506520200326]
Note: if you replace 39 with any other number, none of the output above appears.

II. AdminClient, the client-side admin tool in Kafka 0.11

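  Many users need to operate a Kafka cluster directly through a programmatic API. Before 0.11, Kafka's server-side code (i.e. the kafka_2.** dependency) provided AdminClient and AdminUtils, which offered some cluster management operations, but the project website gave no usage documentation for these two classes. Users had to read the source code and test cases to learn how to use them. With the client API (i.e. the kafka-clients dependency), users had to construct specific requests and write their own code to open a socket connection to a given broker and send them, which was just as tedious. Kafka 0.11 therefore introduced a client-side AdminClient tool. Note that although it shares its name with the old server-side AdminClient class, this tool belongs to the client, so you need to add the kafka-clients dependency to your program. With Gradle, for example, add compile group: 'org.apache.kafka', name: 'kafka-clients', version: '0.11.0.0'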

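  The tool provides the following features:

  • Create topics
  • List all topics
  • Describe a single topic
  • Delete topics
  • Alter configs (for both BROKER and TOPIC resources)
  • Describe a resource's configs
  • Create ACLs
  • Describe ACLs
  • Delete ACLs
  • Describe the whole cluster

  This class is used the same way as the Java clients: instead of connecting to ZooKeeper, you simply supply a list of brokers in the cluster.

  Below is a test example for the class, with sample code for every operation except the ACL ones: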

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.concurrent.ExecutionException;
 
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.clients.admin.CreateTopicsResult;
import org.apache.kafka.clients.admin.DescribeClusterResult;
import org.apache.kafka.clients.admin.DescribeConfigsResult;
import org.apache.kafka.clients.admin.DescribeTopicsResult;
import org.apache.kafka.clients.admin.ListTopicsOptions;
import org.apache.kafka.clients.admin.ListTopicsResult;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.KafkaFuture;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.config.ConfigResource;
 
public class AdminClientTest {
	 
    private static final String TEST_TOPIC = "hui";
 
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "h153:9092");
 
        try (AdminClient client = AdminClient.create(props)) {
            createTopics(client);
            describeCluster(client);
            listAllTopics(client);
            describeTopics(client);
            alterConfigs(client);
            describeConfig(client);
            deleteTopics(client);
        }
    }
 
    public static void createTopics(AdminClient client) throws ExecutionException, InterruptedException {
        // 1 partition, replication factor 1
        NewTopic newTopic = new NewTopic(TEST_TOPIC, 1, (short)1);
        CreateTopicsResult ret = client.createTopics(Arrays.asList(newTopic));
        ret.all().get(); // block until the broker has processed the request
        System.out.println("Topic created");
    }
    
    public static void describeCluster(AdminClient client) throws ExecutionException, InterruptedException {
        DescribeClusterResult ret = client.describeCluster();
        System.out.println(String.format("Cluster id-->%s ; controller-->%s", ret.clusterId().get(), ret.controller().get()));
        System.out.print("Current cluster nodes info-->");
        for (Node node : ret.nodes().get()) {
            System.out.println(node);
        }
    }
    
    public static void listAllTopics(AdminClient client) throws ExecutionException, InterruptedException {
        ListTopicsOptions options = new ListTopicsOptions();
        options.listInternal(true); // includes internal topics such as __consumer_offsets
        ListTopicsResult topics = client.listTopics(options);
        Set<String> topicNames = topics.names().get();
        System.out.println("Current topics in this cluster: " + topicNames);
    }
    
    public static void describeTopics(AdminClient client) throws ExecutionException, InterruptedException {
        DescribeTopicsResult ret = client.describeTopics(Arrays.asList(TEST_TOPIC, "__consumer_offsets"));
        Map<String, TopicDescription> topics = ret.all().get();
        for (Map.Entry<String, TopicDescription> entry : topics.entrySet()) {
            System.out.println(entry.getKey() + " ===> " + entry.getValue());
        }
    }
 
    public static void alterConfigs(AdminClient client) throws ExecutionException, InterruptedException {
        Config topicConfig = new Config(Arrays.asList(new ConfigEntry("cleanup.policy", "compact")));
        client.alterConfigs(Collections.singletonMap(
                new ConfigResource(ConfigResource.Type.TOPIC, TEST_TOPIC), topicConfig)).all().get();
    }
    
    public static void describeConfig(AdminClient client) throws ExecutionException, InterruptedException {
        DescribeConfigsResult ret = client.describeConfigs(Collections.singleton(new ConfigResource(ConfigResource.Type.TOPIC, TEST_TOPIC)));
        Map<ConfigResource, Config> configs = ret.all().get();
        for (Map.Entry<ConfigResource, Config> entry : configs.entrySet()) {
            ConfigResource key = entry.getKey();
            Config value = entry.getValue();
            System.out.println(String.format("Resource type: %s, resource name: %s", key.type(), key.name()));
            Collection<ConfigEntry> configEntries = value.entries();
            for (ConfigEntry each : configEntries) {
                System.out.println(each.name() + " = " + each.value());
            }
        }
    }
 
    public static void deleteTopics(AdminClient client) throws ExecutionException, InterruptedException {
        KafkaFuture<Void> futures = client.deleteTopics(Arrays.asList(TEST_TOPIC)).all();
        futures.get(); // block until the deletion request has been processed
        System.out.println("Topic deleted");
    }
}

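  One last point: because this class essentially sends requests asynchronously and then waits for the result, every returned result is wrapped in a KafkaFuture, which implements Java's Future interface. Since it is a Future, users can decide for themselves whether to receive the result asynchronously or to wait for it synchronously. This example makes heavy use of KafkaFuture.get(), i.e. it waits synchronously.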
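
The difference between the two styles can be sketched with JDK classes alone. The example below uses java.util.concurrent.CompletableFuture purely as a stand-in for KafkaFuture so that it runs without a broker; fakeCreateTopic is a hypothetical placeholder for an AdminClient call, and the callback methods available on KafkaFuture itself vary by Kafka version, so check the Javadoc of the version you use.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureStyles {

    // Hypothetical stand-in for an asynchronous admin call; not a Kafka API.
    public static CompletableFuture<String> fakeCreateTopic(String topic) {
        return CompletableFuture.supplyAsync(() -> "created " + topic);
    }

    // Synchronous style: block until the result arrives, with an upper bound.
    // get(timeout, unit) is declared by java.util.concurrent.Future itself.
    public static String waitFor(Future<String> future)
            throws InterruptedException, ExecutionException, TimeoutException {
        return future.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        // Synchronous: the caller stops here until the operation completes.
        System.out.println("sync:  " + waitFor(fakeCreateTopic("hui")));

        // Asynchronous: register a callback and carry on with other work.
        CompletableFuture<Void> done = fakeCreateTopic("hui")
                .thenAccept(result -> System.out.println("async: " + result));
        System.out.println("main thread is free to do other work");

        done.join(); // only so the demo does not exit before the callback runs
    }
}
```

Using the timed get instead of the plain get() is a small design choice: it keeps a tool from hanging forever when the cluster is unreachable.
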