List all topics: kafka-topics.sh --zookeeper hadoop102:2181 --list
Describe a topic: kafka-topics.sh --zookeeper hadoop102:2181 --describe --topic first
Create a topic: kafka-topics.sh --zookeeper hadoop102:2181 --create --replication-factor 3 --partitions 1 --topic first
Note: (1) a topic can have multiple partitions; (2) the replication factor cannot exceed the number of brokers
Delete a topic: kafka-topics.sh --zookeeper hadoop102:2181 --delete --topic first
Start a console producer: kafka-console-producer.sh --broker-list hadoop102:9092 --topic first
Start a console consumer: kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic first
Increase the partition count: kafka-topics.sh --zookeeper hadoop102:2181 --alter --topic first --partitions 6
Consume a specific partition: kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic aa2 --from-beginning --partition 1
Notes:
1. If no partition is specified, all partitions are consumed.
2. --from-beginning: consume from the beginning
(if a partition is specified, consumption starts from the beginning of that one partition)
kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic aa2 --partition 0 --offset earliest
Create a consumer group: kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic aa2 --consumer-property group.id=GROUP_NAME
Note: consumers with the same group.id belong to the same consumer group
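Consumers sharing a group.id split the topic's partitions among themselves, each partition going to exactly one member of the group. A minimal round-robin sketch of that idea (Kafka's real assignor strategies differ in detail):

```python
# Illustrative round-robin assignment of partitions to group members.
def assign(partitions, members):
    result = {m: [] for m in members}
    for i, p in enumerate(partitions):
        result[members[i % len(members)]].append(p)
    return result

print(assign([0, 1, 2, 3, 4, 5], ["c1", "c2"]))
# {'c1': [0, 2, 4], 'c2': [1, 3, 5]} -- each partition has exactly one owner
```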
Starting brokers
How to start:
> bin/kafka-server-start.sh config/server.properties &
## or
> bin/kafka-server-start.sh -daemon config/server.properties
Start:
bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-start.sh config/server1.properties &
bin/kafka-server-start.sh config/server2.properties &
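When running several brokers on one host like this, each properties file needs its own broker.id, listener port, and log directory. A sketch of what server1.properties might differ in (the values here are illustrative, not taken from the source):

```properties
# server1.properties -- illustrative overrides for a second broker on the same host
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181
```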
Check:
Look at server.log under the logs directory; output like the following indicates a successful start.
[Kafka Server 0], started (kafka.server.KafkaServer)
...
[Kafka Server 1], started (kafka.server.KafkaServer)
...
[Kafka Server 2], started (kafka.server.KafkaServer)
Stopping a broker
> bin/kafka-server-stop.sh
This command finds all Kafka broker processes and shuts them down.
Adding a broker
Kafka uses zookeeper for cluster membership, so adding a broker to the cluster only requires giving the new broker a unique broker.id and starting it.
A newly added broker does not automatically take over load from existing topics; it only serves topics created after it joined. To have it serve existing topics, you must manually rebalance their partition placement, covered later in Partition management -> Partition reassignment.
Topic operations
Creating a topic
Topics can be created in several ways: with the kafka-topics.sh command-line tool, through the API, by writing a node directly under zookeeper's /brokers/topics path, or implicitly by sending a message when auto.create.topics.enable is true. The first two are the usual choices.
Three parameters are used when creating a topic:
- --topic : the topic name
- --partitions : the number of partitions
- --replication-factor : the replication factor
The replication factor cannot exceed the number of brokers in the cluster.
> bin/kafka-topics.sh --zookeeper cluster101:2181 --create --topic demo --partitions 8 --replication-factor 2
Deleting a topic
To delete a topic, make sure delete.topic.enable is set to true; otherwise the topic cannot be deleted.
> bin/kafka-topics.sh --zookeeper cluster:2181 --delete --topic del_topic
Topic del_topic is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.
This command only marks the topic for deletion. The actual deletion is carried out in the background by the controller with no visible progress; list the topics again to confirm the topic is gone.
Deleting a topic manually
Shut down the brokers before deleting a topic manually; modifying zookeeper metadata while the cluster is running can destabilize it.
The manual deletion steps are:
- Shut down all brokers
- Delete the zookeeper path /brokers/topics/TOPICNAME with: rmr /brokers/topics/TOPICNAME
- On each broker, delete the partition directories that actually store the data; they are named TOPICNAME-NUM and live under the location given by log.dirs in server.properties (default /tmp/kafka-logs)
- Restart all brokers
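The data-directory cleanup step above can be sketched locally; here a temp directory stands in for the broker's real log.dirs, and the directory names are simulated:

```python
import shutil, tempfile
from pathlib import Path

# A temp dir stands in for the broker's log.dirs (default /tmp/kafka-logs).
log_dir = Path(tempfile.mkdtemp())
topic = "del_topic"

# Simulate existing partition directories del_topic-0 .. del_topic-2
for name in [f"{topic}-0", f"{topic}-1", f"{topic}-2", "other-0"]:
    (log_dir / name).mkdir()

# Remove every partition directory belonging to the topic.
for d in log_dir.glob(f"{topic}-*"):
    shutil.rmtree(d)

print(sorted(p.name for p in log_dir.iterdir()))  # ['other-0']
```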
Listing all topics
bin/kafka-topics.sh --zookeeper cluster101:2181/kafka2 --list
Describing a topic
Use --describe to view a topic's details; --topic selects the topic to inspect, and omitting it shows every topic.
> bin/kafka-topics.sh --zookeeper cluster101:2181 --describe --topic third
Topic:third PartitionCount:8 ReplicationFactor:2 Configs:
Topic: third Partition: 0 Leader: 0 Replicas: 0,2 Isr: 0,2
Topic: third Partition: 1 Leader: 1 Replicas: 1,0 Isr: 1,0
Topic: third Partition: 2 Leader: 2 Replicas: 2,1 Isr: 2,1
Topic: third Partition: 3 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: third Partition: 4 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: third Partition: 5 Leader: 2 Replicas: 2,0 Isr: 2,0
Topic: third Partition: 6 Leader: 0 Replicas: 0,2 Isr: 0,2
Topic: third Partition: 7 Leader: 1 Replicas: 1,0 Isr: 1,0
The output shows that third has 8 partitions with a replication factor of 2, along with each partition's assignment.
To test fault tolerance, kill one of the broker processes and check the partition assignment again.
> ps aux | grep server.properties
wen 17960 1.6 19.5 3654572 368720 pts/4 Sl ...
> kill -9 17960
View the description again:
> bin/kafka-topics.sh --zookeeper cluster101:2181 --describe --topic third
Topic:third PartitionCount:8 ReplicationFactor:2 Configs:
Topic: third Partition: 0 Leader: 2 Replicas: 0,2 Isr: 2
Topic: third Partition: 1 Leader: 1 Replicas: 1,0 Isr: 1
Topic: third Partition: 2 Leader: 2 Replicas: 2,1 Isr: 2,1
Topic: third Partition: 3 Leader: 1 Replicas: 0,1 Isr: 1
Topic: third Partition: 4 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: third Partition: 5 Leader: 2 Replicas: 2,0 Isr: 2
Topic: third Partition: 6 Leader: 2 Replicas: 0,2 Isr: 2
Topic: third Partition: 7 Leader: 1 Replicas: 1,0 Isr: 1
In the output above, the first line summarizes all partitions, and each following line describes one partition.
- "leader" is the node responsible for all reads and writes for the partition.
- "replicas" is the list of replica nodes, regardless of whether they are the leader or even alive.
- "isr" is the set of in-sync replicas: the subset of replicas that are alive and caught up. Only replicas in the isr can be elected leader. The partition leader is always in sync; a follower replica is considered in sync if:
- it has sent a heartbeat to zookeeper within the last 6s (configurable);
- it has fetched the latest messages from the leader within the last 10s (configurable).
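The two conditions above can be sketched as a small check (the thresholds match the defaults described here; the function and field names are illustrative, not Kafka's actual implementation):

```python
# Illustrative ISR-membership check using the default thresholds above.
HEARTBEAT_TIMEOUT_S = 6.0
FETCH_TIMEOUT_S = 10.0

def is_in_sync(now, last_heartbeat, last_fetch):
    """A follower stays in the ISR only while both conditions hold."""
    return (now - last_heartbeat <= HEARTBEAT_TIMEOUT_S and
            now - last_fetch <= FETCH_TIMEOUT_S)

now = 100.0
print(is_in_sync(now, last_heartbeat=96.0, last_fetch=95.0))  # True: both recent
print(is_in_sync(now, last_heartbeat=96.0, last_fetch=85.0))  # False: fetch too old
```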
Producing and consuming
kafka-console-consumer.sh and kafka-console-producer.sh are probably the two most frequently used Kafka script tools. They make it easy to test consumers and producers against the cluster.
Console producer
> bin/kafka-console-producer.sh --broker-list cluster101:9092 --topic test
Console consumer
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
--consumer-property sets consumer parameters such as group.id=test_group; separate multiple parameters with commas.
> bin/kafka-console-consumer.sh --bootstrap-server cluster101:9092 --topic test --consumer-property group.id=test_group
Partition management
Adding partitions
Adding partitions expands a topic's capacity and reduces the load on each individual partition.
> bin/kafka-topics.sh --zookeeper cluster101:2181 --alter --topic incr_part --partitions 10
The number of partitions in a topic cannot be reduced, because doing so would cause data inconsistency and message reordering.
If you really must reduce the partition count, the only option is to delete the topic and recreate it.
Preferred leader election
The preferred leader of a partition is the leader chosen when the topic was created; topic creation balances leaders across the brokers.
Brokers in a Kafka cluster inevitably crash or go down. When that happens, the leader replicas on the failed broker become unavailable, and kafka moves leadership for those partitions to other brokers. Even after the broker restarts, its replicas can only act as followers and do not serve client traffic. Over time this skews leadership onto a small subset of brokers.
The kafka-preferred-replica-election.sh tool manually triggers a preferred replica election.
bin/kafka-preferred-replica-election.sh --zookeeper cluster101:2181
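The preferred leader of a partition is simply the first broker in its replica list, so the election moves leadership back there. A minimal sketch (the assignment map is made up for illustration):

```python
# Illustrative replica assignment: partition -> ordered replica list.
assignment = {
    0: [0, 2],
    1: [1, 0],
    2: [2, 1],
}

# The preferred leader is the first replica in each partition's list.
preferred = {p: replicas[0] for p, replicas in assignment.items()}
print(preferred)  # {0: 0, 1: 1, 2: 2}
```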
Partition reassignment
Sometimes you need to adjust how partitions are distributed:
- a topic's partitions are spread unevenly across the cluster, causing unbalanced load
- a broker went offline, leaving partitions out of sync
- a newly added broker needs to take on some of the cluster's load
The kafka-reassign-partitions.sh tool adjusts the partition distribution in three steps:
- generate a migration plan from a broker list and a topic list
- execute the migration plan
- verify the progress and completion of the reassignment (optional)
Generating a plan requires a JSON file listing the topics:
{
"topics": [
{
"topic": "reassign"
}
],
"version": 1
}
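The topics file above can be generated with a short script; the filename topics.json matches the command below, and the topic list is whatever you intend to move:

```python
import json

# Topics to include in the reassignment plan.
topics = ["reassign"]

payload = {
    "topics": [{"topic": t} for t in topics],
    "version": 1,
}

# Write the file consumed by --topics-to-move-json-file.
with open("topics.json", "w") as f:
    json.dump(payload, f, indent=2)

print(json.dumps(payload))
```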
Generate the migration plan:
> bin/kafka-reassign-partitions.sh --zookeeper cluster101:2181 --topics-to-move-json-file topics.json --broker-list 0,1,2 --generate
Current partition replica assignment
{"version":1,"partitions":[{"topic":"reassign","partition":1,"replicas":[1,0]},{"topic":"reassign","partition":3,"replicas":[1,0]},{"topic":"reassign","partition":6,"replicas":[0,1]},{"topic":"reassign","partition":4,"replicas":[0,1]},{"topic":"reassign","partition":0,"replicas":[0,1]},{"topic":"reassign","partition":7,"replicas":[1,0]},{"topic":"reassign","partition":2,"replicas":[0,1]},{"topic":"reassign","partition":5,"replicas":[1,0]}]}
Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"reassign","partition":1,"replicas":[0,2]},{"topic":"reassign","partition":4,"replicas":[0,1]},{"topic":"reassign","partition":6,"replicas":[2,1]},{"topic":"reassign","partition":3,"replicas":[2,0]},{"topic":"reassign","partition":0,"replicas":[2,1]},{"topic":"reassign","partition":7,"replicas":[0,2]},{"topic":"reassign","partition":2,"replicas":[1,0]},{"topic":"reassign","partition":5,"replicas":[1,2]}]}
The terminal prints two JSON objects: the current partition assignment and the proposed assignment. Save the first one so you can roll back, and save the second one as the plan to apply; here it is saved as reassign.json.
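Splitting the generate output into a rollback file and a plan file can be scripted; this sketch parses an abridged copy of the tool's output (the filenames rollback.json and reassign.json are this document's convention):

```python
import json

# Example output of the --generate step (abridged from the source output).
output = """Current partition replica assignment
{"version":1,"partitions":[{"topic":"reassign","partition":0,"replicas":[0,1]}]}
Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"reassign","partition":0,"replicas":[2,1]}]}"""

lines = output.splitlines()
current, proposed = json.loads(lines[1]), json.loads(lines[3])

# Save the current assignment for rollback and the proposal to execute.
with open("rollback.json", "w") as f:
    json.dump(current, f)
with open("reassign.json", "w") as f:
    json.dump(proposed, f)
```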
Execute the plan:
> bin/kafka-reassign-partitions.sh --zookeeper cluster101:2181 --execute --reassignment-json-file reassign.json
Current partition replica assignment
{"version":1,"partitions":[{"topic":"reassign","partition":1,"replicas":[1,0]},{"topic":"reassign","partition":3,"replicas":[1,0]},{"topic":"reassign","partition":6,"replicas":[0,1]},{"topic":"reassign","partition":4,"replicas":[0,1]},{"topic":"reassign","partition":0,"replicas":[0,1]},{"topic":"reassign","partition":7,"replicas":[1,0]},{"topic":"reassign","partition":2,"replicas":[0,1]},{"topic":"reassign","partition":5,"replicas":[1,0]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
Verify the result:
> bin/kafka-reassign-partitions.sh --zookeeper cluster101:2181 --verify --reassignment-json-file reassign.json
Status of partition reassignment:
Reassignment of partition [reassign,0] completed successfully
Reassignment of partition [reassign,1] completed successfully
Reassignment of partition [reassign,3] completed successfully
Reassignment of partition [reassign,7] completed successfully
Reassignment of partition [reassign,5] completed successfully
Reassignment of partition [reassign,6] completed successfully
Reassignment of partition [reassign,2] completed successfully
Reassignment of partition [reassign,4] completed successfully
Consumer group operations
Viewing consumer groups
List old-style consumer groups:
bin/kafka-consumer-groups.sh --zookeeper cluster101:2181 --list
List new-style consumer groups:
bin/kafka-consumer-groups.sh --bootstrap-server cluster101:9092 --list
To get the details of a specific consumer group, replace --list with --describe and select the group with --group.
bin/kafka-consumer-groups.sh --bootstrap-server cluster101:9092 --describe --group test
Deleting a consumer group
Delete an old-style consumer group:
> bin/kafka-consumer-groups.sh --zookeeper cluster101:2181 --delete --group t_group
New-style consumer groups do not need to be deleted; a group is removed automatically when its last member leaves.
Throughput testing
Producer throughput test
> bin/kafka-producer-perf-test.sh --topic perf_single --num-records 100000 --record-size 200 --throughput -1 --producer-props bootstrap.servers=cluster101:9092 acks=-1
- --topic : topic name, perf_single in this example
- --num-records : total number of messages to send, 100000 here
- --record-size : size of each record in bytes, 200 here
- --throughput : records to send per second; -1 means no throttling
- --producer-props bootstrap.servers=cluster101:9092 : producer-side configuration
Consumer throughput test
bin/kafka-consumer-perf-test.sh --topic perf --broker-list cluster101:9092 --messages 500000 --message-size 200
Other script tools
Checking a topic's current message count
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list cluster101:9092 --topic demo --time -1
--time -1 returns the latest offsets
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list cluster101:9092 --topic demo --time -2
--time -2 returns the earliest offsets
Querying __consumer_offsets
The __consumer_offsets topic is created by kafka itself.
__consumer_offsets has 50 partitions. To find a specific consumer group, take the hashCode of the group name modulo 50; the result is the partition that holds it.
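That partition can be computed offline by reproducing Java's String.hashCode in Python; Kafka masks the sign bit to get a non-negative value before the modulo. A sketch:

```python
def java_string_hashcode(s: str) -> int:
    """Reproduce Java's String.hashCode with 32-bit signed overflow."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def offsets_partition(group_id: str, num_partitions: int = 50) -> int:
    # Mask the sign bit for a non-negative value, then take the modulo.
    return (java_string_hashcode(group_id) & 0x7FFFFFFF) % num_partitions

print(offsets_partition("a"))   # hashCode("a") = 97  -> partition 47
print(offsets_partition("ab"))  # hashCode("ab") = 3105 -> partition 5
```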
Before version 0.11.0.0:
> bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets --partition 35 --broker-list qxf-bg-03:9092 --formatter 'kafka.coordinator.GroupMetadataManager$OffsetsMessageFormatter'
Version 0.11.0.0 and later:
> bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets --partition 14 --broker-list cluster101:9092 --formatter 'kafka.coordinator.group.GroupMetadataManager$OffsetsMessageFormatter'