kafka集群扩容后,新的broker上面不会数据进入这些节点,也就是说,这些节点是空闲的;它只有在创建新的topic时才会参与工作。除非将已有的partition迁移到新的服务器上面;
所以需要将一些topic的分区迁移到新的broker上。
kafka-reassign-partitions.sh是kafka提供的用来重新分配partition和replica到broker上的工具
简单实现重新分配需要三步:
- 生成分配计划(generate)
- 执行分配(execute)
- 检查分配的状态(verify)
具体操作如下:
1. 生成分配计划
编写分配脚本:
vi topics-to-move.json
内容如下:
{"topics":
[{"topic":"event_request"}],
"version": 1
}
执行分配计划生成脚本:
kafka-reassign-partitions.sh --zookeeper $ZK_CONNECT --topics-to-move-json-file topics-to-move.json --broker-list "5,6,7,8" --generate
执行结果如下:
[hadoop@sdf-nimbus-perf topic_reassgin]$ kafka-reassign-partitions.sh --zookeeper $ZK_CONNECT --topics-to-move-json-file topics-to-move.json --broker-list "5,6,7,8" --generate
Current partition replica assignment #当前分区的副本分配
{"version":1,"partitions":[{"topic":"event_request","partition":0,"replicas":[3,4]},{"topic":"event_request","partition":1,"replicas":[4,5]}]}
Proposed partition reassignment configuration #建议的分区配置
{"version":1,"partitions":[{"topic":"event_request","partition":0,"replicas":[6,5]},{"topic":"event_request","partition":1,"replicas":[7,6]}]}
Proposed partition reassignment configuration 后是根据命令行的指定的brokerlist生成的分区分配计划json格式。将 Proposed partition reassignment configuration的配置copy保存到一个文件中 topic-reassignment.json
vi topic-reassignment.json
{"version":1,"partitions":[{"topic":"event_request","partition":0,"replicas":[6,5]},{"topic":"event_request","partition":1,"replicas":[7,6]}]}
2. 执行分配(execute)
根据step1 生成的分配计划配置json文件topic-reassignment.json,进行topic的重新分配。
kafka-reassign-partitions.sh --zookeeper $ZK_CONNECT --reassignment-json-file topic-reassignment.json --execute
执行前的分区分布:
[hadoop@sdf-nimbus-perf topic_reassgin]$ le-kafka-topics.sh --describe --topic event_request
Topic:event_request PartitionCount:2 ReplicationFactor:2 Configs:
Topic: event_request Partition: 0 Leader: 3 Replicas: 3,4 Isr: 3,4
Topic: event_request Partition: 1 Leader: 4 Replicas: 4,5 Isr: 4,5
执行后的分区分布:
[hadoop@sdf-nimbus-perf topic_reassgin]$ le-kafka-topics.sh --describe --topic event_request
Topic:event_request PartitionCount:2 ReplicationFactor:4 Configs:
Topic: event_request Partition: 0 Leader: 3 Replicas: 6,5,3,4 Isr: 3,4
Topic: event_request Partition: 1 Leader: 4 Replicas: 7,6,4,5 Isr: 4,5
3. 检查分配的状态
查看分配的状态:正在进行
[hadoop@sdf-nimbus-perf topic_reassgin]$ kafka-reassign-partitions.sh --zookeeper $ZK_CONNECT --reassignment-json-file topic-reassignment.json --verify
Status of partition reassignment:
Reassignment of partition [event_request,0] is still in progress
Reassignment of partition [event_request,1] is still in progress
[hadoop@sdf-nimbus-perf topic_reassgin]$
查看“is still in progress” 状态时的分区,副本分布状态:
发现Replicas有4个哦,说明在重新分配的过程中新旧的副本都在进行工作。
[hadoop@sdf-nimbus-perf topic_reassgin]$ le-kafka-topics.sh --describe --topic event_request
Topic:event_request PartitionCount:2 ReplicationFactor:4 Configs:
Topic: event_request Partition: 0 Leader: 3 Replicas: 6,5,3,4 Isr: 3,4
Topic: event_request Partition: 1 Leader: 4 Replicas: 7,6,4,5 Isr: 4,5
查看分配的状态:分配完成。
[hadoop@sdf-nimbus-perf topic_reassgin]$ kafka-reassign-partitions.sh --zookeeper $ZK_CONNECT --reassignment-json-file topic-reassignment.json --verify
Status of partition reassignment:
Reassignment of partition [event_request,0] completed successfully
Reassignment of partition [event_request,1] completed successfully
查看“completed successfully”状态的分区,副本状态:
已经按照生成的分配计划正确的完成了分区的重新分配。
[hadoop@sdf-nimbus-perf topic_reassgin]$ le-kafka-topics.sh --describe --topic event_request
Topic:event_request PartitionCount:2 ReplicationFactor:2 Configs:
Topic: event_request Partition: 0 Leader: 6 Replicas: 6,5 Isr: 6,5
Topic: event_request Partition: 1 Leader: 7 Replicas: 7,6 Isr: 6,7
------------------------------------------------------------------------------------------------------------------------------------------------------------------
以下为个人经验总结
迁移主要步骤详解见分隔符以上内容
主要有几点补充:
kafka-reassign-partitions.sh --zookeeper 10.162.1.250:2181,10.162.1.251:2181,10.162.1.252:2181 --topics-to-move-json-file topics-to-move.json --broker-list "4,5,6" --generate
如果是需要在6个节点上重新分配,那么是:
kafka-reassign-partitions.sh --zookeeper 10.162.1.250:2181,10.162.1.251:2181,10.162.1.252:2181 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5,6" --generate
generate分配,分区会自动分配在新目录上。
generate 时不要写该节点,如果写上该节点,迁移速度很慢,并且可能迁移过程中已经快满的目录很快就100%占用,broker节点相应会挂掉。 正常情况下,迁移的逻辑是先把老的partition日志文件目录拷贝在新目录上,然后在删除老的partition日志目录,但是有时候由于节点负载过高,迁移完成后,老partition日志目录并不会自己删除,确定迁移正确完成后,可以人为的删除即可。 个人的处理方式是:先把部分topic迁移到其他节点上,磁盘利用率降下来后,在操作其他剩余topic重新分配在全部节点(包括先前磁盘利用率很高的节点)。
自动生成分配计划:kafka-reassign-partitions.sh --zookeeper 10.X.X.X:2181,10.X.X.X:2181,10.X.X.X:2181 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5,6,7,8,9,10" --generate
执行分配计划:kafka-reassign-partitions.sh --zookeeper 10.X.X.X:2181,10.X.X.X:2181,10.X.X.X:2181 --reassignment-json-file topic-reassignment.json --execute
查看迁移进度:kafka-reassign-partitions.sh --zookeeper 10.X.X.X:2181,10.X.X.X:2181,10.X.X.X:2181 --reassignment-json-file topic-reassignment.json --verify
均衡:kafka-preferred-replica-election.sh --zookeeper 10.X.X.X:2181,10.X.X.X:2181,10.X.X.X:2181