Kafka Core Source Code Analysis - Producer - Sender
- 1. Introduction
- 2. Sender Analysis
- 2.1 Request Header Analysis
- 2.2 KSelector
- 2.3 InFlightRequests
- 2.4 MetadataUpdater
- 2.5 NetworkClient
1. Introduction
Let's walk through the overall flow of the Sender thread sending messages. First, based on the state of the RecordAccumulator cache, it selects the Nodes to which messages can be sent, via the RecordAccumulator.ready() method introduced in the previous article. Then it filters those nodes by the producer's connection state with each of them (managed by NetworkClient). Next it builds the corresponding requests; note in particular that only one request is generated per Node. Finally, it calls NetworkClient to send the requests out.
Sender implements the Runnable interface and runs in a dedicated ioThread. Sender's run() method delegates to its overload run(long), which is the core method of the Sender thread and the key path for sending messages.
Against the sequence diagram above, let's briefly analyze the steps:
1. metadata.fetch() retrieves the Cluster from Metadata, i.e., the cluster metadata.
2. RecordAccumulator.ready(), covered in the previous article, selects the Nodes to which messages can be sent, based on the state of the RecordAccumulator cache, and returns a RecordAccumulator.ReadyCheckResult object.
3. result.unknownLeadersExist checks whether the ReadyCheckResult from the previous step contains partitions whose leader is still unknown; if so, a Metadata update is forced.
4. For the readyNodes obtained in step 2, iterate over the nodes and call KafkaClient.ready() to check whether the network I/O to each node satisfies the conditions for sending; nodes that do not qualify are removed from the readyNodes set.
5. Using the readyNodes set filtered in step 4, call RecordAccumulator.drain() to obtain the collection of messages to send.
6. Call RecordAccumulator.abortExpiredBatches() to handle timed-out messages in the RecordAccumulator. The logic iterates over all RecordBatches held in the RecordAccumulator and calls RecordBatch.maybeExpire() on each. If a batch has expired, RecordBatch.done() is called, which triggers the user-defined Callback, removes the RecordBatch from its queue, and releases its ByteBuffer space.
7. Call Sender.createProduceRequests() to wrap the messages to be sent into ClientRequests.
8. KafkaClient.send() writes each ClientRequest into the send field of the corresponding KafkaChannel.
9. KafkaClient.poll() sends out the ClientRequest saved in the KafkaChannel.send field; it also processes responses from the server, handles timed-out requests, invokes user-defined Callbacks, and so on.
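To tie these steps together, here is a condensed sketch of Sender.run(long) in the spirit of the 0.10-era source. This is a simplification, not the verbatim method: metrics, the guaranteed-ordering (mute/unmute) logic, and the exact poll-timeout computation are omitted, and signatures such as abortExpiredBatches() vary slightly across versions.
void run(long now) {
    // Step 1: fetch the cluster metadata
    Cluster cluster = metadata.fetch();
    // Step 2: which nodes have batches ready to send?
    RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);
    // Step 3: unknown leaders force a metadata update
    if (result.unknownLeadersExist)
        this.metadata.requestUpdate();
    // Step 4: drop nodes whose connections are not ready for network I/O
    Iterator<Node> iter = result.readyNodes.iterator();
    while (iter.hasNext()) {
        Node node = iter.next();
        if (!this.client.ready(node, now))
            iter.remove();
    }
    // Step 5: collect the batches to send, keyed by NodeId
    Map<Integer, List<RecordBatch>> batches =
            this.accumulator.drain(cluster, result.readyNodes, this.maxRequestSize, now);
    // Step 6: expire batches that have waited too long
    this.accumulator.abortExpiredBatches(this.requestTimeout, now);
    // Step 7: one ClientRequest per node
    List<ClientRequest> requests = createProduceRequests(batches, now);
    // Step 8: stash each request in its KafkaChannel's send field
    for (ClientRequest request : requests)
        this.client.send(request, now);
    // Step 9: do the actual network I/O, handle responses, timeouts, and callbacks
    this.client.poll(result.nextReadyCheckDelayMs, now);
}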
2. Sender Analysis
2.1 Request Header Analysis
Let's first look at the request message format.
The fields of the Produce Request (Version: 2) header and body are described in the following table:
| Field | Type | Description |
| --- | --- | --- |
| api_key | short | API identifier |
| api_version | short | API version |
| correlation_id | int | Sequence number generated by the client, monotonically increasing; the server does not modify it and echoes it back in the Response |
| client_id | String | Client ID; may be null |
| acks | short | How many Replicas must have successfully replicated the messages of this request before the server responds; -1 means the entire ISR has completed replication |
| timeout | int | Timeout, in ms |
| topic | String | Topic name |
| partition | int | Partition number |
| record_set | byte array | Message payload |
The fields of the Produce Response (Version: 2) are described in the following table:
| Field | Type | Description |
| --- | --- | --- |
| correlation_id | int | Sequence number generated by the client, monotonically increasing; the server does not modify it and echoes it back in the Response |
| topic | String | Topic name |
| partition | int | Partition number |
| error_code | short | Error code |
| base_offset | long | Offset assigned by the server to the messages |
| timestamp | long | Timestamp produced by the server |
| throttle_time_ms | int | Throttle time, in ms |
When sending a request, Sender first wraps it into a ClientRequest, which in turn wraps a RequestSend, the Produce Request message format described above. This happens in Sender.createProduceRequest(). The core logic of this method is:
- Reorganize the RecordBatch collection for a given NodeId into two maps: produceRecordsByPartition (Map<TopicPartition, ByteBuffer>) and recordsByPartition (Map<TopicPartition, RecordBatch>).
- Create a RequestSend, the object actually sent over network I/O; its format follows the Produce Request (Version: 2) protocol described above, with the data in produceRecordsByPartition as the payload.
- Create a RequestCompletionHandler as the callback object.
- Wrap the RequestSend and the RequestCompletionHandler into a ClientRequest and return it.
Let's look at the source code:
/**
 * Create a produce request from the given record batches.
 * The sender thread creates one client request per destination node.
 */
private ClientRequest produceRequest(long now, int destination, short acks, int timeout, List<RecordBatch> batches) {
    // Note: produceRecordsByPartition and recordsByPartition have different value types,
    // one maps to ByteBuffer and the other to RecordBatch
    Map<TopicPartition, ByteBuffer> produceRecordsByPartition = new HashMap<TopicPartition, ByteBuffer>(batches.size());
    final Map<TopicPartition, RecordBatch> recordsByPartition = new HashMap<TopicPartition, RecordBatch>(batches.size());
    // Step 1: group the RecordBatch list by partition into the two maps above
    for (RecordBatch batch : batches) {
        TopicPartition tp = batch.topicPartition;
        produceRecordsByPartition.put(tp, batch.records.buffer());
        recordsByPartition.put(tp, batch);
    }
    // Step 2: create the ProduceRequest and the RequestSend
    ProduceRequest request = new ProduceRequest(acks, timeout, produceRecordsByPartition);
    RequestSend send = new RequestSend(Integer.toString(destination),
            this.client.nextRequestHeader(ApiKeys.PRODUCE),
            request.toStruct());
    // Step 3: create a RequestCompletionHandler as the callback object; its logic is detailed later
    RequestCompletionHandler callback = new RequestCompletionHandler() {
        public void onComplete(ClientResponse response) {
            handleProduceResponse(response, recordsByPartition, time.milliseconds());
        }
    };
    // Create the ClientRequest. Note the second argument: whether a response
    // is expected is determined by the acks config
    return new ClientRequest(now, acks != 0, send, callback);
}
At this point, the format and construction of the ProduceRequest have been covered. In the subsequent flow, what actually gets sent is the RequestSend object; the ClientRequest is cached in InFlightRequests, and when the request receives a response or hits an exception, the cached ClientRequest is used to invoke its RequestCompletionHandler.
2.2 KSelector
Before diving into NetworkClient, let's look at its overall structure and the components it depends on.
The implementation of the Selectable interface shown above is org.apache.kafka.common.network.Selector; to distinguish it from the NIO Selector, we will call it KSelector. KSelector performs network I/O in NIO's asynchronous, non-blocking mode: a single thread can manage the connect, read, and write operations on multiple network connections. Under the hood it relies on java.nio.channels to do the actual work.
// The NIO Selector, used to listen for network I/O events
private final java.nio.channels.Selector nioSelector;
// Maps NodeId to KafkaChannel, representing the producer's network connection to each Node;
// KafkaChannel is a further wrapper around SocketChannel
private final Map<String, KafkaChannel> channels;
// Records the requests that have been completely sent out
private final List<Send> completedSends;
// Records the responses that have been completely received
private final List<NetworkReceive> completedReceives;
// Buffers all the receives read during one OP_READ event; once the OP_READ handling
// finishes, the receives staged here are moved into the completedReceives collection
private final Map<KafkaChannel, Deque<NetworkReceive>> stagedReceives;
private final Set<SelectionKey> immediatelyConnectedKeys;
// Records the connections found disconnected during one poll
private final List<String> disconnected;
// Records the connections newly established during one poll
private final List<String> connected;
// Records the Nodes to which a send failed
private final List<String> failedSends;
// Builder used to create KafkaChannels; depending on configuration it creates different
// TransportLayer subclasses and wraps them in a KafkaChannel.
// Here we can assume the created KafkaChannel wraps a PlaintextTransportLayer
private final ChannelBuilder channelBuilder;
// Tracks when each connection was last used, so connections idle longer than
// connectionsMaxIdleNanos can be closed
private final Map<String, Long> lruConnections;
Now let's look at KSelector's core methods. KSelector.connect() is mainly responsible for creating the KafkaChannel and adding it to the channels map. Source:
/**
 * Begin connecting to the given address and add the connection to this nioSelector,
 * associated with the given id.
 * Responsible for creating the KafkaChannel and adding it to the channels map.
 */
public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {
    if (this.channels.containsKey(id))
        throw new IllegalStateException("There is already a connection for id " + id);
    // Create the SocketChannel
    SocketChannel socketChannel = SocketChannel.open();
    // Configure it as non-blocking
    socketChannel.configureBlocking(false);
    Socket socket = socketChannel.socket();
    // Keep the connection alive
    socket.setKeepAlive(true);
    // Set sendBufferSize
    if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
        socket.setSendBufferSize(sendBufferSize);
    // Set receiveBufferSize
    if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
        socket.setReceiveBufferSize(receiveBufferSize);
    socket.setTcpNoDelay(true);
    boolean connected;
    // In non-blocking mode, SocketChannel.connect() only initiates a connection; it may
    // return before the connection is actually established.
    // KSelector.finishConnect() later confirms whether the connection really completed.
    try {
        connected = socketChannel.connect(address);
    } catch (UnresolvedAddressException e) {
        socketChannel.close();
        throw new IOException("Can't resolve address: " + address, e);
    } catch (IOException e) {
        socketChannel.close();
        throw e;
    }
    // Register the SocketChannel with the nioSelector, interested in OP_CONNECT
    SelectionKey key = socketChannel.register(nioSelector, SelectionKey.OP_CONNECT);
    // Create the KafkaChannel
    KafkaChannel channel = channelBuilder.buildChannel(id, key, maxReceiveSize);
    // Attach the KafkaChannel to the key
    key.attach(channel);
    // Bind the NodeId to the KafkaChannel and track it in channels
    this.channels.put(id, channel);
    if (connected) {
        // OP_CONNECT won't trigger for immediately connected channels
        log.debug("Immediately connected to node {}", channel.id());
        immediatelyConnectedKeys.add(key);
        key.interestOps(0);
    }
}
KSelector.send() caches the RequestSend created earlier into the KafkaChannel's send field and starts watching for OP_WRITE events on that connection; no network I/O happens here. If the KafkaChannel's send field still holds a RequestSend that has not been fully sent, an exception is thrown to avoid overwriting it. In other words, each KafkaChannel can send at most one Send per poll. A minimal sketch of this method follows.
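For reference, here is what KSelector.send() boils down to, a minimal sketch assuming the 0.10-era Selector: look up the channel, stash the Send, and record a failed send if the key was already cancelled.
public void send(Send send) {
    // Look up the KafkaChannel for the destination node; fails if the connection is gone
    KafkaChannel channel = channelOrFail(send.destination());
    try {
        // Stash the Send in the channel and register interest in OP_WRITE;
        // throws if a previous Send on this channel is still in progress
        channel.setSend(send);
    } catch (CancelledKeyException e) {
        // The underlying key was cancelled: record the failure and close the channel
        this.failedSends.add(send.destination());
        close(channel);
    }
}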
KSelector.poll() is where the network I/O actually happens: it calls nioSelector.select() to wait for I/O events. When a channel is writable, it sends the KafkaChannel.send field (remember: at most one RequestSend at a time, and a single RequestSend may take several poll calls to finish). When a channel is readable, it reads data into KafkaChannel.receive; once a complete NetworkReceive has been read, it is buffered in stagedReceives, and when reading is done the data is moved to completedReceives. Finally it checks whether any idle connections should be closed. Here is the code for KSelector.poll():
public void poll(long timeout) throws IOException {
    if (timeout < 0)
        throw new IllegalArgumentException("timeout should be >= 0");
    // Clear all the results of the previous poll()
    clear();
    if (hasStagedReceives() || !immediatelyConnectedKeys.isEmpty())
        timeout = 0;
    /* check ready keys */
    long startSelect = time.nanoseconds();
    // Wait for I/O events
    int readyKeys = select(timeout);
    long endSelect = time.nanoseconds();
    currentTimeNanos = endSelect;
    this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
    if (readyKeys > 0 || !immediatelyConnectedKeys.isEmpty()) {
        // Handle the I/O events
        pollSelectionKeys(this.nioSelector.selectedKeys(), false);
        pollSelectionKeys(immediatelyConnectedKeys, true);
    }
    // Move stagedReceives into the completedReceives collection
    addToCompletedReceives();
    long endIo = time.nanoseconds();
    this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
    // Close connections that have been idle too long
    maybeCloseOldestConnection();
}
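The last call above, addToCompletedReceives(), moves at most one staged receive per channel into completedReceives on each poll, which preserves per-connection ordering. A simplified sketch (the real method also skips muted channels and records metrics):
private void addToCompletedReceives() {
    if (this.stagedReceives.isEmpty())
        return;
    Iterator<Map.Entry<KafkaChannel, Deque<NetworkReceive>>> iter =
            this.stagedReceives.entrySet().iterator();
    while (iter.hasNext()) {
        Map.Entry<KafkaChannel, Deque<NetworkReceive>> entry = iter.next();
        Deque<NetworkReceive> deque = entry.getValue();
        // Promote one staged receive per channel per poll
        NetworkReceive networkReceive = deque.poll();
        this.completedReceives.add(networkReceive);
        // Drop the channel's entry once its staged receives are drained
        if (deque.isEmpty())
            iter.remove();
    }
}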
KSelector.pollSelectionKeys() is the core method that handles the I/O operations. It processes OP_CONNECT, OP_READ, and OP_WRITE events and also checks connection state. The original flowchart here showed how Kafka treats the different channel states: on OP_CONNECT, finish the connection and then check whether the channel is ready, then handle OP_READ and OP_WRITE.
Let's walk through the source code:
/**
 * The core I/O handling method: processes OP_CONNECT, OP_READ, and OP_WRITE events,
 * and checks connection state.
 * @param selectionKeys
 * @param isImmediatelyConnected
 */
private void pollSelectionKeys(Iterable<SelectionKey> selectionKeys, boolean isImmediatelyConnected) {
    Iterator<SelectionKey> iterator = selectionKeys.iterator();
    while (iterator.hasNext()) {
        SelectionKey key = iterator.next();
        iterator.remove();
        // The KafkaChannel was attached to the key when the connection was created,
        // precisely so it can be retrieved here
        KafkaChannel channel = channel(key);
        // register all per-connection metrics at once
        sensors.maybeRegisterConnectionMetrics(channel.id());
        // Update the LRU bookkeeping
        lruConnections.put(channel.id(), currentTimeNanos);
        try {
            // Handle the case where connect() returned true, or an OP_CONNECT event
            /* complete any connections that have finished their handshake (either normally or immediately) */
            if (isImmediatelyConnected || key.isConnectable()) {
                // finishConnect() checks whether the SocketChannel connection has completed;
                // once established, it cancels interest in OP_CONNECT and starts watching OP_READ
                if (channel.finishConnect()) {
                    // Add to the "connected" set
                    this.connected.add(channel.id());
                    this.sensors.connectionCreated.record();
                } else
                    // Connection not yet complete: skip the rest of the handling for this channel
                    continue;
            }
            // Call KafkaChannel.prepare() to perform authentication
            /* if channel is not ready finish prepare */
            if (channel.isConnected() && !channel.ready())
                channel.prepare();
            /* if channel is ready read from any connections that have readable data */
            if (channel.ready() && key.isReadable() && !hasStagedReceive(channel)) {
                // OP_READ handling
                NetworkReceive networkReceive;
                while ((networkReceive = channel.read()) != null)
                    // channel.read() returned a complete NetworkReceive: stash it in stagedReceives.
                    // If a complete NetworkReceive cannot be read yet, read() returns null and we
                    // keep reading on subsequent OP_READ events until a full NetworkReceive is assembled
                    addToStagedReceives(channel, networkReceive);
            }
            /* if channel is ready write to any sockets that have space in their buffer and for which we have data */
            // OP_WRITE handling
            if (channel.ready() && key.isWritable()) {
                Send send = channel.write();
                // channel.write() sends out the KafkaChannel.send field; it returns null if the
                // send is not yet complete, and the finished Send once it is, which is then
                // added to completedSends for later processing
                if (send != null) {
                    // Add to completedSends
                    this.completedSends.add(send);
                    this.sensors.recordBytesSent(channel.id(), send.size());
                }
            }
            /* cancel any defunct sockets */
            // Close any sockets that are no longer valid
            if (!key.isValid()) {
                close(channel);
                this.disconnected.add(channel.id());
            }
        } catch (Exception e) {
            // On any exception, treat the connection as closed and add the NodeId to disconnected
            String desc = channel.socketDescription();
            if (e instanceof IOException)
                log.debug("Connection with {} disconnected", desc, e);
            else
                log.warn("Unexpected error from {}; closing connection", desc, e);
            close(channel);
            this.disconnected.add(channel.id());
        }
    }
}
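The OP_READ branch above hands each complete NetworkReceive to addToStagedReceives(). This helper simply maintains one deque of receives per channel; a sketch consistent with the behavior described:
private void addToStagedReceives(KafkaChannel channel, NetworkReceive receive) {
    // Lazily create the per-channel deque, then append the complete receive to it
    if (!stagedReceives.containsKey(channel))
        stagedReceives.put(channel, new ArrayDeque<NetworkReceive>());
    Deque<NetworkReceive> deque = stagedReceives.get(channel);
    deque.add(receive);
}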
Ultimately, the read and write operations are delegated to KafkaChannel. Let's look at the relevant methods:
// Cache the RequestSend object into the KafkaChannel's send field
public void setSend(Send send) {
    // If the send field still holds a RequestSend that has not been fully sent, throw to
    // avoid overwriting it. In other words, each KafkaChannel can only send one Send per poll
    if (this.send != null)
        throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress.");
    this.send = send;
    // Start watching for OP_WRITE events
    this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}
// Actually send the data
private boolean send(Send send) throws IOException {
    // If the send is not finished by a single write call, OP_WRITE is not cancelled;
    // we keep listening for OP_WRITE on this channel until the whole send completes
    send.writeTo(transportLayer);
    // Completion is determined by whether the ByteBuffer still has remaining bytes
    if (send.completed())
        transportLayer.removeInterestOps(SelectionKey.OP_WRITE);
    return send.completed();
}
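The write() method invoked from pollSelectionKeys() ties these two together. A minimal sketch matching the behavior above: it returns the Send once it has been fully written, otherwise null, and clears the send field so the channel can accept the next Send.
public Send write() throws IOException {
    Send result = null;
    // send(send) returns true only when the whole Send has been written out;
    // clear the send field so the next Send can be accepted
    if (send != null && send(send)) {
        result = send;
        send = null;
    }
    return result;
}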
2.3 InFlightRequests
The InFlightRequests queue caches ClientRequests that have been sent but have not yet received a response. It is backed by a Map<String, Deque<ClientRequest>>, where the key is the NodeId and the value is the queue of ClientRequests sent to that Node. Its most important use is judging a node's capacity by the number of unanswered requests:
public boolean canSendMore(String node) {
    // Get the deque for this node from the HashMap
    Deque<ClientRequest> queue = requests.get(node);
    // First condition: has the request at the head of the queue finished sending? If the
    // head request never finishes sending, the network may be in trouble, so stop sending
    // to this Node. The size comparison checks whether too many requests have piled up in
    // InFlightRequests: many unanswered requests suggest the node is heavily loaded or
    // the connection is bad
    return queue == null || queue.isEmpty() ||
            (queue.peekFirst().request().completed() && queue.size() < this.maxInFlightRequestsPerConnection);
}
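A sketch of the two core queue operations makes clear why canSendMore() peeks at the first element. This is simplified (null checks trimmed), based on the behavior described above: new requests are pushed onto the head of the deque, while responses complete the oldest request at the tail.
public void add(ClientRequest request) {
    // Requests are pushed onto the head, so peekFirst() always sees the most recently sent one
    Deque<ClientRequest> reqs = this.requests.get(request.request().destination());
    if (reqs == null) {
        reqs = new ArrayDeque<>();
        this.requests.put(request.request().destination(), reqs);
    }
    reqs.addFirst(request);
}

public ClientRequest completeNext(String node) {
    // Responses arrive in order on a connection, so the oldest request (tail) completes first
    return requests.get(node).pollLast();
}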
2.4 MetadataUpdater
This component updates the cluster metadata. It was covered in "Kafka Core Source Code Analysis - Producer - KafkaProducer", so we will not expand on it here.
2.5 NetworkClient
NetworkClient is a general-purpose network client implementation: it is used not only by producers to send messages, but also by consumers to fetch messages and by brokers to communicate with each other. Let's look at NetworkClient's core methods.
NetworkClient.ready() checks whether a Node is ready to accept data. A Node is ready once all three of the following conditions hold (a sketch of the method follows this list):
- Metadata is not currently being updated and does not need to be updated
- The connection has been established successfully and is healthy
- InFlightRequests.canSendMore() returns true
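A condensed sketch of ready() under these three conditions, based on the 0.10-era code; note the side effect that if the node is not ready but is eligible for connection, a connection attempt is initiated for a later poll:
public boolean ready(Node node, long now) {
    if (node.isEmpty())
        throw new IllegalArgumentException("Cannot connect to empty node " + node);
    // Ready: metadata is not due for an update and canSendRequest() holds for this node
    if (isReady(node, now))
        return true;
    // Not ready, but eligible to connect: kick off a connection attempt for next time
    if (connectionStates.canConnect(node.idString(), now))
        initiateConnect(node, now);
    return false;
}

public boolean isReady(Node node, long now) {
    return !metadataUpdater.isUpdateDue(now) && canSendRequest(node.idString());
}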
NetworkClient.initiateConnect() updates the state kept in ClusterConnectionStates, which tracks the state of every connection in the NetworkClient, and calls Selector.connect() to initiate the connection.
NetworkClient.send() mainly sets the request into the KafkaChannel.send field and adds it to the InFlightRequests queue to await its response.
public void send(ClientRequest request, long now) {
    String nodeId = request.request().destination();
    // Check whether we can send a request to the given Node
    if (!canSendRequest(nodeId))
        throw new IllegalStateException("Attempt to send a request to node " + nodeId + " which is not ready.");
    // Set the KafkaChannel.send field and add the request to InFlightRequests to await its response
    doSend(request, now);
}
// canSendRequest method
private boolean canSendRequest(String node) {
    // The connection is established and healthy,
    // the channel is ready (network protocol handshake and authentication complete),
    // and InFlightRequests.canSendMore() returns true
    return connectionStates.isConnected(node) && selector.isChannelReady(node) && inFlightRequests.canSendMore(node);
}
// doSend method
private void doSend(ClientRequest request, long now) {
    request.setSendTimeMs(now);
    // Await the response
    this.inFlightRequests.add(request);
    // Put the request into the KafkaChannel's send field, waiting to be sent
    selector.send(request.request());
}
NetworkClient.poll() calls KSelector.poll() to perform the network I/O, then uses the handle*() methods to process the data and queues that KSelector.poll() produced.
public List<ClientResponse> poll(long timeout, long now) {
    // Update Metadata if needed
    long metadataTimeout = metadataUpdater.maybeUpdate(now);
    try {
        // Perform the actual I/O
        this.selector.poll(Utils.min(timeout, metadataTimeout, requestTimeoutMs));
    } catch (IOException e) {
        log.error("Unexpected error during I/O", e);
    }
    // process completed actions
    long updatedNow = this.time.milliseconds();
    // The response list
    List<ClientResponse> responses = new ArrayList<>();
    // Use the handle*() methods to process the data and queues produced by KSelector.poll():
    // process the completedSends queue
    handleCompletedSends(responses, updatedNow);
    // process the completedReceives queue
    handleCompletedReceives(responses, updatedNow);
    // process the disconnected list
    handleDisconnections(responses, updatedNow);
    // process the connected list
    handleConnections();
    // process timed-out requests in InFlightRequests
    handleTimedOutRequests(responses, updatedNow);
    // invoke callbacks
    // Invoke the callback of each ClientRequest in turn
    for (ClientResponse response : responses) {
        if (response.request().hasCallback()) {
            try {
                // This onComplete() ends up calling Sender.handleProduceResponse()
                response.request().callback().onComplete(response);
            } catch (Exception e) {
                log.error("Uncaught error in request completion:", e);
            }
        }
    }
    return responses;
}
Let's focus on the processing logic of the handle*() methods.
- handleCompletedSends(): As we know, InFlightRequests holds requests that have been sent but not yet answered, while completedSends holds the requests successfully sent in the most recent poll(), so each entry in completedSends should match the last request sent in the corresponding InFlightRequests queue.
So what does this method actually do? It iterates over completedSends, removes from InFlightRequests the requests that do not expect a response, and appends a corresponding ClientResponse to the responses list; each ClientResponse holds a reference to its ClientRequest. Source:
private void handleCompletedSends(List<ClientResponse> responses, long now) {
    // if no response is expected then when the send is completed, return it
    // Iterate over the completedSends collection
    for (Send send : this.selector.completedSends()) {
        // Peek at the most recently sent request for this node (the head of its deque)
        ClientRequest request = this.inFlightRequests.lastSent(send.destination());
        // Check whether the request expects a response
        if (!request.expectResponse()) {
            // Remove the head request from the node's queue in inFlightRequests
            this.inFlightRequests.completeLastSent(send.destination());
            // Build a ClientResponse and add it to the responses list
            responses.add(new ClientResponse(request, now, false, null));
        }
    }
}
- handleCompletedReceives(): processes the responses that have been received; it iterates over the completedReceives queue, removes the corresponding ClientRequest from InFlightRequests, and appends a corresponding ClientResponse to the responses list. If the response belongs to a Metadata update request, it is handled by metadataUpdater.maybeHandleCompletedReceive() instead. Source:
private void handleCompletedReceives(List<ClientResponse> responses, long now) {
    // Iterate over completedReceives
    for (NetworkReceive receive : this.selector.completedReceives()) {
        // Get the NodeId that returned the response
        String source = receive.source();
        // Take the corresponding ClientRequest out of inFlightRequests
        ClientRequest req = inFlightRequests.completeNext(source);
        // Parse the response
        Struct body = parseResponse(receive.payload(), req.request().header());
        // If this is the response to a Metadata update request,
        // MetadataUpdater.maybeHandleCompletedReceive() processes the MetadataResponse:
        // it updates the cluster metadata recorded in Metadata and wakes up all threads
        // waiting for the Metadata update to complete
        if (!metadataUpdater.maybeHandleCompletedReceive(req, now, body))
            // Otherwise, build a ClientResponse and add it to the responses collection
            responses.add(new ClientResponse(req, now, false, body));
    }
}
- handleDisconnections(): processes the nodes whose connections dropped; it iterates over the disconnected list, clears that node's entries from InFlightRequests, and creates a ClientResponse for each cleared request, appending them to the responses list. These ClientResponses are marked as produced by a disconnection. Source:
private void handleDisconnections(List<ClientResponse> responses, long now) {
    // Update the connection state and clear out the ClientRequests in InFlightRequests
    // belonging to the disconnected Nodes
    for (String node : this.selector.disconnected()) {
        log.debug("Node {} disconnected.", node);
        processDisconnection(responses, node, now);
    }
    // we got a disconnect so we should probably refresh our metadata and see if that broker is dead
    if (this.selector.disconnected().size() > 0)
        // Flag that the cluster metadata needs updating
        metadataUpdater.requestUpdate();
}
- handleConnections(): processes the newly connected nodes; it iterates over the connected list and marks each connection's state in ConnectionStates as CONNECTED.
- handleTimedOutRequests(): processes timeouts; it walks the InFlightRequests queues to collect the Nodes with timed-out requests, then handles them the same way handleDisconnections() does. A sketch of both methods follows.
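Both methods are short. The sketch below reflects the behavior just described and assumes the 0.10-era helpers, in particular InFlightRequests.getNodesWithTimedOutRequests():
private void handleConnections() {
    // Mark every newly connected node as CONNECTED in the connection-state tracker
    for (String node : this.selector.connected()) {
        log.debug("Completed connection to node {}", node);
        this.connectionStates.connected(node);
    }
}

private void handleTimedOutRequests(List<ClientResponse> responses, long now) {
    // Collect the nodes whose oldest in-flight request has exceeded the request timeout
    List<String> nodeIds = this.inFlightRequests.getNodesWithTimedOutRequests(now, this.requestTimeoutMs);
    for (String nodeId : nodeIds) {
        // Close the connection and handle it exactly like a disconnection
        this.selector.close(nodeId);
        processDisconnection(responses, nodeId, now);
    }
    if (!nodeIds.isEmpty())
        // A timeout may indicate a dead broker: request a metadata refresh
        metadataUpdater.requestUpdate();
}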
After this series of handle*() methods, all the ClientResponses produced by NetworkClient.poll() have been collected into the responses list. The method then iterates over responses and invokes each ClientResponse's callback: for an error response the request is retried, and for a normal response each message's user-defined Callback is invoked. That callback is the RequestCompletionHandler object, whose onComplete() ultimately calls Sender.handleProduceResponse(). Source:
private void handleProduceResponse(ClientResponse response, Map<TopicPartition, RecordBatch> batches, long now) {
    int correlationId = response.request().request().header().correlationId();
    // For a ClientResponse produced by a disconnection, the request will be retried;
    // if it cannot be retried, the callback of each record in it is invoked
    if (response.wasDisconnected()) {
        log.trace("Cancelled request {} due to node {} being disconnected", response, response.request()
                .request()
                .destination());
        for (RecordBatch batch : batches.values())
            completeBatch(batch, Errors.NETWORK_EXCEPTION, -1L, Record.NO_TIMESTAMP, correlationId, now);
    } else {
        log.trace("Received produce response from node {} with correlation id {}",
                response.request().request().destination(),
                correlationId);
        // if we have a response, parse it
        if (response.hasResponse()) {
            ProduceResponse produceResponse = new ProduceResponse(response.responseBody());
            for (Map.Entry<TopicPartition, ProduceResponse.PartitionResponse> entry : produceResponse.responses().entrySet()) {
                TopicPartition tp = entry.getKey();
                ProduceResponse.PartitionResponse partResp = entry.getValue();
                Errors error = Errors.forCode(partResp.errorCode);
                RecordBatch batch = batches.get(tp);
                // Delegate to completeBatch() for the actual handling
                completeBatch(batch, error, partResp.baseOffset, partResp.timestamp, correlationId, now);
            }
            this.sensors.recordLatency(response.request().request().destination(), response.requestLatencyMs());
            this.sensors.recordThrottleTime(response.request().request().destination(),
                    produceResponse.getThrottleTime());
        } else {
            // this is the acks = 0 case, just complete all requests
            // Requests that expect no response are completed directly via completeBatch()
            for (RecordBatch batch : batches.values())
                completeBatch(batch, Errors.NONE, -1L, Record.NO_TIMESTAMP, correlationId, now);
        }
    }
}
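completeBatch() is where the retry-or-callback decision mentioned earlier is made. A simplified sketch (logging, metrics, and some error-specific handling omitted; canRetry() checks the attempt count against the retries config and whether the error is retriable):
private void completeBatch(RecordBatch batch, Errors error, long baseOffset, long timestamp,
                           long correlationId, long now) {
    if (error != Errors.NONE && canRetry(batch, error)) {
        // Retriable error and attempts remain: put the batch back into the accumulator
        this.accumulator.reenqueue(batch, now);
    } else {
        // Success, or a non-retriable/exhausted error: complete the batch, which fires
        // the user Callbacks, then release its ByteBuffer back to the pool
        batch.done(baseOffset, timestamp, error.exception());
        this.accumulator.deallocate(batch);
    }
    // Stale-metadata errors (e.g. NOT_LEADER_FOR_PARTITION) trigger a metadata refresh
    if (error.exception() instanceof InvalidMetadataException)
        metadata.requestUpdate();
}

private boolean canRetry(RecordBatch batch, Errors error) {
    return batch.attempts < this.retries && error.exception() instanceof RetriableException;
}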
References:
1. Apache Kafka源码剖析 (Apache Kafka Source Code Analysis), Xu Junming (徐郡明)