In the previous lecture we took a first look at the overall flow KafkaProducer follows when sending a message; from here on we will dissect each step in detail. To send a message at all, the producer first has to know the Kafka cluster's metadata, otherwise it has no idea where the message should go.
So in this lecture we analyze how KafkaProducer obtains that metadata.
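Before diving into the source, here is a minimal usage sketch showing where this matters: the first send() on a topic is what forces the producer to block until it has metadata for that topic. The broker address, topic name and class name below are made up for illustration.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MetadataDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address and topic name; adjust to your environment.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Block at most 3 seconds waiting for metadata (this is the maxBlockTimeMs in the code below).
        props.put("max.block.ms", "3000");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The first send() on a topic is the moment waitOnMetadata() blocks
            // until the cluster metadata for "demo-topic" is available.
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
        }
    }
}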
- Source Code Analysis -
When we walked through the core flow in the previous lecture, the very first statement was the following code:
// Step 1: block and wait until the cluster metadata is available
// maxBlockTimeMs is the maximum time we are allowed to block waiting for metadata
ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
Stepping into that method, we see the following code:
private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long maxWaitMs) throws InterruptedException {
    // add topic to metadata topic list if it is not there already and reset expiry
    // Register the topic whose metadata we need with the metadata object
    metadata.add(topic);
    // Fetch the current view of the Kafka cluster from the metadata object
    Cluster cluster = metadata.fetch();
    // Try to read this topic's partition count from that view
    Integer partitionsCount = cluster.partitionCountForTopic(topic);
    // Return cached metadata if we have it, and if the record's partition is either undefined
    // or within the known partition range
    // If partitionsCount is not null, the cached metadata already covers this topic, so we
    // return a ClusterAndWaitTime right away: cluster is the cluster metadata and 0 is the
    // time spent waiting for it (we hit the cache, so we spent no time at all).
    if (partitionsCount != null && (partition == null || partition < partitionsCount))
        return new ClusterAndWaitTime(cluster, 0);

    // If we get past this point, we have to go to the broker for metadata.
    // Record the current time
    long begin = time.milliseconds();
    // Remaining time we may still block; initially the maximum wait time
    long remainingWaitMs = maxWaitMs;
    // Time already spent
    long elapsed;
    // Issue metadata requests until we have metadata for the topic or maxWaitTimeMs is exceeded.
    // In case we already have cached metadata for the topic, but the requested partition is greater
    // than expected, issue an update request only once. This is necessary in case the metadata
    // is stale and the number of partitions for this topic has increased in the meantime.
    do {
        log.trace("Requesting metadata update for topic {}.", topic);
        // This call is important and does two things:
        // 1) it sets the needUpdate flag to true, which the metadata fetch relies on;
        // 2) it returns the current metadata version; every metadata refresh bumps this version.
        int version = metadata.requestUpdate();
        // TODO important
        // This wakes up the sender thread. From this we can tell that the component that
        // actually fetches the metadata is the sender thread. Remember where that thread was
        // created and started? In the KafkaProducer constructor we analyzed earlier. We will
        // come back to the sender code in a moment; for now, take it on faith that this makes
        // the sender thread fetch the metadata from the broker, and keep reading.
        sender.wakeup();
        try {
            // TODO wait for the metadata update
            metadata.awaitUpdate(version, remainingWaitMs);
        } catch (TimeoutException ex) {
            // Rethrow with original maxWaitMs to prevent logging exception with remainingWaitMs
            throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
        }
        // Fetch the (possibly refreshed) cluster metadata
        cluster = metadata.fetch();
        // Compute how long we have been waiting for metadata
        elapsed = time.milliseconds() - begin;
        // If we have waited longer than the maximum wait time, throw
        if (elapsed >= maxWaitMs)
            throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
        // If we are not authorized for this topic, also throw
        if (cluster.unauthorizedTopics().contains(topic))
            throw new TopicAuthorizationException(topic);
        remainingWaitMs = maxWaitMs - elapsed;
        // Read the topic's partition count from the metadata again
        partitionsCount = cluster.partitionCountForTopic(topic);
        // As long as the topic still has no partition info, the metadata has not arrived yet,
        // so keep looping.
    } while (partitionsCount == null);

    if (partition != null && partition >= partitionsCount) {
        throw new KafkaException(
                String.format("Invalid partition given with record: %d is not in the range [0...%d).", partition, partitionsCount));
    }

    // Return a ClusterAndWaitTime, which wraps two values:
    // the metadata we obtained, and how long it took to obtain it.
    return new ClusterAndWaitTime(cluster, elapsed);
}
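For reference, the ClusterAndWaitTime returned above is just a small value holder. Below is a rough standalone sketch of its shape; the real class is a private nested class of KafkaProducer, and the exact field names may vary between versions.

import org.apache.kafka.common.Cluster;

// Rough standalone approximation of the holder returned by waitOnMetadata();
// the real class is a private static nested class of KafkaProducer.
class ClusterAndWaitTime {
    final Cluster cluster;          // the metadata snapshot we ended up with
    final long waitedOnMetadataMs;  // how long send() blocked waiting for it

    ClusterAndWaitTime(Cluster cluster, long waitedOnMetadataMs) {
        this.cluster = cluster;
        this.waitedOnMetadataMs = waitedOnMetadataMs;
    }
}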
Now let's look at the awaitUpdate method of the metadata object:
public synchronized void awaitUpdate(final int lastVersion, final long maxWaitMs) throws InterruptedException {
    if (maxWaitMs < 0) {
        throw new IllegalArgumentException("Max time to wait for metadata updates should not be < 0 milli seconds");
    }
    long begin = System.currentTimeMillis();
    long remainingWaitMs = maxWaitMs;
    // Keep looping as long as the current metadata version is not yet newer than the version we
    // started from. From this condition you can already guess that the sender thread is the one
    // that really fetches the metadata: when it succeeds it bumps the version, which makes
    // version > lastVersion. So this.version <= lastVersion simply means "not updated yet".
    while (this.version <= lastVersion) {
        // If there is still time left to wait
        if (remainingWaitMs != 0)
            // ... block in wait(). Seeing a wait() here, we can safely guess that once the
            // sender thread has fetched the metadata it will wake this wait up again.
            wait(remainingWaitMs);
        // Time already spent
        long elapsed = System.currentTimeMillis() - begin;
        // If we have spent longer than the maximum wait time waiting for metadata, throw
        if (elapsed >= maxWaitMs)
            throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
        // Compute how much longer we may still wait
        remainingWaitMs = maxWaitMs - elapsed;
    }
}
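The version/wait/notifyAll handshake between the thread calling send() and the Sender thread can be boiled down to the following self-contained sketch. The class and method names here are invented for illustration and are not Kafka classes, but the control flow mirrors what awaitUpdate() and update() do:

// Minimal illustration of the handshake used by Metadata.awaitUpdate()/update():
// the waiting thread blocks until `version` moves past the version it started from,
// and the updater bumps the version and calls notifyAll(). All names are invented.
public class VersionedHandshakeDemo {
    private int version = 0;

    public synchronized int requestUpdate() {
        return version;                       // remember the version we need to move past
    }

    public synchronized void awaitUpdate(int lastVersion, long maxWaitMs) throws InterruptedException {
        long begin = System.currentTimeMillis();
        while (version <= lastVersion) {
            long remaining = maxWaitMs - (System.currentTimeMillis() - begin);
            if (remaining <= 0)
                throw new RuntimeException("timed out waiting for update");
            wait(remaining);                  // lock is released here and re-acquired after notifyAll()
        }
    }

    public synchronized void update() {
        version += 1;                         // a successful refresh bumps the version ...
        notifyAll();                          // ... and wakes up every waiter
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedHandshakeDemo handshake = new VersionedHandshakeDemo();
        int current = handshake.requestUpdate();
        Thread updater = new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) { }
            handshake.update();               // plays the role of the Sender thread
        });
        updater.start();
        handshake.awaitUpdate(current, 5000); // plays the role of the thread calling send()
        System.out.println("metadata 'updated', waiter released");
    }
}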
Having analyzed the code above, we now have to look at the Sender code. The code above merely waits for the result of the Sender's work; the object that actually fetches the metadata is the Sender.
You should remember that the Sender thread is started when KafkaProducer is initialized; let's recall that code:
NetworkClient client = new NetworkClient(
        new Selector(config.getLong(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), this.metrics, time, "producer", channelBuilder),
        this.metadata,
        clientId,
        config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION),
        config.getLong(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG),
        config.getInt(ProducerConfig.SEND_BUFFER_CONFIG),
        config.getInt(ProducerConfig.RECEIVE_BUFFER_CONFIG),
        this.requestTimeoutMs, time);
this.sender = new Sender(client,
        this.metadata,
        this.accumulator,
        config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) == 1,
        config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG),
        (short) parseAcks(config.getString(ProducerConfig.ACKS_CONFIG)),
        config.getInt(ProducerConfig.RETRIES_CONFIG),
        this.metrics,
        new SystemTime(),
        clientId,
        this.requestTimeoutMs);
String ioThreadName = "kafka-producer-network-thread" + (clientId.length() > 0 ? " | " + clientId : "");
// This thread wraps the sender object,
// and the sender object in turn holds the NetworkClient object.
this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
// Start the thread; next we analyze the sender's run method.
this.ioThread.start();
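A quick note on KafkaThread: it is only a thin wrapper around Thread. Conceptually it does little more than the simplified sketch below; this is an approximation, not the actual implementation, which also installs an uncaught-exception handler that logs errors.

// Simplified approximation of org.apache.kafka.common.utils.KafkaThread:
// a named thread, marked as daemon so it never keeps the JVM alive on its own.
class SimpleKafkaThread extends Thread {
    SimpleKafkaThread(String name, Runnable runnable, boolean daemon) {
        super(runnable, name);
        setDaemon(daemon);   // true for the producer's "kafka-producer-network-thread"
    }
}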
Since the Sender runs as a thread, let's look at its run method next.
public void run() {
    log.debug("Starting Kafka producer I/O thread.");
    // main loop, runs until close is called
    // Once started, keep looping
    while (running) {
        try {
            // TODO core code
            run(time.milliseconds());
        } catch (Exception e) {
            log.error("Uncaught error in kafka producer I/O thread: ", e);
        }
    }
}
Step into the run(long now) method:
void run(long now) {
    // Fetch the cluster metadata. The first time we get here there is no metadata yet;
    // in other words, cluster only holds useful information from the second pass of the
    // loop onwards. We can ignore the rest of this method for now: the code below is all
    // about actually sending messages, and since no metadata has been fetched yet it does
    // not really do anything. Because we only care about metadata right now, we jump
    // straight to the last statement of the method, which is where the metadata update happens.
    Cluster cluster = metadata.fetch();
    ......
    // TODO this call contains the metadata update logic.
    // client is the NetworkClient that was passed in when the Sender was created.
    this.client.poll(pollTimeout, now);
}
The poll method here is NetworkClient's poll:
public List<ClientResponse> poll(long timeout, long now) {
    // Step 1: TODO key code, builds the metadata request
    long metadataTimeout = metadataUpdater.maybeUpdate(now);
    try {
        // Step 2: send the metadata request
        // TODO core code. We will not analyze it in this lecture: you can guess that this is
        // the code that actually goes to the broker to fetch the metadata. It involves Kafka's
        // network layer, which is fairly complex, so we will analyze it later. For now it is
        // enough to know that this call sends the metadata request to the broker.
        this.selector.poll(Utils.min(timeout, metadataTimeout, requestTimeoutMs));
    } catch (IOException e) {
        log.error("Unexpected error during I/O", e);
    }

    // process completed actions
    long updatedNow = this.time.milliseconds();
    List<ClientResponse> responses = new ArrayList<>();
    handleCompletedSends(responses, updatedNow);
    // Step 3: handle the metadata the broker sent back.
    // This is where responses from the broker are picked up, for example the
    // metadata response when we asked for metadata.
    handleCompletedReceives(responses, updatedNow);
    handleDisconnections(responses, updatedNow);
    handleConnections();
    handleTimedOutRequests(responses, updatedNow);

    // invoke callbacks
    for (ClientResponse response : responses) {
        if (response.request().hasCallback()) {
            try {
                response.request().callback().onComplete(response);
            } catch (Exception e) {
                log.error("Uncaught error in request completion:", e);
            }
        }
    }

    return responses;
}
I have split the code above into three steps:
(1) Build the metadata request
(2) Send the metadata request
(3) Handle the metadata response returned by the broker
Let's now walk through these three steps.
(1) Build the metadata request
@Override
public long maybeUpdate(long now) {
    ......
    if (metadataTimeout == 0) {
        // Beware that the behavior of this method and the computation of timeouts for poll() are
        // highly dependent on the behavior of leastLoadedNode.
        Node node = leastLoadedNode(now);
        // TODO trigger the metadata update
        maybeUpdate(now, node);
    }
    return metadataTimeout;
}

private void maybeUpdate(long now, Node node) {
    if (node == null) {
        log.debug("Give up sending metadata request since no node is available");
        // mark the timestamp for no node available to connect
        this.lastNoNodeAvailableMs = now;
        return;
    }
    String nodeConnectionId = node.idString();
    // Check whether we are allowed to send a request to this node, which essentially means
    // checking whether the network connection has been established. We do not need to dig into
    // how the connection is established here; just assume the connection is already in place.
    if (canSendRequest(nodeConnectionId)) {
        this.metadataFetchInProgress = true;
        MetadataRequest metadataRequest;
        // Do we need the metadata for all topics?
        if (metadata.needMetadataForAllTopics())
            // This builds a request for the metadata of every topic. Normally, however,
            // Kafka only refreshes the metadata of the topics it actually uses, so in our
            // case we take the else branch.
            metadataRequest = MetadataRequest.allTopics();
        else
            // Build a metadata request for just the topics we know about
            metadataRequest = new MetadataRequest(new ArrayList<>(metadata.topics()));
        // Wrap it into a network request
        ClientRequest clientRequest = request(now, nodeConnectionId, metadataRequest);
        log.debug("Sending metadata request {} to node {}", metadataRequest, node.id());
        // Send the metadata request. We will not follow it further here, because we have not
        // analyzed Kafka's network layer yet; for the metadata flow this is as far as we need to go.
        doSend(clientRequest, now);
    } else if (connectionStates.canConnect(nodeConnectionId, now)) {
        // we don't have a connection to this node right now, make one
        log.debug("Initialize connection to node {} for sending metadata request", node.id());
        initiateConnect(node, now);
        // If initiateConnect failed immediately, this node will be put into blackout and we
        // should allow immediately retrying in case there is another candidate node. If it
        // is still connecting, the worst case is that we end up setting a longer timeout
        // on the next round and then wait for the response.
    } else { // connected, but can't send more OR connecting
        // In either case, we just need to wait for a network event to let us know the selected
        // connection might be usable again.
        this.lastNoNodeAvailableMs = now;
    }
}
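leastLoadedNode deserves a short note: it picks the broker that currently has the fewest in-flight requests (preferring nodes that are already connected), so the metadata request goes to the least busy node. The following is a heavily simplified, hypothetical illustration of that idea, not the real implementation, which also takes connection state and reconnect backoff into account:

import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified illustration of the leastLoadedNode idea:
// among candidate node ids, pick the one with the fewest in-flight requests.
// The real NetworkClient version also checks connection state and reconnect backoff.
class LeastLoadedNodePicker {
    static String pick(List<String> nodeIds, Map<String, Integer> inFlightRequests) {
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (String id : nodeIds) {
            int load = inFlightRequests.getOrDefault(id, 0);
            if (load < bestLoad) {
                bestLoad = load;
                best = id;     // the node with the fewest in-flight requests wins
            }
        }
        return best;           // null only if there are no candidate nodes at all
    }

    public static void main(String[] args) {
        // Node "1" has no in-flight requests, so it is chosen.
        System.out.println(pick(List.of("0", "1", "2"), Map.of("0", 3, "1", 0, "2", 5)));
    }
}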
(2) Send the metadata request
this.selector.poll(xxx)
This call is the code that actually sends the request to the broker. It involves Kafka's network layer, which we have not analyzed yet, so we skip its internals for now; all we need to know here is that this code sends the metadata request to the broker.
(3) Handle the metadata response returned by the broker
private void handleCompletedReceives(List<ClientResponse> responses, long now) {
    for (NetworkReceive receive : this.selector.completedReceives()) {
        String source = receive.source();
        ClientRequest req = inFlightRequests.completeNext(source);
        Struct body = parseResponse(receive.payload(), req.request().header());
        // If this is a metadata response, hand it to the metadata updater for processing
        if (!metadataUpdater.maybeHandleCompletedReceive(req, now, body))
            responses.add(new ClientResponse(req, now, false, body));
    }
}
The key code:
@Override
public boolean maybeHandleCompletedReceive(ClientRequest req, long now, Struct body) {
    short apiKey = req.request().header().apiKey();
    if (apiKey == ApiKeys.METADATA.id && req.isInitiatedByNetworkClient()) {
        // Handle the metadata response
        handleResponse(req.request().header(), body, now);
        return true;
    }
    return false;
}

private void handleResponse(RequestHeader header, Struct body, long now) {
    this.metadataFetchInProgress = false;
    // Parse the response
    MetadataResponse response = new MetadataResponse(body);
    // Extract the cluster metadata from it
    Cluster cluster = response.cluster();
    // check if any topics metadata failed to get updated
    Map<String, Errors> errors = response.errors();
    if (!errors.isEmpty())
        log.warn("Error while fetching metadata with correlation id {} : {}", header.correlationId(), errors);

    // don't update the cluster if there are no valid nodes...the topic we want may still be in the process of being
    // created which means we will get errors and no nodes until it exists
    // If the metadata actually contains nodes
    if (cluster.nodes().size() > 0) {
        // Update the local metadata with the new cluster information
        // TODO important
        this.metadata.update(cluster, now);
    } else {
        log.trace("Ignoring empty metadata response with correlation id {}.", header.correlationId());
        this.metadata.failedUpdate(now);
    }
}
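Before we dive into update() itself, it helps to see what a Cluster snapshot gives the producer once it is stored. Below is a small hedged sketch of inspecting a Cluster; it is built here only from a made-up bootstrap address, so it contains no topic information yet, which is exactly the situation waitOnMetadata() starts from.

import java.net.InetSocketAddress;
import java.util.Collections;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Node;

public class ClusterInspectDemo {
    public static void main(String[] args) {
        // A bootstrap cluster built from a made-up address; a real metadata response
        // would be turned into a much richer Cluster by MetadataResponse.cluster().
        Cluster cluster = Cluster.bootstrap(
                Collections.singletonList(new InetSocketAddress("localhost", 9092)));

        for (Node node : cluster.nodes())
            System.out.println("known node: " + node.host() + ":" + node.port());

        // For a freshly bootstrapped cluster this prints null: there is no topic metadata yet,
        // which is exactly why waitOnMetadata() has to block and wait for an update.
        System.out.println("partitions for demo-topic: " + cluster.partitionCountForTopic("demo-topic"));
    }
}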
Next we analyze the update method called here. If you paid attention earlier, this method was already called once during KafkaProducer initialization, so this is the second time it is invoked:
public synchronized void update(Cluster cluster, long now) {
    Objects.requireNonNull(cluster, "cluster should not be null");

    this.needUpdate = false;
    this.lastRefreshMs = now;
    this.lastSuccessfulRefreshMs = now;
    // Every time we enter this method the metadata version is incremented
    this.version += 1;

    // topicExpiryEnabled defaults to true
    if (topicExpiryEnabled) {
        // Handle expiry of topics from the metadata refresh set.
        for (Iterator<Map.Entry<String, Long>> it = topics.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<String, Long> entry = it.next();
            long expireMs = entry.getValue();
            if (expireMs == TOPIC_EXPIRY_NEEDS_UPDATE)
                entry.setValue(now + TOPIC_EXPIRY_MS);
            else if (expireMs <= now) {
                it.remove();
                log.debug("Removing unused topic {} from the metadata list, expiryMs {} now {}", entry.getKey(), expireMs, now);
            }
        }
    }

    for (Listener listener: listeners)
        listener.onMetadataUpdate(cluster);

    String previousClusterId = cluster.clusterResource().clusterId();

    // needMetadataForAllTopics defaults to false. So the first time the producer passes through
    // here, during initialization, it does not actually fetch any cluster metadata; and on this
    // second call we again take the else branch below and simply store the new cluster.
    if (this.needMetadataForAllTopics) {
        // the listener may change the interested topics, which could cause another metadata refresh.
        // If we have already fetched all topics, however, another fetch should be unnecessary.
        this.needUpdate = false;
        // Store the metadata we just obtained
        this.cluster = getClusterForCurrentTopics(cluster);
    } else {
        this.cluster = cluster;
    }

    // The bootstrap cluster is guaranteed not to have any useful information
    if (!cluster.isBootstrapConfigured()) {
        String clusterId = cluster.clusterResource().clusterId();
        if (clusterId == null ? previousClusterId != null : !clusterId.equals(previousClusterId))
            log.info("Cluster ID: {}", cluster.clusterResource().clusterId());
        clusterResourceListeners.onUpdate(cluster.clusterResource());
    }

    // TODO note the notifyAll here. Why is there a wake-up call at this point?
    // If you remember, back in the send path there is a thread blocked in wait(), waiting for
    // the metadata update. The notifyAll here wakes that thread up once the metadata has been
    // obtained, so the code that was waiting for the update can continue.
    notifyAll();
    log.debug("Updated cluster metadata version {} to {}", this.version, this.cluster);
}
At this point the cluster metadata has been obtained. The flow is fairly involved, though, so I have summarized the code above in a flow chart; use it as a companion when you work through the analysis.
- Summary -
In this lecture we analyzed how the KafkaProducer side obtains metadata from the broker. This involves Kafka's network requests: to fetch metadata, the producer has to send a request to the broker and the broker has to send a response back. We deliberately skipped the networking code here, because Kafka's network layer is one of its core modules and we will devote a separate module to analyzing it in detail.
In other words, the first step of the KafkaProducer send flow, fetching metadata, is now fully analyzed. Following the rhythm of the previous lecture, the next topic would be key/value serialization, but there is not much to analyze there. So in the next lecture we will look at how KafkaProducer chooses an appropriate partition based on the key.
That is, we will analyze this code:
// Step 3: choose an appropriate partition
int partition = partition(record, serializedKey, serializedValue, cluster);