Kafka Source Code Illustrated: Architecture of the NetworkClient Network Communication Component (Part 2)
02.2.7 leastLoadedNode()
/**
 * Choose the node with the fewest outstanding requests which is at least eligible for connection. This method will
 * prefer a node with an existing connection, but will potentially choose a node for which we don't yet have a
 * connection if all existing connections are in use. If no connection exists, this method will prefer a node
 * with least recent connection attempts. This method will never choose a node for which there is no
 * existing connection and from which we have disconnected within the reconnect backoff period, or an active
 * connection which is being throttled.
 *
 * @return The node with the fewest in-flight requests.
 */
@Override
public Node leastLoadedNode(long now) {
    // Fetch all nodes from the metadata
    List<Node> nodes = this.metadataUpdater.fetchNodes();
    if (nodes.isEmpty())
        throw new IllegalStateException("There are no nodes in the Kafka cluster");
    int inflight = Integer.MAX_VALUE;
    Node foundConnecting = null;
    Node foundCanConnect = null;
    Node foundReady = null;
    // Start from a random offset so the same node is not always examined first
    int offset = this.randOffset.nextInt(nodes.size());
    for (int i = 0; i < nodes.size(); i++) {
        int idx = (offset + i) % nodes.size();
        Node node = nodes.get(idx);
        // Can a request be sent to this node right now?
        if (canSendRequest(node.idString(), now)) {
            // Number of in-flight requests queued for this node
            int currInflight = this.inFlightRequests.count(node.idString());
            // Zero in-flight requests means minimal load: return this node immediately
            if (currInflight == 0) {
                // if we find an established connection with no in-flight requests we can stop right away
                log.trace("Found least loaded node {} connected with no in-flight requests", node);
                return node;
            } else if (currInflight < inflight) { // fewer in-flight requests than the best candidate so far
                // otherwise if this is the best we have found so far, record that
                inflight = currInflight;
                foundReady = node;
            }
        } else if (connectionStates.isPreparingConnection(node.idString())) {
            foundConnecting = node;
        } else if (canConnect(node, now)) {
            if (foundCanConnect == null ||
                    this.connectionStates.lastConnectAttemptMs(foundCanConnect.idString()) >
                            this.connectionStates.lastConnectAttemptMs(node.idString())) {
                foundCanConnect = node;
            }
        } else {
            log.trace("Removing node {} from least loaded node selection since it is neither ready " +
                    "for sending or connecting", node);
        }
    }
    // We prefer established connections if possible. Otherwise, we will wait for connections
    // which are being established before connecting to new nodes.
    if (foundReady != null) {
        log.trace("Found least loaded node {} with {} inflight requests", foundReady, inflight);
        return foundReady;
    } else if (foundConnecting != null) {
        log.trace("Found least loaded connecting node {}", foundConnecting);
        return foundConnecting;
    } else if (foundCanConnect != null) {
        log.trace("Found least loaded node {} with no active connection", foundCanConnect);
        return foundCanConnect;
    } else {
        log.trace("Least loaded node selection failed to find an available node");
        return null;
    }
}
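This method selects the least loaded node, as illustrated by the figure in the original article. The preference order is: a ready node with the fewest in-flight requests, then a node whose connection is being established, then a node that can be connected to (the one whose last connection attempt is the oldest).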
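The selection order can be reproduced in isolation. The following is a minimal sketch, not Kafka's implementation: `NodeState` and `LeastLoadedSketch` are hypothetical names, and the boolean flags stand in for the results of canSendRequest(), isPreparingConnection() and canConnect().

```java
import java.util.List;

// Hypothetical, simplified view of one broker node as seen by the client.
final class NodeState {
    final String id;
    final boolean ready;             // stands in for canSendRequest()
    final boolean connecting;        // stands in for isPreparingConnection()
    final boolean canConnect;        // stands in for canConnect()
    final int inFlight;              // requests sent but not yet answered
    final long lastConnectAttemptMs; // time of the last connection attempt

    NodeState(String id, boolean ready, boolean connecting, boolean canConnect,
              int inFlight, long lastConnectAttemptMs) {
        this.id = id;
        this.ready = ready;
        this.connecting = connecting;
        this.canConnect = canConnect;
        this.inFlight = inFlight;
        this.lastConnectAttemptMs = lastConnectAttemptMs;
    }
}

final class LeastLoadedSketch {
    // Returns the id of the chosen node, or null if no node is eligible.
    static String leastLoaded(List<NodeState> nodes, int offset) {
        int best = Integer.MAX_VALUE;
        NodeState foundReady = null;
        NodeState foundConnecting = null;
        NodeState foundCanConnect = null;
        for (int i = 0; i < nodes.size(); i++) {
            NodeState n = nodes.get((offset + i) % nodes.size());
            if (n.ready) {
                if (n.inFlight == 0) {
                    return n.id; // idle ready node: stop right away
                }
                if (n.inFlight < best) {
                    best = n.inFlight;
                    foundReady = n;
                }
            } else if (n.connecting) {
                foundConnecting = n;
            } else if (n.canConnect) {
                // Prefer the node whose last connection attempt is the oldest.
                if (foundCanConnect == null
                        || foundCanConnect.lastConnectAttemptMs > n.lastConnectAttemptMs) {
                    foundCanConnect = n;
                }
            }
        }
        NodeState chosen = foundReady != null ? foundReady
                : foundConnecting != null ? foundConnecting
                : foundCanConnect;
        return chosen == null ? null : chosen.id;
    }
}
```

The `offset` parameter plays the role of the random starting index in the original method, so that ties between equally loaded nodes are not always resolved in favor of the first node in the list.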
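03 Design of the InFlightRequests Collection
From the code analysis above, we know that the InFlightRequests collection caches the ClientRequest requests that have been sent but have not yet received a response. Underneath, it is implemented as a Map<String, Deque<NetworkClient.InFlightRequest>>, where the key is the NodeId and the value is the queue of requests sent to that Node. The queue size is configured by the max.in.flight.requests.per.connection parameter and defaults to 5. One double-ended queue is created per connection, which is how it controls the rate at which requests are sent.
It serves two purposes:
- Node health: it collects the requests that sit between "send started" and "response received", which is used to judge whether the target Broker node is healthy and whether requests or connections have timed out. In other words, it monitors whether the requests sent to each node are proceeding normally.
- Node load: once a Deque reaches a certain length, the corresponding Broker node is considered overloaded.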
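As a small illustration of where the queue size comes from, the sketch below builds a client configuration with plain java.util.Properties. The `InFlightConfigSketch` class and the "localhost:9092" address are placeholders chosen for this example; only the max.in.flight.requests.per.connection key comes from the text above.

```java
import java.util.Properties;

final class InFlightConfigSketch {
    // Builds a minimal set of client properties with the given in-flight limit.
    static Properties clientProps(int maxInFlight) {
        Properties props = new Properties();
        // Placeholder broker address, for illustration only.
        props.put("bootstrap.servers", "localhost:9092");
        // Upper bound on unanswered requests per connection (the article cites 5 as the default).
        props.put("max.in.flight.requests.per.connection", Integer.toString(maxInFlight));
        return props;
    }
}
```

With a value of 1, at most one unanswered request can be outstanding on a connection at any time, because the queue-size check in canSendMore() fails as soon as one request is queued.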
/**
 * The set of requests which have been sent or are being sent but haven't yet received a response
 * (caches the ClientRequest requests that have been sent, or are being sent, and have not yet received a response)
 */
final class InFlightRequests {
    // Maximum number of in-flight requests per connection
    private final int maxInFlightRequestsPerConnection;
    // Maps each Node to its Deque<NetworkClient.InFlightRequest>: the key is the NodeId, the value is the request queue
    private final Map<String, Deque<NetworkClient.InFlightRequest>> requests = new HashMap<>();
    /** Thread safe total number of in flight requests. */
    private final AtomicInteger inFlightRequestCount = new AtomicInteger(0);
    // Sets the maximum number of in-flight requests per connection
    public InFlightRequests(int maxInFlightRequestsPerConnection) {
        this.maxInFlightRequestsPerConnection = maxInFlightRequestsPerConnection;
    }
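Here we walk through the key methods in a scenario-driven way. When a new request needs to be sent, it is enqueued at the head of the queue. Requests are actually processed by dequeuing from the tail, which guarantees that requests enqueued earlier are processed first.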
03.1 canSendMore()
Let's first look at the send restriction. NetworkClient calls this method to determine whether it can still send requests to the specified Node.
/**
 * Can we send more requests to this node?
 * @param node Node in question
 * @return true iff we have no requests still being sent to the given node
 */
public boolean canSendMore(String node) {
    // Get the double-ended queue for this node
    Deque<NetworkClient.InFlightRequest> queue = requests.get(node);
    // Condition: queue is empty || (head request fully sent && queue size below the per-connection limit)
    return queue == null || queue.isEmpty() ||
            (queue.peekFirst().send.completed() && queue.size() < this.maxInFlightRequestsPerConnection);
}
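The code above shows the restriction: although the queue can hold multiple requests, a new request may be added only if the previous request has been sent completely.
The conditions are evaluated as follows:
- queue == null || queue.isEmpty(): if the queue is empty, sending is allowed.
- queue.peekFirst().send.completed(): checks whether the request at the head of the queue has finished sending.
● If the head request cannot be sent for a long time, the network is probably at fault, so no more requests should be sent to this Node.
● The head request and the KafkaChannel.send field point to the same request. To avoid overwriting a message that has not been sent yet, KafkaChannel.send must not be pointed at a new request.
- queue.size() < this.maxInFlightRequestsPerConnection: checks whether too many requests have piled up in the queue. If a Node already has many unanswered requests, the node is probably suffering from network congestion, and sending more requests may lead to timeouts.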
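The three conditions can be exercised with a small stand-alone model. This is a sketch under simplifying assumptions: `CanSendMoreSketch` and `Pending` are hypothetical names, and the `sendCompleted` flag replaces the real send.completed() call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class CanSendMoreSketch {
    // Hypothetical stand-in for an in-flight request: it only tracks
    // whether its bytes have been fully written to the network.
    static final class Pending {
        boolean sendCompleted;

        Pending(boolean sendCompleted) {
            this.sendCompleted = sendCompleted;
        }
    }

    private final int maxInFlight;
    private final Map<String, Deque<Pending>> requests = new HashMap<>();

    CanSendMoreSketch(int maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    // Newest request goes to the head of the queue, as in add().
    void add(String node, Pending request) {
        requests.computeIfAbsent(node, k -> new ArrayDeque<>()).addFirst(request);
    }

    // Same three-part condition as canSendMore().
    boolean canSendMore(String node) {
        Deque<Pending> queue = requests.get(node);
        return queue == null || queue.isEmpty()
                || (queue.peekFirst().sendCompleted && queue.size() < maxInFlight);
    }
}
```

With a limit of 2, a node with no queue can be sent to, a node whose newest request is still being written cannot, and a node with two queued requests cannot either, even if both were fully written.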
03.2 add(): Enqueue
/**
 * Add the given request to the queue for the connection it was directed to
 * (the request is added at the head of the queue)
 */
public void add(NetworkClient.InFlightRequest request) {
    // The Broker node this request is destined for
    String destination = request.destination;
    // Look up the double-ended queue for the destination Node in the requests map
    Deque<NetworkClient.InFlightRequest> reqs = this.requests.get(destination);
    // If there is no queue for this node yet
    if (reqs == null) {
        // Create an ArrayDeque
        reqs = new ArrayDeque<>();
        // Register the mapping from the destination Node to the new queue
        this.requests.put(destination, reqs);
    }
    // Add the request at the head of the queue
    reqs.addFirst(request);
    // Increment the counter
    inFlightRequestCount.incrementAndGet();
}
03.3 completeNext(): Dequeue the Oldest Request
/**
 * Get the oldest request (the one that will be completed next) for the given node
 */
public NetworkClient.InFlightRequest completeNext(String node) {
    // Get the request queue for the given Node and dequeue from the tail
    NetworkClient.InFlightRequest inFlightRequest = requestQueue(node).pollLast();
    // Decrement the counter
    inFlightRequestCount.decrementAndGet();
    return inFlightRequest;
}
Comparing the enqueue and dequeue methods: add() puts the request at the head of the queue via addFirst(), so the request at the tail is the oldest one and should be handled first. That is why completeNext() uses pollLast() to remove the oldest request element from the queue for processing.
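The ordering guarantee follows directly from how ArrayDeque behaves. The sketch below uses plain strings in place of requests; `DequeOrderSketch` is a hypothetical name for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class DequeOrderSketch {
    // Enqueues at the head (like add()) and drains from the tail
    // (like completeNext()), returning the order of completion.
    static List<String> completionOrder(List<String> sendOrder) {
        Deque<String> queue = new ArrayDeque<>();
        for (String request : sendOrder) {
            queue.addFirst(request);
        }
        List<String> completed = new ArrayList<>();
        while (!queue.isEmpty()) {
            completed.add(queue.pollLast());
        }
        return completed;
    }
}
```

The completion order equals the send order, which is what is needed if responses on a connection are expected to arrive in the order the requests were sent.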
03.4 lastSent(): Get the Most Recent Request
/**
 * Get the last request we sent to the given node (but don't remove it from the queue)
 * @param node The node id
 */
public NetworkClient.InFlightRequest lastSent(String node) {
    return requestQueue(node).peekFirst();
}
03.5 completeLastSent(): Dequeue the Most Recent Request
/**
 * Complete the last request that was sent to a particular node.
 * @param node The node the request was sent to
 * @return The request
 */
public NetworkClient.InFlightRequest completeLastSent(String node) {
    // Get the request queue for the given Node and dequeue from the head
    NetworkClient.InFlightRequest inFlightRequest = requestQueue(node).pollFirst();
    // Decrement the counter
    inFlightRequestCount.decrementAndGet();
    return inFlightRequest;
}
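The difference between lastSent(), completeLastSent() and completeNext() comes down to which end of the deque is touched and whether the element is removed. The sketch below shows this with strings; `LastSentSketch` is a hypothetical name. A plausible use for removing the newest request (an assumption here, since the calling code is not shown in this section) is a request that does not expect a response and can be completed as soon as it has been written.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class LastSentSketch {
    private final Deque<String> queue = new ArrayDeque<>();

    // Newest request goes to the head.
    void add(String request) {
        queue.addFirst(request);
    }

    // Looks at the newest request without removing it.
    String lastSent() {
        return queue.peekFirst();
    }

    // Removes and returns the newest request.
    String completeLastSent() {
        return queue.pollFirst();
    }

    // Removes and returns the oldest request.
    String completeNext() {
        return queue.pollLast();
    }

    int size() {
        return queue.size();
    }
}
```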
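Finally, let's look at InFlightRequest, the element type stored in the queues. It represents a request that is in flight and holds all the information about the request captured before it was sent.
It also knows how to produce a ClientResponse. When a response is received normally, completed() builds the corresponding ClientResponse from the response content. When the connection is dropped unexpectedly, disconnected() builds a ClientResponse object as well. The code is as follows: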
static class InFlightRequest {
    // Request header
    final RequestHeader header;
    // The Broker node this request is destined for
    final String destination;
    // Callback
    final RequestCompletionHandler callback;
    // Whether a response is expected
    final boolean expectResponse;
    // Request body
    final AbstractRequest request;
    final boolean isInternalRequest; // used to flag requests which are initiated internally by NetworkClient
    // Serialized form of the request
    final Send send;
    // Time the request was sent
    final long sendTimeMs;
    // Time the request was created, i.e. the creation time of the ClientRequest
    final long createdTimeMs;
    // Request timeout
    final long requestTimeoutMs;
    .....
    /**
     * Called when a response is received: builds a ClientResponse from the response content
     */
    public ClientResponse completed(AbstractResponse response, long timeMs) {
        return new ClientResponse(header, callback, destination, createdTimeMs, timeMs,
                false, null, null, response);
    }
    /**
     * Called when the connection is dropped unexpectedly: also builds a ClientResponse
     */
    public ClientResponse disconnected(long timeMs, AuthenticationException authenticationException) {
        return new ClientResponse(header, callback, destination, createdTimeMs, timeMs,
                true, null, authenticationException, null);
    }
}
For the code omitted in the middle, please see the author's 星球 (Planet) community.
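The two factory methods differ only in the disconnected flag and in whether a response body is attached. The following simplified model makes that visible; `ResponseSketch`, `Request` and `Response` are hypothetical types for this illustration, and the latency field computed from the creation time is an assumption of the sketch rather than something shown in the excerpt.

```java
final class ResponseSketch {
    // Hypothetical, reduced counterpart of ClientResponse.
    static final class Response {
        final String destination;
        final boolean disconnected;
        final String body;    // null when the connection was lost
        final long latencyMs; // assumed here: received time minus creation time

        Response(String destination, boolean disconnected, String body, long latencyMs) {
            this.destination = destination;
            this.disconnected = disconnected;
            this.body = body;
            this.latencyMs = latencyMs;
        }
    }

    // Hypothetical, reduced counterpart of InFlightRequest.
    static final class Request {
        final String destination;
        final long createdTimeMs;

        Request(String destination, long createdTimeMs) {
            this.destination = destination;
            this.createdTimeMs = createdTimeMs;
        }

        // Normal path: a response body arrived.
        Response completed(String body, long timeMs) {
            return new Response(destination, false, body, timeMs - createdTimeMs);
        }

        // Failure path: the connection dropped, so there is no body.
        Response disconnected(long timeMs) {
            return new Response(destination, true, null, timeMs - createdTimeMs);
        }
    }
}
```

Returning a response object on both paths lets the caller handle success and disconnection through the same callback mechanism, checking the disconnected flag to tell them apart.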
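04 The Complete Request Flow, End to End
A complete request goes through the following stages:
- Call NetworkClient's ready() to connect to the server.
- Call NetworkClient's poll() to process the connection.
- Call NetworkClient's newClientRequest() to create the ClientRequest.
- Then call NetworkClient's send() to send the request.
- Finally, call NetworkClient's poll() to process the response.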
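The five stages can be traced with a toy client that performs no real I/O. This is only a sketch of the call order: `ToyClient` and its methods are hypothetical and deliberately much simpler than NetworkClient, with strings standing in for requests and responses.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class RequestFlowSketch {
    // Hypothetical toy client: no sockets, only state transitions.
    static final class ToyClient {
        private boolean connecting;
        private boolean connected;
        private final Deque<String> inFlight = new ArrayDeque<>();

        // Returns true if already connected, otherwise starts connecting.
        boolean ready() {
            if (connected) {
                return true;
            }
            connecting = true;
            return false;
        }

        // Finishes a pending connection and answers every in-flight request.
        List<String> poll() {
            if (connecting) {
                connecting = false;
                connected = true;
            }
            List<String> responses = new ArrayList<>();
            while (!inFlight.isEmpty()) {
                responses.add("response-to-" + inFlight.pollLast());
            }
            return responses;
        }

        // Queues a request; only allowed on an established connection.
        void send(String request) {
            if (!connected) {
                throw new IllegalStateException("node is not ready");
            }
            inFlight.addFirst(request);
        }
    }

    static List<String> run() {
        ToyClient client = new ToyClient();
        if (!client.ready()) {             // stage 1: initiate the connection
            client.poll();                 // stage 2: let poll() finish connecting
        }
        String request = "metadata-request"; // stage 3: create the request
        client.send(request);              // stage 4: queue it for sending
        return client.poll();              // stage 5: poll() returns the response
    }
}
```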
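04.1 Creating the Connection
Before NetworkClient sends a request, it must first establish a connection to the Broker. NetworkClient is responsible for managing all connections to the cluster.
04.2 Building the Request
04.3 Sending the Request
04.4 Handling the Response
04.4.1 Request Send Completed
04.4.2 Response Received
04.4.3 Processing the Response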
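05 Summary
Let's recap the key points of this article.
1. We opened with an overview: the Sender thread first stages a message in the send field of KafkaChannel, and the actual network I/O is performed only when poll() is called. This led to the NetworkClient component, which provides network I/O capability for the client.
2. We took a deep dive into the implementation details of the NetworkClient component, InFlightRequests, and ClusterConnectionStates.
3. Finally, we tied together the whole flow of sending a request and handling its response, to give you a better overall picture.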
Reposted from the WeChat official account: 华仔聊技术