This article is published on the NetEase Cloud community with the authorization of its author, Zhao Jigang.





I. Provider-side heartbeat mechanism


-->openServer(URL url)
   url: dubbo://10.10.10.10:20880/com.alibaba.dubbo.demo.DemoService?anyhost=true&application=demo-provider&bind.ip=10.10.10.10&bind.port=20880&default.server=netty4&dubbo=2.0.0&generic=false&interface=com.alibaba.dubbo.demo.DemoService&methods=sayHello&pid=21999&qos.port=22222&side=provider&timestamp=1520660491836
  -->createServer(URL url)
    -->HeaderExchanger.bind(URL url, ExchangeHandler handler)
       url: dubbo://10.10.10.10:20880/com.alibaba.dubbo.demo.DemoService?anyhost=true&application=demo-provider&bind.ip=10.10.10.10&bind.port=20880&channel.readonly.sent=true&codec=dubbo&default.server=netty4&dubbo=2.0.0&generic=false&heartbeat=60000&interface=com.alibaba.dubbo.demo.DemoService&methods=sayHello&pid=21999&qos.port=22222&side=provider&timestamp=1520660491836
       handler: DubboProtocol.requestHandler
      -->new DecodeHandler(new HeaderExchangeHandler(handler))
        -->NettyTransporter.bind(URL url, ChannelHandler listener)
           listener: the DecodeHandler instance above
          -->new NettyServer(URL url, ChannelHandler handler)
            -->ChannelHandler.wrapInternal(ChannelHandler handler, URL url)
               handler: the DecodeHandler instance above
            -->doOpen() // start the netty server
      -->new HeaderExchangeServer(Server server)
         server: the NettyServer above
        -->startHeatbeatTimer()


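When the provider starts its netty server, createServer looks up the heartbeat setting in the url's parameters map and adds the default value if none is present. The code is as follows: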


    private ExchangeServer createServer(URL url) {

        ...

        url = url.addParameterIfAbsent(Constants.HEARTBEAT_KEY, String.valueOf(Constants.DEFAULT_HEARTBEAT));

        ...

        ExchangeServer server;
        try {
            server = Exchangers.bind(url, requestHandler);
        } catch (RemotingException e) {
            throw new RpcException("Fail to start server(url: " + url + ") " + e.getMessage(), e);
        }

        ...

        return server;
    }


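Here int DEFAULT_HEARTBEAT = 60 * 1000, so when the user has not configured heartbeat (the heartbeat interval, in milliseconds), the default is heartbeat=60s (that is, if nothing is received for 60s, a heartbeat message is sent). So how is heartbeat configured?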

Provider side:

    <dubbo:service ...>
        <dubbo:parameter key="heartbeat" value="3000"/>
    </dubbo:service>

Consumer side:

    <dubbo:reference ...>
        <dubbo:parameter key="heartbeat" value="3000"/>
    </dubbo:reference>
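To illustrate how a configured value interacts with the default, here is a minimal sketch. It does not use Dubbo's URL class; a plain Map stands in for the url's parameters map, and the class HeartbeatDefaults and the method effectiveHeartbeat are hypothetical names introduced only for this example. It assumes that addParameterIfAbsent behaves as its name suggests, adding the value only when the key is missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class, not part of Dubbo.
final class HeartbeatDefaults {
    // Same value as Constants.DEFAULT_HEARTBEAT quoted in the article: 60 * 1000 ms.
    static final int DEFAULT_HEARTBEAT = 60 * 1000;

    // Assumption: the default is added only if no "heartbeat" parameter exists,
    // so a user-configured value always wins.
    static int effectiveHeartbeat(Map<String, String> urlParameters) {
        Map<String, String> params = new HashMap<>(urlParameters);
        params.putIfAbsent("heartbeat", String.valueOf(DEFAULT_HEARTBEAT));
        return Integer.parseInt(params.get("heartbeat"));
    }

    public static void main(String[] args) {
        Map<String, String> configured = new HashMap<>();
        configured.put("heartbeat", "3000");
        System.out.println(effectiveHeartbeat(configured));      // configured value
        System.out.println(effectiveHeartbeat(new HashMap<>())); // default value
    }
}
```

With the configuration shown above, the effective interval would be 3000 ms; without it, the 60000 ms default applies.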

Back to the call chain. When execution reaches this statement:

    ChannelHandler.wrapInternal(ChannelHandler handler, URL url)

a handler chain is built. The chain looks like this:


MultiMessageHandler
-->handler: HeartbeatHandler
   -->handler: AllChannelHandler
         -->url: providerUrl
         -->executor: FixedExecutor
         -->handler: DecodeHandler
            -->handler: HeaderExchangeHandler
               -->handler: ExchangeHandlerAdapter(DubboProtocol.requestHandler)


This is also the path a request takes after netty receives it. Note that it contains a HeartbeatHandler.
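The nesting of this chain can be sketched with simple wrapper objects. The types below (SketchHandler, NamedHandler, HeartbeatStage, HandlerChainSketch) are hypothetical and far simpler than Dubbo's ChannelHandler implementations; messages are plain strings. The sketch also assumes that heartbeat events are consumed at the HeartbeatHandler stage and never reach the business handler, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a channel handler.
interface SketchHandler {
    void received(String message, List<String> trace);
}

// Records its own name, then delegates to the handler it wraps.
class NamedHandler implements SketchHandler {
    private final String name;
    private final SketchHandler next; // null marks the end of the chain

    NamedHandler(String name, SketchHandler next) {
        this.name = name;
        this.next = next;
    }

    @Override
    public void received(String message, List<String> trace) {
        trace.add(name);
        if (!consumes(message) && next != null) {
            next.received(message, trace);
        }
    }

    // By default a stage passes every message on.
    boolean consumes(String message) {
        return false;
    }
}

// Assumption: heartbeat events stop at this stage.
final class HeartbeatStage extends NamedHandler {
    HeartbeatStage(SketchHandler next) {
        super("HeartbeatHandler", next);
    }

    @Override
    boolean consumes(String message) {
        return "HEARTBEAT".equals(message);
    }
}

final class HandlerChainSketch {
    // Builds the chain from the innermost handler outwards.
    static SketchHandler build() {
        SketchHandler chain = new NamedHandler("requestHandler", null);
        chain = new NamedHandler("HeaderExchangeHandler", chain);
        chain = new NamedHandler("DecodeHandler", chain);
        chain = new NamedHandler("AllChannelHandler", chain);
        chain = new HeartbeatStage(chain);
        return new NamedHandler("MultiMessageHandler", chain);
    }

    static List<String> dispatch(String message) {
        List<String> trace = new ArrayList<>();
        build().received(message, trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(dispatch("request"));   // visits all six stages
        System.out.println(dispatch("HEARTBEAT")); // stops at HeartbeatHandler
    }
}
```

The point of the wrapper design is that each concern (multi-message unpacking, heartbeat handling, thread dispatch, decoding) can be added or removed without touching the business handler at the end.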

Finally, new HeaderExchangeServer(Server server) is executed. Here is the source:


    public class HeaderExchangeServer implements ExchangeServer {
        /** Heartbeat timer executor */
        private final ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1,
                new NamedThreadFactory(
                        "dubbo-remoting-server-heartbeat",
                        true));
        /** NettyServer */
        private final Server server;
        // heartbeat timer
        private ScheduledFuture<?> heatbeatTimer;
        // heartbeat timeout (ms), default value is 0 , won't execute a heartbeat.
        private int heartbeat;
        private int heartbeatTimeout;
        private AtomicBoolean closed = new AtomicBoolean(false);

        public HeaderExchangeServer(Server server) {
            if (server == null) {
                throw new IllegalArgumentException("server == null");
            }
            this.server = server;
            this.heartbeat = server.getUrl().getParameter(Constants.HEARTBEAT_KEY, 0);
            this.heartbeatTimeout = server.getUrl().getParameter(Constants.HEARTBEAT_TIMEOUT_KEY, heartbeat * 3);
            if (heartbeatTimeout < heartbeat * 2) {
                throw new IllegalStateException("heartbeatTimeout < heartbeatInterval * 2");
            }
            startHeatbeatTimer();
        }

        private void startHeatbeatTimer() {
            stopHeartbeatTimer();
            if (heartbeat > 0) {
                heatbeatTimer = scheduled.scheduleWithFixedDelay(
                        new HeartBeatTask(new HeartBeatTask.ChannelProvider() {
                            public Collection<Channel> getChannels() {
                                return Collections.unmodifiableCollection(
                                        HeaderExchangeServer.this.getChannels());
                            }
                        }, heartbeat, heartbeatTimeout),
                        heartbeat, heartbeat, TimeUnit.MILLISECONDS);
            }
        }

        private void stopHeartbeatTimer() {
            try {
                ScheduledFuture<?> timer = heatbeatTimer;
                if (timer != null && !timer.isCancelled()) {
                    timer.cancel(true);
                }
            } catch (Throwable t) {
                logger.warn(t.getMessage(), t);
            } finally {
                heatbeatTimer = null;
            }
        }
    }


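When HeaderExchangeServer is created, it initializes heartbeat (the heartbeat interval) and heartbeatTimeout (the heartbeat response timeout: if nothing has come back within this time after heartbeats were sent, the corresponding handling is triggered).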


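  • heartbeat defaults to 0. As startHeatbeatTimer() shows, heartbeats are sent only when heartbeat > 0, and heartbeat is 0 if it cannot be found in the url's parameters map. However, as seen earlier, dubbo puts heartbeat=60s into the parameters map by default, so heartbeat is 60s here.
  • heartbeatTimeout defaults to heartbeat*3. (Reason: suppose one side sends a heartbeat request and the other side returns nothing within one heartbeat interval, neither a normal response nor a heartbeat response. The connection cannot yet be considered broken, because network jitter or similar may have caused, for example, a TCP retransmission timeout.)
  • scheduled is a scheduled executor with a single thread (the thread is named "dubbo-remoting-server-heartbeat-thread-*").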
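The rules for the two values can be condensed into a small sketch. HeartbeatSettings is a hypothetical class written for this example; it takes nullable Integer arguments in place of lookups in the url's parameters map, but applies the same defaults and the same validity check as the constructor quoted above.

```java
// Hypothetical illustration class, not part of Dubbo.
final class HeartbeatSettings {
    final int heartbeat;
    final int heartbeatTimeout;

    // A null argument stands for "parameter not present in the url".
    HeartbeatSettings(Integer configuredHeartbeat, Integer configuredTimeout) {
        this.heartbeat = configuredHeartbeat != null ? configuredHeartbeat : 0;
        this.heartbeatTimeout = configuredTimeout != null ? configuredTimeout : heartbeat * 3;
        // Same check as in the quoted constructor: the timeout must cover
        // at least two heartbeat intervals.
        if (heartbeatTimeout < heartbeat * 2) {
            throw new IllegalStateException("heartbeatTimeout < heartbeatInterval * 2");
        }
    }

    // Heartbeats are only sent when the interval is positive.
    boolean sendsHeartbeats() {
        return heartbeat > 0;
    }

    public static void main(String[] args) {
        HeartbeatSettings defaults = new HeartbeatSettings(60000, null);
        System.out.println(defaults.heartbeatTimeout);  // three times the interval
        System.out.println(new HeartbeatSettings(null, null).sendsHeartbeats());
    }
}
```

For example, heartbeat=3000 with an explicit timeout of 5000 would be rejected, because 5000 is less than two intervals.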

Next, the heartbeat timer task is started:


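  • First, if a heartbeat timer task already exists, it is cancelled.
  • Then a new task is scheduled on the scheduled executor: HeartBeatTask runs once every heartbeat milliseconds, and the first run happens one heartbeat after scheduling. Because scheduleWithFixedDelay is used, the interval is measured from the end of one run to the start of the next.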
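The stop-then-restart pattern described above can be tried in isolation with the JDK's ScheduledExecutorService. HeartbeatTimerSketch and its methods are hypothetical names; the sketch leaves out the HeartBeatTask itself and accepts any Runnable, so it only demonstrates the scheduling behaviour.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class, not part of Dubbo.
final class HeartbeatTimerSketch {
    private final ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1);
    private ScheduledFuture<?> timer;

    // Cancels any previous timer, then schedules the task only if the interval is positive.
    synchronized void start(Runnable task, long heartbeatMillis) {
        stop();
        if (heartbeatMillis > 0) {
            // Initial delay and repeat delay are both one heartbeat interval.
            timer = scheduled.scheduleWithFixedDelay(
                    task, heartbeatMillis, heartbeatMillis, TimeUnit.MILLISECONDS);
        }
    }

    synchronized void stop() {
        if (timer != null && !timer.isCancelled()) {
            timer.cancel(true);
        }
        timer = null;
    }

    synchronized boolean isRunning() {
        return timer != null;
    }

    void shutdown() {
        stop();
        scheduled.shutdownNow();
    }

    // Returns true if the task ran at least `times` times within five seconds.
    static boolean runsAtLeast(int times, long heartbeatMillis) {
        HeartbeatTimerSketch sketch = new HeartbeatTimerSketch();
        CountDownLatch latch = new CountDownLatch(times);
        try {
            sketch.start(latch::countDown, heartbeatMillis);
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            sketch.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran repeatedly: " + runsAtLeast(2, 50));
    }
}
```

Cancelling before rescheduling keeps at most one periodic task alive, which matters if the start method can be called more than once.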

Now let's look at the source of HeartBeatTask:


    final class HeartBeatTask implements Runnable {
        // channel provider: supplies all channels that need heartbeat checks
        private ChannelProvider channelProvider;
        private int heartbeat;
        private int heartbeatTimeout;

        HeartBeatTask(ChannelProvider provider, int heartbeat, int heartbeatTimeout) {
            this.channelProvider = provider;
            this.heartbeat = heartbeat;
            this.heartbeatTimeout = heartbeatTimeout;
        }

        public void run() {
            try {
                long now = System.currentTimeMillis();
                for (Channel channel : channelProvider.getChannels()) {
                    if (channel.isClosed()) {
                        continue;
                    }
                    try {
                        // time of the last read operation
                        Long lastRead = (Long) channel.getAttribute(
                                HeaderExchangeHandler.KEY_READ_TIMESTAMP);
                        // time of the last write operation
                        Long lastWrite = (Long) channel.getAttribute(
                                HeaderExchangeHandler.KEY_WRITE_TIMESTAMP);
                        // if the last read or the last write is older than heartbeat, send a heartbeat request
                        if ((lastRead != null && now - lastRead > heartbeat)
                                || (lastWrite != null && now - lastWrite > heartbeat)) {
                            Request req = new Request();
                            req.setVersion("2.0.0");
                            req.setTwoWay(true);
                            req.setEvent(Request.HEARTBEAT_EVENT);
                            channel.send(req);
                            if (logger.isDebugEnabled()) {
                                logger.debug("Send heartbeat to remote channel " + channel.getRemoteAddress()
                                        + ", cause: The channel has no data-transmission exceeds a heartbeat period: " + heartbeat + "ms");
                            }
                        }
                        // neither a normal message nor a heartbeat was received within heartbeatTimeout
                        if (lastRead != null && now - lastRead > heartbeatTimeout) {
                            logger.warn("Close channel " + channel
                                    + ", because heartbeat read idle time out: " + heartbeatTimeout + "ms");
                            // consumer side: reconnect
                            if (channel instanceof Client) {
                                try {
                                    ((Client) channel).reconnect();
                                } catch (Exception e) {
                                    //do nothing
                                }
                            } else { // provider side: close the connection
                                channel.close();
                            }
                        }
                    } catch (Throwable t) {
                        logger.warn("Exception when heartbeat to remote channel " + channel.getRemoteAddress(), t);
                    }
                }
            } catch (Throwable t) {
                logger.warn("Unhandled exception when heartbeat, cause: " + t.getMessage(), t);
            }
        }

        interface ChannelProvider {
            Collection<Channel> getChannels();
        }
    }
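The two timestamp checks in run() can be isolated as pure functions, which makes the thresholds easier to reason about. HeartbeatDecision and its method names are hypothetical; the conditions themselves are copied from the task above, including the treatment of a missing (null) timestamp as "no action".

```java
// Hypothetical illustration class, not part of Dubbo.
final class HeartbeatDecision {
    // True when the last read or the last write is older than one heartbeat interval.
    static boolean shouldSendHeartbeat(Long lastRead, Long lastWrite, long now, int heartbeat) {
        return (lastRead != null && now - lastRead > heartbeat)
                || (lastWrite != null && now - lastWrite > heartbeat);
    }

    // True when nothing has been read for longer than heartbeatTimeout; in the task
    // above this leads to a reconnect on the consumer or a close on the provider.
    static boolean isReadIdleTimeout(Long lastRead, long now, int heartbeatTimeout) {
        return lastRead != null && now - lastRead > heartbeatTimeout;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        int heartbeat = 3000;
        int heartbeatTimeout = heartbeat * 3;
        // Last read 4 s ago: a heartbeat is due, but the channel is not yet timed out.
        System.out.println(shouldSendHeartbeat(96_000L, 99_500L, now, heartbeat));
        System.out.println(isReadIdleTimeout(96_000L, now, heartbeatTimeout));
    }
}
```

Note that the two checks are independent: once the read idle time exceeds heartbeatTimeout, the task both sends a heartbeat and reconnects or closes the channel in the same run.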


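HeartBeatTask first calls channelProvider#getChannels to obtain all channels that need heartbeat checks. The channelProvider instance is the anonymous inner class that HeaderExchangeServer creates when it schedules the task on the scheduled executor.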

    new HeartBeatTask.ChannelProvider() {
        public Collection<Channel> getChannels() {
            return Collections.unmodifiableCollection(
                    HeaderExchangeServer.this.getChannels());
        }
    }