The NameServer keeps track of every broker, so that producers and consumers can look up correct broker information before doing their work.

It plays a role similar to ZooKeeper as a service-governance center, and older versions did in fact use ZooKeeper. Why was ZooKeeper dropped?

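  • In RocketMQ the masters are peers with respect to topic data: no single master holds all the data of a topic, so leader election serves no purpose
  • NameServer instances do not synchronize any information with each other, so rather than depending on the heavyweight ZooKeeper, it made more sense to build a lightweight NameServer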

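NameServer startup process

Parsing the configuration file

Entry class: NamesrvStartup

NamesrvStartup is mainly responsible for parsing the configuration file and creating the NamesrvController.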
org.apache.rocketmq.namesrv.NamesrvStartup

public static void main(String[] args) {
	main0(args);
}

public static NamesrvController main0(String[] args) {

	try {
		// Create the NamesrvController
		NamesrvController controller = createNamesrvController(args);

		// Call NamesrvController.initialize() and start Netty
		start(controller);
		String tip = "The Name Server boot success. serializeType=" + RemotingCommand.getSerializeTypeConfigInThisServer();
		log.info(tip);
		System.out.printf("%s%n", tip);
		return controller;
	} catch (Throwable e) {
		e.printStackTrace();
		System.exit(-1);
	}

	return null;
}

public static NamesrvController createNamesrvController(String[] args) throws IOException, JoranException {
	System.setProperty(RemotingCommand.REMOTING_VERSION_KEY, Integer.toString(MQVersion.CURRENT_VERSION));
	//PackageConflictDetect.detectFastjson();

	Options options = ServerUtil.buildCommandlineOptions(new Options());
	commandLine = ServerUtil.parseCmdLine("mqnamesrv", args, buildCommandlineOptions(options), new PosixParser());
	if (null == commandLine) {
		System.exit(-1);
		return null;
	}

	final NamesrvConfig namesrvConfig = new NamesrvConfig();
	final NettyServerConfig nettyServerConfig = new NettyServerConfig();
	nettyServerConfig.setListenPort(9876);
	if (commandLine.hasOption('c')) {
		// -c option: path to the configuration file
		String file = commandLine.getOptionValue('c');
		if (file != null) {
			InputStream in = new BufferedInputStream(new FileInputStream(file));
			properties = new Properties();
			properties.load(in);
			MixAll.properties2Object(properties, namesrvConfig);
			MixAll.properties2Object(properties, nettyServerConfig);

			namesrvConfig.setConfigStorePath(file);

			System.out.printf("load config properties file OK, %s%n", file);
			in.close();
		}
	}

	if (commandLine.hasOption('p')) {
		// -p option: only print the configuration properties, then exit without starting
		InternalLogger console = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_CONSOLE_NAME);
		MixAll.printObjectProperties(console, namesrvConfig);
		MixAll.printObjectProperties(console, nettyServerConfig);
		System.exit(0);
	}

	MixAll.properties2Object(ServerUtil.commandLine2Properties(commandLine), namesrvConfig);

	if (null == namesrvConfig.getRocketmqHome()) {
		System.out.printf("Please set the %s variable in your environment to match the location of the RocketMQ installation%n", MixAll.ROCKETMQ_HOME_ENV);
		System.exit(-2);
	}

	LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
	JoranConfigurator configurator = new JoranConfigurator();
	configurator.setContext(lc);
	lc.reset();
	configurator.doConfigure(namesrvConfig.getRocketmqHome() + "/conf/logback_namesrv.xml");

	log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);

	MixAll.printObjectProperties(log, namesrvConfig);
	MixAll.printObjectProperties(log, nettyServerConfig);

	final NamesrvController controller = new NamesrvController(namesrvConfig, nettyServerConfig);

	// remember all configs to prevent discard
	controller.getConfiguration().registerConfig(properties);

	return controller;
}
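
The precedence that createNamesrvController applies (a hard-coded default, then the -c properties file, then command-line values) can be sketched in isolation. This is a minimal sketch, not the RocketMQ implementation: ConfigLayeringSketch, ServerConfig, applyProperties and load are hypothetical names, the two real config classes are merged into one for brevity, and the assumption that MixAll.properties2Object maps property keys onto setters by reflection is not shown in the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.util.Properties;

final class ConfigLayeringSketch {

    /** Hypothetical stand-in for NamesrvConfig/NettyServerConfig, reduced to two fields. */
    public static final class ServerConfig {
        private int listenPort = 8888; // arbitrary class default, overridden below
        private String rocketmqHome;

        public void setListenPort(int listenPort) { this.listenPort = listenPort; }
        public int getListenPort() { return listenPort; }
        public void setRocketmqHome(String rocketmqHome) { this.rocketmqHome = rocketmqHome; }
        public String getRocketmqHome() { return rocketmqHome; }
    }

    /** Copies each property onto a matching setXxx(int|String) method; unknown keys are ignored. */
    static void applyProperties(Properties props, Object target) {
        for (Method m : target.getClass().getMethods()) {
            String name = m.getName();
            if (!name.startsWith("set") || name.length() < 4 || m.getParameterCount() != 1) {
                continue;
            }
            // setListenPort -> listenPort
            String key = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            String value = props.getProperty(key);
            if (value == null) {
                continue;
            }
            Class<?> type = m.getParameterTypes()[0];
            try {
                if (type == int.class) {
                    m.invoke(target, Integer.parseInt(value));
                } else if (type == String.class) {
                    m.invoke(target, value);
                }
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot set " + key, e);
            }
        }
    }

    /** Later layers win: default port, then file content (may be null), then command line. */
    static ServerConfig load(String fileContent, Properties commandLine) {
        ServerConfig config = new ServerConfig();
        config.setListenPort(9876); // 1. default set in code
        if (fileContent != null) {  // 2. what the -c file would contain
            Properties fileProps = new Properties();
            try {
                fileProps.load(new StringReader(fileContent));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            applyProperties(fileProps, config);
        }
        applyProperties(commandLine, config); // 3. command-line values
        return config;
    }
}
```

Calling load(null, new Properties()) leaves the port at 9876, while a file containing listenPort=9877 moves it to 9877 unless the command-line properties override it again.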

Initialization

NamesrvController is responsible for initializing the NameServer and starting the Netty server.
org.apache.rocketmq.namesrv.NamesrvController#initialize

public boolean initialize() {

	// Load the KV configuration
	this.kvConfigManager.load();

	// Create the Netty server
	this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.brokerHousekeepingService);

	this.remotingExecutor =
		Executors.newFixedThreadPool(nettyServerConfig.getServerWorkerThreads(), new ThreadFactoryImpl("RemotingExecutorThread_"));

	// Register the default processor, DefaultRequestProcessor; every request is handled by its processRequest method.
	this.registerProcessor();

	// After an initial 5 s delay, scan the brokers every 10 s and remove inactive ones
	this.scheduledExecutorService.scheduleAtFixedRate(NamesrvController.this.routeInfoManager::scanNotActiveBroker, 5, 10, TimeUnit.SECONDS);

	// After an initial 1 min delay, print the KV configuration every 10 min
	this.scheduledExecutorService.scheduleAtFixedRate(NamesrvController.this.kvConfigManager::printAllPeriodically, 1, 10, TimeUnit.MINUTES);

	if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
		// Register a listener to reload SslContext
		try {
			// Watch the certificate and key files for changes
			fileWatchService = new FileWatchService(
				new String[] {
					TlsSystemConfig.tlsServerCertPath,
					TlsSystemConfig.tlsServerKeyPath,
					TlsSystemConfig.tlsServerTrustCertPath
					},
				new FileWatchService.Listener() {
					boolean certChanged, keyChanged = false;
					@Override
					public void onChanged(String path) {
						if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
							log.info("The trust certificate changed, reload the ssl context");
							reloadServerSslContext();
						}
						if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
							certChanged = true;
						}
						if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
							keyChanged = true;
						}
						if (certChanged && keyChanged) {
							log.info("The certificate and private key changed, reload the ssl context");
							certChanged = keyChanged = false;
							reloadServerSslContext();
						}
					}
					private void reloadServerSslContext() {
						((NettyRemotingServer) remotingServer).loadSslContext();
					}
				});
		} catch (Exception e) {
			log.warn("FileWatchService created error, can't load the certificate dynamically");
		}
	}

	return true;
}

Starting the Netty server.
org.apache.rocketmq.namesrv.NamesrvController#start

public void start() throws Exception {
	// Start the Netty server
	this.remotingServer.start();

	if (this.fileWatchService != null) {
		this.fileWatchService.start();
	}
}

Broker registration and discovery

The NameServer stores its information in the following structures:

private final HashMap<String/* topic */, Map<String /* brokerName */ , QueueData>> topicQueueTable;
private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;
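  • topicQueueTable: per-topic message queue routing information; producers load-balance across this table when sending messages
  • brokerAddrTable: basic broker information, including brokerName, the name of the cluster it belongs to, and the master and slave broker addresses
  • clusterAddrTable: broker cluster information; stores the names of all brokers in each cluster
  • brokerLiveTable: broker liveness information; the NameServer replaces the entry every time it receives a heartbeat
  • filterServerTable: the list of FilterServers on each broker, used for class-mode message filtering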

The NameServer is implemented entirely in memory and does not persist routing information; persistence is left to the brokers.
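
To make the relationship between the tables concrete, here is a small sketch of how a topic lookup joins topicQueueTable with brokerAddrTable. It is a simplification under stated assumptions: RouteTablesSketch, register and masterAddressesFor are hypothetical names, QueueData is reduced to a queue count and BrokerData to its brokerId-to-address map, and every call to register also records the topic (the real NameServer only does that for a master).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RouteTablesSketch {
    // topic -> (brokerName -> queue count); the real value type is QueueData
    final Map<String, Map<String, Integer>> topicQueueTable = new HashMap<>();
    // brokerName -> (brokerId -> address); the real value type is BrokerData
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();
    // clusterName -> broker names
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();

    /** Records one broker and one topic it serves in all three tables. */
    void register(String cluster, String brokerName, long brokerId,
                  String addr, String topic, int queues) {
        clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
        brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>()).put(brokerId, addr);
        topicQueueTable.computeIfAbsent(topic, k -> new HashMap<>()).put(brokerName, queues);
    }

    /** Resolves a topic to the master (brokerId 0) address of each broker that serves it. */
    List<String> masterAddressesFor(String topic) {
        List<String> result = new ArrayList<>();
        Map<String, Integer> queues =
            topicQueueTable.getOrDefault(topic, Collections.<String, Integer>emptyMap());
        // Sort the broker names so the result order is deterministic.
        for (String brokerName : new TreeSet<>(queues.keySet())) {
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            String master = (addrs == null) ? null : addrs.get(0L);
            if (master != null) {
                result.add(master);
            }
        }
        return result;
    }
}
```

The topic table only stores broker names, so every lookup goes through brokerAddrTable to reach an address; that indirection is what lets a master address change without touching the topic entries.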

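Handling a broker registration request

When a REGISTER_BROKER request arrives, the call eventually reaches RouteInfoManager.registerBroker. Once registration completes, the NameServer returns to the broker the address of the master broker and the master's HA service address; as the code below shows, these two fields are only filled in when the registering broker is not itself the master.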

public RegisterBrokerResult registerBroker(
	final String clusterName,
	final String brokerAddr,
	final String brokerName,
	final long brokerId,
	final String haServerAddr,
	final TopicConfigSerializeWrapper topicConfigWrapper,
	final List<String> filterServerList,
	final Channel channel) {
	RegisterBrokerResult result = new RegisterBrokerResult();
	try {
		try {
			this.lock.writeLock().lockInterruptibly();
			/**
                 * broker-a.properties
                 *
                 * brokerClusterName=DefaultCluster
                 * brokerName=broker-a
                 * brokerId=0
                 */
			// clusterAddrTable: key is the cluster name, value is the set of broker names
			// If the cluster name is not yet in the map, create an empty Set and store it under clusterName,
			// then add brokerName to that Set
			Set<String> brokerNames = this.clusterAddrTable.computeIfAbsent(clusterName, k -> new HashSet<>());
			brokerNames.add(brokerName);

			boolean registerFirst = false;

			// Several brokers with different brokerIds can share the same brokerName;
			// the brokerId distinguishes the master from the slaves
			BrokerData brokerData = this.brokerAddrTable.get(brokerName);
			if (null == brokerData) {
				// First registration
				registerFirst = true;
				brokerData = new BrokerData(clusterName, brokerName, new HashMap<>());
				// brokerAddrTable holds basic broker information:
				// key is brokerName, value holds brokerName, cluster name, and master/slave broker addresses
				this.brokerAddrTable.put(brokerName, brokerData);
			}
			// key is brokerId, value is the broker address
			Map<Long, String> brokerAddrsMap = brokerData.getBrokerAddrs();
			//Switch slave to master: first remove <1, IP:PORT> in namesrv, then add <0, IP:PORT>
			//The same IP:PORT must only have one record in brokerAddrTable
			Iterator<Entry<Long, String>> it = brokerAddrsMap.entrySet().iterator();
			while (it.hasNext()) {
				Entry<Long, String> item = it.next();
				if (null != brokerAddr && brokerAddr.equals(item.getValue()) && brokerId != item.getKey()) {
					// The brokerId changed, so remove the old entry; a slave may have been promoted to master
					log.debug("remove entry {} from brokerData", item);
					it.remove();
				}
			}

			String oldAddr = brokerData.getBrokerAddrs().put(brokerId, brokerAddr);
			if (MixAll.MASTER_ID == brokerId) {
				log.info("cluster [{}] brokerName [{}] master address change from {} to {}",
						 brokerData.getCluster(), brokerData.getBrokerName(), oldAddr, brokerAddr);
			}

			registerFirst = registerFirst || (null == oldAddr);

			// If the topic configuration in the registration request is not null and this broker is the master (brokerId=0)
			if (null != topicConfigWrapper
				&& MixAll.MASTER_ID == brokerId) {
				// then use the data version stored in the NameServer for this broker to decide whether the NameServer-side topic configuration needs updating
				if (this.isBrokerTopicConfigChanged(brokerAddr, topicConfigWrapper.getDataVersion())
					|| registerFirst) {
					ConcurrentMap<String, TopicConfig> tcTable =
						topicConfigWrapper.getTopicConfigTable();
					if (tcTable != null) {
						for (Map.Entry<String, TopicConfig> entry : tcTable.entrySet()) {
							// Update the queue information for this broker
							this.createAndUpdateQueueData(brokerName, entry.getValue());
						}
					}
				}
			}

			// brokerLiveTable stores broker heartbeat information: key is the broker address, value holds the last heartbeat time
			BrokerLiveInfo prevBrokerLiveInfo = this.brokerLiveTable.put(brokerAddr,
																		 new BrokerLiveInfo(
																			 System.currentTimeMillis(),
																			 topicConfigWrapper.getDataVersion(),
																			 channel,
																			 haServerAddr));
			if (null == prevBrokerLiveInfo) {
				log.info("new broker registered, {} HAServer: {}", brokerAddr, haServerAddr);
			}

			// If filterServerList is non-null: an empty list removes the entry, otherwise the list is stored under the broker address
			if (filterServerList != null) {
				if (filterServerList.isEmpty()) {
					this.filterServerTable.remove(brokerAddr);
				} else {
					this.filterServerTable.put(brokerAddr, filterServerList);
				}
			}

			// For a non-master broker, look up the master broker (brokerId=0) under the same brokerName
			if (MixAll.MASTER_ID != brokerId) {
				String masterAddr = brokerData.getBrokerAddrs().get(MixAll.MASTER_ID);
				if (masterAddr != null) {
					// Use the master address to fetch its BrokerLiveInfo from brokerLiveTable and read its HaServerAddr
					BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.get(masterAddr);
					if (brokerLiveInfo != null) {
						result.setHaServerAddr(brokerLiveInfo.getHaServerAddr());
						result.setMasterAddr(masterAddr);
					}
				}
			}
		} finally {
			this.lock.writeLock().unlock();
		}
	} catch (Exception e) {
		log.error("registerBroker Exception", e);
	}

	return result;
}
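
The slave-to-master switch inside registerBroker is easy to miss in the full listing, so the sketch below isolates it. It mirrors the iterator loop above on a plain map; BrokerAddrSwitchSketch and its register method are hypothetical names used only for illustration.

```java
import java.util.Iterator;
import java.util.Map;

final class BrokerAddrSwitchSketch {
    static final long MASTER_ID = 0L;

    /**
     * Registers addr under brokerId. Any entry that already holds the same
     * address under a different brokerId is removed first, so one IP:PORT
     * never appears twice. Returns the address previously stored under
     * brokerId, or null if there was none.
     */
    static String register(Map<Long, String> brokerAddrs, long brokerId, String addr) {
        Iterator<Map.Entry<Long, String>> it = brokerAddrs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, String> entry = it.next();
            if (addr != null && addr.equals(entry.getValue()) && brokerId != entry.getKey()) {
                // Same address, different id: the stale role entry goes away.
                it.remove();
            }
        }
        return brokerAddrs.put(brokerId, addr);
    }
}
```

If a slave at 10.0.0.2:10911 re-registers with brokerId 0, its old entry under brokerId 1 is dropped and the map ends up with a single entry for that address. Removing through the iterator, rather than calling remove on the map inside the loop, avoids a ConcurrentModificationException.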

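Fetching broker information and topic configuration by topic

On receiving a GET_ROUTEINTO_BY_TOPIC request, the NameServer indirectly calls RouteInfoManager.pickupTopicRouteData to collect the broker and topic information.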

public TopicRouteData pickupTopicRouteData(final String topic) {
	TopicRouteData topicRouteData = new TopicRouteData();
	boolean foundQueueData = false;
	boolean foundBrokerData = false;
	Set<String> brokerNameSet = new HashSet<>();
	List<BrokerData> brokerDataList = new LinkedList<>();
	topicRouteData.setBrokerDatas(brokerDataList);

	HashMap<String, List<String>> filterServerMap = new HashMap<>();
	topicRouteData.setFilterServerTable(filterServerMap);

	try {
		try {
			this.lock.readLock().lockInterruptibly();
			// topicQueueTable: key is the topic, value is queueDataMap
			// queueDataMap: key is brokerName, value is the queue information
			Map<String, QueueData> queueDataMap = this.topicQueueTable.get(topic);
			if (queueDataMap != null) {
				topicRouteData.setQueueDatas(new ArrayList<>(queueDataMap.values()));
				foundQueueData = true;

				brokerNameSet.addAll(queueDataMap.keySet());

				for (String brokerName : brokerNameSet) {
					BrokerData brokerData = this.brokerAddrTable.get(brokerName);
					if (null != brokerData) {
						BrokerData brokerDataClone = new BrokerData(brokerData.getCluster(), brokerData.getBrokerName(), (HashMap<Long, String>) brokerData
																	.getBrokerAddrs().clone());
						brokerDataList.add(brokerDataClone);
						foundBrokerData = true;

						// skip if filter server table is empty
						if (!filterServerTable.isEmpty()) {
							for (final String brokerAddr : brokerDataClone.getBrokerAddrs().values()) {
								List<String> filterServerList = this.filterServerTable.get(brokerAddr);

								// only add filter server list when not null
								if (filterServerList != null) {
									filterServerMap.put(brokerAddr, filterServerList);
								}
							}
						}
					}
				}
			}
		} finally {
			this.lock.readLock().unlock();
		}
	} catch (Exception e) {
		log.error("pickupTopicRouteData Exception", e);
	}

	log.debug("pickupTopicRouteData {} {}", topic, topicRouteData);

	if (foundBrokerData && foundQueueData) {
		return topicRouteData;
	}

	return null;
}

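Each broker sends a heartbeat to the NameServer every 30 s, and every heartbeat updates that broker's state. Producers also need the broker state when they send messages, so it is read very frequently. The state is therefore updated often and read even more often. To keep concurrency high, especially for producers sending messages, the routing tables are guarded by a read-write lock, which suits read-heavy, write-light workloads.

Each time the NameServer receives a heartbeat, it updates the broker's state in brokerLiveTable along with the routing tables (topicQueueTable, brokerAddrTable, brokerLiveTable, filterServerTable). These updates take the write side of the read-write lock, which is finer-grained than a single exclusive lock: many producers can read concurrently, which keeps message sending highly concurrent. The NameServer handles only one broker heartbeat at a time, though, so heartbeat requests are executed serially. This is a classic use of a read-write lock.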
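
The read-write lock pattern described here can be shown on a single table. This is a minimal sketch rather than RouteInfoManager itself: LockedRouteTableSketch, onHeartbeat and lastSeen are hypothetical names, the value is reduced to a timestamp, and plain lock() is used where the listings above call lockInterruptibly().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LockedRouteTableSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // brokerAddr -> time of the last heartbeat; a plain HashMap is safe
    // here only because every access goes through the lock.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    /** Write path (broker heartbeat): exclusive, so heartbeats are applied one at a time. */
    void onHeartbeat(String brokerAddr, long nowMillis) {
        lock.writeLock().lock();
        try {
            lastHeartbeat.put(brokerAddr, nowMillis);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Read path (route lookup): shared, so any number of readers may hold the lock together. */
    Long lastSeen(String brokerAddr) {
        lock.readLock().lock();
        try {
            return lastHeartbeat.get(brokerAddr);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The sketch acquires the lock before entering the try block, the usual idiom, so that unlock in finally only runs when the lock is actually held.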

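Route eviction mechanism

Each broker sends a heartbeat to the NameServer every 30 s. The heartbeat carries the brokerId, the broker address, the broker name, the name of the cluster the broker belongs to, and the list of FilterServers associated with the broker.

But if a broker crashes, the NameServer stops receiving its heartbeats. How does the NameServer evict such failed brokers? Every 10 s it scans brokerLiveTable; if the lastUpdateTimestamp of a BrokerLiveInfo is more than 120 s behind the current time, the broker is considered dead. The NameServer then removes the broker, closes the connection to it, and updates topicQueueTable, brokerAddrTable, brokerLiveTable and filterServerTable.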

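RocketMQ has two triggers for removing routing information:

  • The NameServer periodically scans brokerLiveTable and compares the time of the last heartbeat with the current system time; if the gap exceeds 120 s, the broker is removed.
  • When a broker shuts down normally, it issues an unregisterBroker request.

Both paths remove routing information in the same way: they delete everything related to that broker from the routing tables.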

public int scanNotActiveBroker() {
	int removeCount = 0;
	Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
	while (it.hasNext()) {
		Entry<String, BrokerLiveInfo> next = it.next();
		long last = next.getValue().getLastUpdateTimestamp();
		// No heartbeat for more than 120 s: evict the broker
		if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
			RemotingUtil.closeChannel(next.getValue().getChannel());
			it.remove();
			log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
			this.onChannelDestroy(next.getKey(), next.getValue().getChannel());

			removeCount++;
		}
	}

	return removeCount;
}
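
The expiry rule can be exercised without a running NameServer by passing the clock in as a parameter. The following is a sketch under stated assumptions: HeartbeatExpirySketch, heartbeat and scan are hypothetical names, BrokerLiveInfo is reduced to its timestamp, and closing the channel and cleaning the other routing tables are left out.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class HeartbeatExpirySketch {
    // 120 s, the expiry time stated in the text
    static final long EXPIRY_MILLIS = 120_000L;

    // brokerAddr -> timestamp of the last heartbeat, in milliseconds
    final Map<String, Long> lastUpdate = new HashMap<>();

    /** Records a heartbeat received at time nowMillis. */
    void heartbeat(String brokerAddr, long nowMillis) {
        lastUpdate.put(brokerAddr, nowMillis);
    }

    /** Removes every broker whose last heartbeat is more than 120 s old; returns how many were removed. */
    int scan(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastUpdate.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() + EXPIRY_MILLIS < nowMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Because the comparison is strict, a broker whose last heartbeat is exactly 120 s old survives the scan. With the scan running every 10 s, a crashed broker should be evicted somewhere between 120 s and roughly 130 s after its last heartbeat.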