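The previous post walked through the source of neutron's WSGI application. This post covers another core piece, RPC, and along the way looks at the concurrency model neutron uses.
We pick up in the same code as last time: once the WSGI server has been started, the rpc_workers are started.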
neutron/server/wsgi_eventlet.py:
def eventlet_wsgi_server():
    neutron_api = service.serve_wsgi(service.NeutronApiService)
    start_api_and_rpc_workers(neutron_api)
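As the file name suggests, neutron's default concurrency model is built on eventlet. eventlet is a concurrent networking library; underneath it relies mainly on epoll for non-blocking I/O, and it provides coroutines. eventlet itself is not the focus here and deserves a post of its own. For now it is enough to know that it is what provides the concurrency.
A word on the model itself: neutron combines multiple processes with a GreenPool. The WSGI app and RPC are each run in their own forked child processes, and inside each child an eventlet GreenPool is used to raise throughput. You can think of it as a thread pool, except that the "threads" are GreenThreads, i.e. coroutines. The rest of this post goes through the process in detail.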
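To make the process-plus-pool idea concrete, here is a minimal sketch that uses only the Python standard library. It is not neutron's code: a ThreadPoolExecutor stands in for eventlet's GreenPool, and the names handle and run_worker are hypothetical. The parent forks one child, the child serves its requests through a pool, and the parent only waits.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    # Stand-in for handling one API or RPC request.
    return request * 2


def run_worker(requests, pool_size=4):
    """Fork one child that serves `requests` through a small pool.

    Returns the child's results, read back over a pipe.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: the pool provides concurrency inside the worker.
        # (Assumption: a thread pool replaces eventlet's GreenPool here.)
        status = 1
        try:
            os.close(read_fd)
            with ThreadPoolExecutor(max_workers=pool_size) as pool:
                results = list(pool.map(handle, requests))
            os.write(write_fd, ",".join(map(str, results)).encode())
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code
    # Parent process: it does no work itself, it only waits for the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        payload = reader.read().decode()
    os.waitpid(pid, 0)
    return [int(item) for item in payload.split(",") if item]
```

Calling run_worker([1, 2, 3]) forks a child, lets the pool process the three requests, and returns the doubled values to the parent in order.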
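The classes involved in starting the WSGI service are related as shown in the class diagram below:
We will go through the serve_wsgi function with that diagram in hand; reading the code next to the diagram makes the relationships easier to follow: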
neutron/service.py:
def serve_wsgi(cls):

    try:
        service = cls.create()
        service.start()
    except Exception:
        with excutils.save_and_reraise_exception():
            LOG.exception(_LE('Unrecoverable error: please check log '
                              'for details.'))

    return service
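This creates a NeutronApiService object. The class diagram shows that this class is a subclass of WsgiService, so the service is started through WsgiService's start method:
    service = cls.create()
    service.start()
The start method uses the Loader from oslo_service.wsgi to load the WSGI app. That Loader in turn uses paste.deploy, which was covered in the previous post, to load the app.
Next, the start method of the parent class WsgiService: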
neutron/service.py:
class WsgiService(object):
    """Base class for WSGI based services.

    For each api you define, you must also define these flags:
    :<api>_listen: The address on which to listen
    :<api>_listen_port: The port on which to listen

    """

    def __init__(self, app_name):
        self.app_name = app_name
        self.wsgi_app = None

    def start(self):
        self.wsgi_app = _run_wsgi(self.app_name)
The start method calls _run_wsgi, which in turn hands the loaded app to run_wsgi_app:
def run_wsgi_app(app):
    server = wsgi.Server("Neutron")
    server.start(app, cfg.CONF.bind_port, cfg.CONF.bind_host,
                 workers=_get_api_workers())
    LOG.info(_LI("Neutron service started, listening on %(host)s:%(port)s"),
             {'host': cfg.CONF.bind_host, 'port': cfg.CONF.bind_port})
    return server
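Together with the class diagram, this shows that WsgiService ends up creating a neutron.wsgi.Server object. Internally this object uses an eventlet.GreenPool, a pool of GreenThreads. Its start method is then called.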
neutron/wsgi.py:
def start(self, application, port, host='0.0.0.0', workers=0):
    """Run a WSGI server with the given application."""
    self._host = host
    self._port = port
    backlog = CONF.backlog

    self._socket = self._get_socket(self._host,
                                    self._port,
                                    backlog=backlog)

    self._launch(application, workers)

def _launch(self, application, workers=0):
    service = WorkerService(self, application, self.disable_ssl)
    if workers < 1:
        # The API service should run in the current process.
        self._server = service
        # Dump the initial option values
        cfg.CONF.log_opt_values(LOG, logging.DEBUG)
        service.start()
        systemd.notify_once()
    else:
        # dispose the whole pool before os.fork, otherwise there will
        # be shared DB connections in child processes which may cause
        # DB errors.
        api.dispose()
        # The API service runs in a number of child processes.
        # Minimize the cost of checking for child exit by extending the
        # wait interval past the default of 0.01s.
        self._server = common_service.ProcessLauncher(cfg.CONF,
                                                      wait_interval=1.0)
        self._server.launch_service(service, workers=workers)
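- From the class diagram and the _launch code, the Server object wraps the application in a WorkerService and then uses the ProcessLauncher provided by oslo_service.service to start that WorkerService.
Wrapping the application in a WorkerService:
    service = WorkerService(self, application, self.disable_ssl)
Running the service through ProcessLauncher:
    self._server = common_service.ProcessLauncher(cfg.CONF,
                                                  wait_interval=1.0)
    self._server.launch_service(service, workers=workers)
- The main job of ProcessLauncher is to fork as many child processes as the workers value asks for and to start the WorkerService in each of them.
The number of child processes running the service follows the workers argument: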
def launch_service(self, service, workers=1):
    """Launch a service with a given number of workers.

    :param service: a service to launch, must be an instance of
           :class:`oslo_service.service.ServiceBase`
    :param workers: a number of processes in which a service
           will be running
    """
    _check_service_base(service)
    wrap = ServiceWrapper(service, workers)

    LOG.info(_LI('Starting %d workers'), wrap.workers)
    while self.running and len(wrap.children) < wrap.workers:
        self._start_child(wrap)
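The loop at the end of launch_service is the heart of the launcher: keep forking until the number of children matches workers. The following is a minimal sketch of that idea using os.fork directly. MiniLauncher and its methods are hypothetical names, and, unlike the real ProcessLauncher, it takes a plain callable instead of a service object and does no signal handling or respawning.

```python
import os


class MiniLauncher:
    """Toy launcher: fork until the requested worker count is reached."""

    def __init__(self):
        self.children = set()

    def _start_child(self, target):
        pid = os.fork()
        if pid == 0:
            # Child: run the service body, then exit without returning
            # into the parent's call stack.
            status = 1
            try:
                target()
                status = 0
            finally:
                os._exit(status)
        # Parent: remember the child so the loop below can count it.
        self.children.add(pid)
        return pid

    def launch_service(self, target, workers=1):
        # Same shape as the while loop in launch_service above.
        while len(self.children) < workers:
            self._start_child(target)

    def wait(self):
        """Reap every child and return a {pid: exit_code} mapping."""
        codes = {}
        for pid in sorted(self.children):
            _, status = os.waitpid(pid, 0)
            codes[pid] = os.WEXITSTATUS(status)
        self.children.clear()
        return codes
```

With workers=3 the loop runs three times, and wait() then collects three exit codes, one per child.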
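- During startup, WorkerService uses the Server object's GreenPool to spawn a GreenThread that calls eventlet.wsgi.server to run our app. This is what finally brings the WSGI app up and exposes the RESTful API.
The start method calls pool.spawn on self._service, which is the Server object, to run the Server's _run method:
neutron/wsgi.py: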
class WorkerService(worker.NeutronWorker):

    def start(self):
        super(WorkerService, self).start()
        # When api worker is stopped it kills the eventlet wsgi server which
        # internally closes the wsgi server socket object. This server socket
        # object becomes not usable which leads to "Bad file descriptor"
        # errors on service restart.
        # Duplicate a socket object to keep a file descriptor usable.
        dup_sock = self._service._socket.dup()
        if CONF.use_ssl and not self._disable_ssl:
            dup_sock = sslutils.wrap(CONF, dup_sock)
        self._server = self._service.pool.spawn(self._service._run,
                                                self._application,
                                                dup_sock)
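The comment in start explains why the socket is duplicated before it is handed to the WSGI server. The effect can be reproduced with the standard socket module alone. The sketch below is an illustration under that assumption, with hypothetical function names; it shows that closing the duplicate leaves the original listening socket usable.

```python
import socket


def listen_and_dup():
    """Return (original, duplicate) sockets that share one listening port."""
    original = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    original.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    original.listen(5)
    original.settimeout(5)  # avoid blocking forever in accept()
    duplicate = original.dup()  # new file descriptor, same listening socket
    return original, duplicate


def original_survives_close():
    """Close the duplicate, then check the original still accepts clients."""
    original, duplicate = listen_and_dup()
    port = original.getsockname()[1]
    distinct_fds = original.fileno() != duplicate.fileno()
    duplicate.close()  # roughly what stopping the WSGI server would do
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    conn, _ = original.accept()
    conn.close()
    client.close()
    original.close()
    return distinct_fds
```

Because the server only ever owns the duplicate, a later restart can call dup() on the untouched original again.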
self._service._run is the Server's _run method:
def _run(self, application, socket):
    """Start a WSGI server in a new green thread."""
    eventlet.wsgi.server(socket, application,
                         max_size=self.num_threads,
                         log=LOG,
                         keepalive=CONF.wsgi_keep_alive,
                         socket_timeout=self.client_socket_timeout)
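- By default workers is set to 1, so one child process is created to serve the RESTful API, and the eventlet.wsgi.server inside that child ends up running in a GreenThread.
From the WSGI server startup above we have learned that neutron runs its services with processes plus a GreenPool, and the RPC service described next uses the same architecture. We have also seen that the key object, ProcessLauncher, starts a service by creating processes. A service started by ProcessLauncher must be a subclass of oslo_service.service.ServiceBase and must implement the start method.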
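The contract between launcher and service can be sketched with an abstract base class. This is an illustration, not the oslo.service source: ServiceBaseSketch, EchoService and check_service_base are hypothetical names, and the set of required methods (start, stop, wait, reset) is an assumption about what such an interface looks like.

```python
import abc


class ServiceBaseSketch(abc.ABC):
    """Hypothetical stand-in for the interface a launcher expects."""

    @abc.abstractmethod
    def start(self):
        """Begin serving."""

    @abc.abstractmethod
    def stop(self):
        """Stop serving."""

    @abc.abstractmethod
    def wait(self):
        """Block until the service has finished."""

    @abc.abstractmethod
    def reset(self):
        """Reload state, for example after a configuration change."""


class EchoService(ServiceBaseSketch):
    """Smallest possible service: it only records what was called."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")

    def wait(self):
        self.events.append("wait")

    def reset(self):
        self.events.append("reset")


def check_service_base(service):
    # Same idea as the _check_service_base call in launch_service:
    # reject anything that does not implement the expected interface.
    if not isinstance(service, ServiceBaseSketch):
        raise TypeError("service must implement ServiceBaseSketch")
    return service
```

Both WorkerService and the RPC workers discussed below fit this shape: they inherit the base behaviour and override start.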
With that background, the RPC startup path is easy to follow.
def start_api_and_rpc_workers(neutron_api):
    pool = eventlet.GreenPool()

    api_thread = pool.spawn(neutron_api.wait)

    try:
        neutron_rpc = service.serve_rpc()
    except NotImplementedError:
        LOG.info(_LI("RPC was already started in parent process by "
                     "plugin."))
    else:
        rpc_thread = pool.spawn(neutron_rpc.wait)

        plugin_workers = service.start_plugin_workers()
        for worker in plugin_workers:
            pool.spawn(worker.wait)

        # api and rpc should die together.  When one dies, kill the other.
        rpc_thread.link(lambda gt: api_thread.kill())
        api_thread.link(lambda gt: rpc_thread.kill())

    pool.waitall()
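The main process uses a GreenPool to run the wait methods of neutron_api and neutron_rpc, then calls waitall to wait for the GreenThreads to finish. In practice this means the main process does nothing but wait for the WSGI API and RPC child processes to exit. The link calls make sure that as soon as either the RPC or the API service dies, the other one is terminated as well.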
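The "die together" wiring can be illustrated without eventlet. In the sketch below, which is an analogy and not neutron's code, two futures stand in for the two GreenThreads, add_done_callback plays the role of link, and setting an Event plays the role of kill. The name run_linked and the lifetime parameters are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_linked(api_lifetime, rpc_lifetime):
    """Run two fake services; when either one ends, stop the other.

    Each lifetime is how long (in seconds) a service runs before it
    ends by itself. Returns how each service ended.
    """
    stops = {"api": threading.Event(), "rpc": threading.Event()}

    def serve(name, lifetime):
        # wait() returns True if the peer stopped us, False on timeout.
        stopped = stops[name].wait(timeout=lifetime)
        return "killed" if stopped else "exited"

    with ThreadPoolExecutor(max_workers=2) as pool:
        api = pool.submit(serve, "api", api_lifetime)
        rpc = pool.submit(serve, "rpc", rpc_lifetime)
        # Similar in spirit to rpc_thread.link(...) and api_thread.link(...).
        rpc.add_done_callback(lambda fut: stops["api"].set())
        api.add_done_callback(lambda fut: stops["rpc"].set())
        return {"api": api.result(), "rpc": rpc.result()}
```

If the API side ends after 0.05 seconds while the RPC side would have run for 30, the callback stops the RPC side almost immediately instead of letting it outlive its peer.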
The part to focus on is how neutron_rpc is created:
neutron/service.py:
def serve_rpc():
    plugin = manager.NeutronManager.get_plugin()
    service_plugins = (
        manager.NeutronManager.get_service_plugins().values())

    if cfg.CONF.rpc_workers < 1:
        cfg.CONF.set_override('rpc_workers', 1)

    # If 0 < rpc_workers then start_rpc_listeners would be called in a
    # subprocess and we cannot simply catch the NotImplementedError.  It is
    # simpler to check this up front by testing whether the plugin supports
    # multiple RPC workers.
    if not plugin.rpc_workers_supported():
        LOG.debug("Active plugin doesn't implement start_rpc_listeners")
        if 0 < cfg.CONF.rpc_workers:
            LOG.error(_LE("'rpc_workers = %d' ignored because "
                          "start_rpc_listeners is not implemented."),
                      cfg.CONF.rpc_workers)
        raise NotImplementedError()

    try:
        # passing service plugins only, because core plugin is among them
        rpc = RpcWorker(service_plugins)
        # dispose the whole pool before os.fork, otherwise there will
        # be shared DB connections in child processes which may cause
        # DB errors.
        LOG.debug('using launcher for rpc, workers=%s', cfg.CONF.rpc_workers)
        session.dispose()
        launcher = common_service.ProcessLauncher(cfg.CONF, wait_interval=1.0)
        launcher.launch_service(rpc, workers=cfg.CONF.rpc_workers)
        if (cfg.CONF.rpc_state_report_workers > 0 and
                plugin.rpc_state_report_workers_supported()):
            rpc_state_rep = RpcReportsWorker([plugin])
            LOG.debug('using launcher for state reports rpc, workers=%s',
                      cfg.CONF.rpc_state_report_workers)
            launcher.launch_service(
                rpc_state_rep, workers=cfg.CONF.rpc_state_report_workers)

        return launcher
    except Exception:
        with excutils.save_and_reraise_exception():
            LOG.exception(_LE('Unrecoverable error: please check log for '
                              'details.'))
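    plugin = manager.NeutronManager.get_plugin()
NeutronManager also came up in the previous post. Its main job is to load and initialize the right plugins according to the configuration file, for example Ml2Plugin. Here its class method get_plugin() is called to obtain the configured core plugin; going through the class method keeps NeutronManager a singleton. In this setup plugin is the Ml2Plugin instance.
    service_plugins = (
        manager.NeutronManager.get_service_plugins().values())
Next, all service_plugins are fetched. This was also covered in the previous post; in the end the following six plugin instances are returned: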
'neutron.plugins.ml2.plugin.Ml2Plugin'
'neutron.services.network_ip_availability.plugin.NetworkIPAvailabilityPlugin'
'neutron.services.auto_allocate.plugin.Plugin'
'neutron.services.timestamp.timestamp_plugin.TimeStampPlugin'
'neutron.services.tag.tag_plugin.TagPlugin'
'neutron.services.l3_router.l3_router_plugin.L3RouterPlugin'
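    if cfg.CONF.rpc_workers < 1:
        cfg.CONF.set_override('rpc_workers', 1)
Then the configured number of rpc_workers is read, which defaults to 1; any value below 1 is raised to 1. As the analysis above showed, this value decides how many child processes ProcessLauncher starts to provide the service.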
    if not plugin.rpc_workers_supported():
        LOG.debug("Active plugin doesn't implement start_rpc_listeners")
        if 0 < cfg.CONF.rpc_workers:
            LOG.error(_LE("'rpc_workers = %d' ignored because "
                          "start_rpc_listeners is not implemented."),
                      cfg.CONF.rpc_workers)
        raise NotImplementedError()
Then it checks whether the core plugin (Ml2Plugin here) implements the start_rpc_listeners method. If it does not, an error is logged and NotImplementedError is raised, which is the exception start_api_and_rpc_workers catches.
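One way such an "is this method really implemented?" check can work is to compare the method on the plugin's class with the one on the base class. The sketch below illustrates that technique with hypothetical classes (PluginBase, RpcPlugin, PlainPlugin); it is an assumption about the approach, not a copy of neutron's plugin base class.

```python
class PluginBase:
    """Hypothetical base class standing in for the core plugin base."""

    def start_rpc_listeners(self):
        raise NotImplementedError()

    def rpc_workers_supported(self):
        # Supported only if a subclass replaced the base implementation.
        return (type(self).start_rpc_listeners
                is not PluginBase.start_rpc_listeners)


class RpcPlugin(PluginBase):
    def start_rpc_listeners(self):
        return ["rpc-server"]  # placeholder for real RPC server objects


class PlainPlugin(PluginBase):
    pass  # inherits the base method, so RPC workers are unsupported


def check_rpc_support(plugin):
    """Mirror the guard in serve_rpc: refuse plugins without RPC support."""
    if not plugin.rpc_workers_supported():
        raise NotImplementedError()
    return True
```

Checking up front avoids having to catch NotImplementedError from inside a forked child, which is the concern the comment in serve_rpc raises.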
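    rpc = RpcWorker(service_plugins)
Then an RpcWorker is created. It plays the same role as the neutron.wsgi.WorkerService discussed above: it also derives from NeutronWorker, a subclass of ServiceBase, and overrides start so that it can be handed to ProcessLauncher. Its start method is therefore the key code for bringing the service up: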
neutron/service.py:
class RpcWorker(worker.NeutronWorker):
    """Wraps a worker to be handled by ProcessLauncher"""
    start_listeners_method = 'start_rpc_listeners'

    def __init__(self, plugins):
        self._plugins = plugins
        self._servers = []

    def start(self):
        super(RpcWorker, self).start()
        for plugin in self._plugins:
            if hasattr(plugin, self.start_listeners_method):
                try:
                    servers = getattr(plugin, self.start_listeners_method)()
                except NotImplementedError:
                    continue
                self._servers.extend(servers)
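As the code shows, start iterates over all service_plugins, i.e. the six plugins listed above, checks whether each one implements a "start_rpc_listeners" method, and calls it if so. That is all RpcWorker does. The actual RPC functionality lives in the plugins' start_rpc_listeners methods, which provide the service mainly by consuming messages from MQ queues with specific names.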
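The dispatch pattern in RpcWorker.start can be tried out in isolation. The sketch below reproduces it with hypothetical plugins whose listener names are made up for illustration: one plugin implements the hook, one lacks it, and one raises NotImplementedError.

```python
class RpcWorkerSketch:
    """Stripped-down version of the loop in RpcWorker.start."""

    start_listeners_method = "start_rpc_listeners"

    def __init__(self, plugins):
        self._plugins = plugins
        self._servers = []

    def start(self):
        for plugin in self._plugins:
            method = getattr(plugin, self.start_listeners_method, None)
            if method is None:
                continue  # plugin has no RPC hook at all
            try:
                servers = method()
            except NotImplementedError:
                continue  # hook exists but is not implemented
            self._servers.extend(servers)
        return self._servers


class CorePlugin:
    def start_rpc_listeners(self):
        return ["core-listener"]  # placeholder for real server objects


class TagLikePlugin:
    pass  # no start_rpc_listeners attribute


class StubPlugin:
    def start_rpc_listeners(self):
        raise NotImplementedError()


class RouterLikePlugin:
    def start_rpc_listeners(self):
        return ["router-listener"]
```

Starting a worker built from these four plugins collects listeners only from the two that actually implement the hook, in plugin order.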
    launcher = common_service.ProcessLauncher(cfg.CONF, wait_interval=1.0)
    launcher.launch_service(rpc, workers=cfg.CONF.rpc_workers)
ProcessLauncher thus creates rpc_workers child processes (1 by default) to provide the RPC service, while the RPC functionality itself is left to each plugin's "start_rpc_listeners" method.
    if (cfg.CONF.rpc_state_report_workers > 0 and
            plugin.rpc_state_report_workers_supported()):
        rpc_state_rep = RpcReportsWorker([plugin])
        LOG.debug('using launcher for state reports rpc, workers=%s',
                  cfg.CONF.rpc_state_report_workers)
        launcher.launch_service(
            rpc_state_rep, workers=cfg.CONF.rpc_state_report_workers)
Then it checks whether rpc_state_report_workers is configured. If so, that many additional child processes are started to run RpcReportsWorker. This worker also derives from ServiceBase and overrides start, and the RPC functionality is again left to the plugin, this time through its 'start_rpc_state_reports_listener' method.
    plugin_workers = service.start_plugin_workers()
    for worker in plugin_workers:
        pool.spawn(worker.wait)

def start_plugin_workers():
    launchers = []
    # NOTE(twilson) get_service_plugins also returns the core plugin
    for plugin in manager.NeutronManager.get_unique_service_plugins():
        # TODO(twilson) Instead of defaulting here, come up with a good way to
        # share a common get_workers default between NeutronPluginBaseV2 and
        # ServicePluginBase
        for plugin_worker in getattr(plugin, 'get_workers', tuple)():
            print("Plugin start_worker", plugin, plugin_worker)
            launcher = common_service.ProcessLauncher(cfg.CONF)
            launcher.launch_service(plugin_worker)
            launchers.append(launcher)
    return launchers
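The last step calls the 'get_workers' method of every plugin. This method lets a plugin define its own ServiceBase subclasses to provide plugin-specific services. If a plugin does define any, they too are handed to ProcessLauncher, which creates a process to run each of them.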
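The expression getattr(plugin, 'get_workers', tuple)() is worth a closer look: when a plugin has no get_workers method, the default is the tuple type itself, and calling tuple() yields an empty tuple, so the inner loop simply does nothing. The following sketch demonstrates this with hypothetical plugin classes and worker names.

```python
def collect_workers(plugins):
    """Gather workers from every plugin that offers a get_workers hook."""
    workers = []
    for plugin in plugins:
        # tuple() returns (), so plugins without the hook add nothing.
        for worker in getattr(plugin, "get_workers", tuple)():
            workers.append(worker)
    return workers


class PluginWithWorkers:
    def get_workers(self):
        return ["periodic-sync-worker"]  # placeholder for worker objects


class PluginWithoutHook:
    pass
```

This keeps the calling code free of hasattr checks, at the cost of a default that is easy to misread at first glance.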
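At this point neutron as a whole is up. Both RPC and WSGI are wrapped in classes derived from ServiceBase and handed to ProcessLauncher, which creates the processes that run them, and hook methods make it easy for plugins to add the services they need. With the default configuration there are three child processes in the end, providing the WSGI API, rpc and rpc_state_reports services respectively.
The main process, which uses a GreenPool to wait for all child processes to exit:
neutron 1348
The three child processes, each providing a different service: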
neutron 3275 1348
neutron 3276 1348
neutron 3277 1348 0 16:03 ? 00:00:22 /usr/bin/python /usr/bin/neutron-server --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini --log-file=/var/log/neutron/neutron-server.log