Background on http frontends

-civetweb frontend

--thread per connection, requires a lot of threads
---qos or priority queuing would block frontend threads

-beast frontend

--boost::beast for http parsing
--boost::asio for async networking/io
--async for accepting connections and reading headers
---good model for qos - can queue requests without blocking threads
--synchronous call to process_request()
---thread per request, still need lots of threads

goal: scale requests independently of threads

-why boost::asio

--doesn't impose a threading model. io_service object is a reactor, call run() from any thread
--mature library, basis for C++ std::net library in Networking TS [1]
--the Extensible Asynchronous Model [2] provides several options for async primitives (callbacks, futures, coroutines)
--boost::asio::spawn() stackful coroutines: "enables programs to implement asynchronous logic in a synchronous manner" [3]
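
To make the "call run() from any thread" point concrete, here is a toy stand-in for io_service written with only the standard library. It is not Boost.Asio and does no real i/o. The names toy_io_service and serve() are hypothetical, and post()/run() only imitate the shape of the asio calls. The sketch just shows that the number of queued handlers (requests) is independent of the number of threads that drain them.

```cpp
#include <atomic>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an io_service: a locked queue of handlers.
// Hypothetical type for illustration only, not part of Boost.Asio.
class toy_io_service {
  std::mutex m;
  std::deque<std::function<void()>> handlers;
 public:
  // Queue a handler; it runs later on whichever thread calls run().
  void post(std::function<void()> h) {
    std::lock_guard<std::mutex> l(m);
    handlers.push_back(std::move(h));
  }
  // Run queued handlers on the calling thread until the queue is empty.
  void run() {
    for (;;) {
      std::function<void()> h;
      {
        std::lock_guard<std::mutex> l(m);
        if (handlers.empty()) {
          return;
        }
        h = std::move(handlers.front());
        handlers.pop_front();
      }
      h();  // invoked outside the lock so other threads keep draining
    }
  }
};

// Queue `requests` handlers, then drain them with `threads` threads.
// Returns how many handlers ran.
int serve(int requests, int threads) {
  toy_io_service io;
  std::atomic<int> handled{0};
  for (int i = 0; i < requests; ++i) {
    io.post([&handled] { ++handled; });
  }
  std::vector<std::thread> pool;
  for (int i = 0; i < threads; ++i) {
    pool.emplace_back([&io] { io.run(); });
  }
  for (auto& t : pool) {
    t.join();
  }
  return handled.load();
}
```

Calling serve(100, 2) and serve(100, 1) both handle all 100 queued handlers; only the number of threads calling run() changes.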

proposed librados interfaces for asio [4]

--header-only wrapper over librados c++ api
--conform to the Extensible Asynchronous Model, so support the same primitives - see unit tests for examples
--deeper Objecter integration work in progress by Adam Emerson [5]
--gives radosgw a unified interface for async operations over http and rados

async process_request() [6]

-add optional yield_context* argument to process_request()

-beast frontend passes one, civetweb passes nullptr

-any librados calls use new interface when given a yield_context

-requires the yield_context* to be passed everywhere in between

--but we can stash it in req_state to make it available to all ops

-getting started with the easy stuff

--rgw_get_system_obj()
--reading user objects for authentication
--reading bucket/bucket instance objects (common to most s3/swift ops)

-this process leaves a lot of gaps. for example, rgw_get_system_obj() is in tons of call paths without access to a yield_context

--(either outside process_request(), or just aren't hooked up yet)
--just passing 'nullptr' makes it impossible to tell the yield_context argument apart from rgw_get_system_obj()'s 4 other arguments that default to nullptr!
--that makes it impossible to reason about which call paths could run asynchronously

-measurable progress towards full asynchrony

--new vocabulary type 'optional_yield_context' with 'null_yield' for empty value
--null_yield designates a call site that is definitely synchronous
--makes it easy to audit the code and find the pieces that still need conversion
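
A minimal sketch of what such a vocabulary type could look like, assuming nothing beyond the description above. The yield_context struct here is an empty stand-in for boost::asio::yield_context so that the sketch compiles without Boost, and read_object() is a hypothetical operation, not an existing rgw function. The point is that the argument has no default, so each call site must say either "here is my yield context" or null_yield.

```cpp
#include <string>

// Hypothetical stand-in for boost::asio::yield_context, so the sketch
// compiles without Boost. The real type suspends and resumes a coroutine.
struct yield_context {};

// Tag type and constant for the empty value: the call must block.
struct null_yield_t {};
constexpr null_yield_t null_yield{};

// Either empty (synchronous) or a pointer to the caller's yield_context.
class optional_yield_context {
  yield_context* y = nullptr;
 public:
  optional_yield_context(null_yield_t) noexcept {}
  optional_yield_context(yield_context& ctx) noexcept : y(&ctx) {}

  // True when a yield_context is present, i.e. the call may suspend.
  explicit operator bool() const noexcept { return y != nullptr; }

  // Only valid when the object is non-empty.
  yield_context& get_yield_context() const { return *y; }
};

// Hypothetical operation. There is no default argument, so a grep for
// null_yield finds every call site that is still synchronous.
std::string read_object(const std::string& oid, optional_yield_context y) {
  if (y) {
    // an async implementation would suspend the coroutine here
    return "async:" + oid;
  }
  // a sync implementation would block the calling thread here
  return "sync:" + oid;
}
```

A converted call site passes its yield_context (read_object("user.foo", yc)), while an unconverted one has to spell out read_object("user.foo", null_yield), which is what makes the remaining work easy to audit.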

-fighting regression once we're close

--have librados calls log warnings when called synchronously from a beast frontend thread (using a thread_local flag)
--scan those logs in teuthology runs to flag failures
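
The thread_local check could be sketched roughly as follows, using only the standard library. All names here (is_asio_thread, blocking_operate(), the warnings vector standing in for the log, and demo()) are assumptions for illustration, not existing librados or rgw interfaces.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Set to true by each thread that runs the beast frontend's io loop.
// Every other thread keeps the default of false.
thread_local bool is_asio_thread = false;

// Hypothetical stand-in for the log that teuthology would scan.
std::mutex log_mutex;
std::vector<std::string> warnings;

// Hypothetical wrapper around a blocking librados call.
void blocking_operate(const std::string& what) {
  if (is_asio_thread) {
    std::lock_guard<std::mutex> l(log_mutex);
    warnings.push_back("WARNING: blocking call on frontend thread: " + what);
  }
  // ... the blocking call itself would happen here ...
}

// Makes one blocking call from a flagged frontend thread and one from an
// unflagged background thread, then returns the number of warnings logged.
std::size_t demo() {
  std::thread frontend([] {
    is_asio_thread = true;
    blocking_operate("read bucket");
  });
  std::thread background([] { blocking_operate("gc"); });
  frontend.join();
  background.join();
  std::lock_guard<std::mutex> l(log_mutex);
  return warnings.size();
}
```

Only the call made on the flagged thread produces a warning, so background threads that are allowed to block stay quiet.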

and then?

-vastly reduce the number of frontend threads for beast

-consolidate other background threads

remaining work:

- RGWGetObj waits on AioCompletions - use AioThrottle from PutObj instead

- replace librados IoCtx::operate() calls with rgw_rados_operate() and optional_yield_context

- thread optional_yield_context all the way from beast frontend to rgw_rados_operate() calls

- some cls client calls use IoCtx::operate() directly

- block_while_resharding() sleeps on a condition variable

- no async interface for pool object listings with IoCtx::nobjects_begin()

- libcurl http requests for auth (Keystone and OPA)

[1] "C++ Technical Specification - Extensions for Networking"
http://cplusplus.github.io/networking-ts/draft.pdf
[2] "Library Foundations for Asynchronous Operations"
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3896.pdf
[3] Reference: boost::asio::spawn
http://www.boost.org/doc/libs/1_65_1/doc/html/boost_asio/reference/spawn.html
[4] "librados: add async interfaces for use with boost::asio"
https://github.com/ceph/ceph/pull/19054
[5] "osdc/Objecter: Boost.Asio (I object!)"
https://github.com/ceph/ceph/pull/16715
[6] work in progress branch:
https://github.com/cbodley/ceph/commits/wip-rgw-async-process-171120