I/O Completion Ports (IOCP) supported on Microsoft Windows platforms has two facets. It first allows I/O handles like file handles, socket handles, etc., to be associated with a completion port. Any async I/O completion event related to the I/O handle associated with the IOCP will get queued onto this completion port. This allows threads to wait on the IOCP for any completion events. The second facet is that we can create a I/O completion port that is not associated with any I/O handle. In this case, the IOCP is purely used as a mechanism for efficiently providing a thread-safe waitable queue technique. This technique is interesting and efficient. Using this technique, a pool of a few threads can achieve good scalability and performance for an application. Here is a small example. For instance, if you are implementing a HTTP server application, then you need to do the following mundane tasks apart from the protocol implementation:
引用
Create a client connection listen socket.
Once we get the client connection, use the client socket to communicate with the client to and fro.
3. Using Managed IOCP in .NET applications
Multithreaded systems are complex in the context that most problems will show up in real time production scenarios. To limit the possibility of such surprises while using Managed IOCP, I created a test application using which several aspects of the Managed IOCP library can be tested. Nevertheless, I look forward for any suggestions/corrections/inputs to improve this library and its demo application.
Before getting into the demo application, below is the sequence of steps that an application would typically perform while using the Managed IOCP library:
Create an instance of the ManagedIOCP class:
using Sonic.Net;
ManagedIOCP mIOCP = new ManagedIOCP();
The ManagedIOCP constructor takes one argument, concurrentThreads. This is an integer that specifies how many maximum concurrent active threads are allowed to process objects queued onto this instance of ManagedIOCP. I used a no argument constructor, which defaults to a maximum of one concurrent active thread.
From a thread that needs to wait on objects queued onto the ManagedIOCP instance, call the Register() method on the ManagedIOCP instance. This will return an instance of the IOCPHandle class. This is like native Win32 IOCP handle, using which the registered thread can wait on the arrival of objects onto the ManagedIOCP instance. This thread can use the Wait() method on the IOCPHandle object. The Wait() will indefinitely wait until it grabs an object queued onto the ManagedIOCP instance to which the calling thread is registered. It either comes out with an object, or an exception in case the ManagedIOCP instance is stopped (we will cover this later).
view plain copy to clipboard print ?
3.1. Advanced usage
Following are the advanced features of Managed IOCP that need to be used carefully.
Managed IOCP execution can be paused at runtime. When a Managed IOCP instance is paused, all the threads registered with this instance of Managed IOCP will stop processing the queued objects. Also, if the 'EnqueueOnPause' property of the ManagedIOCP instance is false (by default, it is false), then no thread will be able to dispatch new objects onto the Managed IOCP instance queue. Calling Dispatch on the ManagedIOCP instance will throw an exception in the Pause state. If the 'EnqueueOnPause' property is set to true, then threads can dispatch objects onto the queue, but you need to be careful while setting this property to true, as this will increase the number of pending objects in the queue, thus occupying more memory. Also, when the Managed IOCP instance is re-started, all the registered threads will suddenly start processing a huge number of objects thus creating greater hikes in the system resource utilization.
mIOCP.Pause();
Once paused, the ManagedIOCP instance can be re-started using the Run method.
mIOCP.Run();
The running status of the Managed IOCP instance can be obtained using the IsRunning property:
bool bIsRunning = mIOCP.IsRunning;
You can retrieve the System.Threading.Thread object of the thread associated with the IOCPHandle instance, from its property named 'OwningThread'.
3.2. Demo Application
I provided two demo applications with similar logic. The first is implemented using Managed IOCP, the other using native Win32 IOCP. These two demo applications perform the following steps:
Create a global static ManagedIOCP instance or native Win32 IOCP.
Create five threads.
Each thread will dispatch one integer value at a time to the ManagedIOCP instance or native Win32 IOCP until the specified number of objects are completed.
Start (creates a new set of five threads) and stop (closes the running threads) the object processing.
The Sonic.Net (ManagedIOCP) demo application additionally demonstrates the following features of Managed IOCP that are unavailable in the Win32 IOCP:
Pause and continue object processing during runtime.
Change concurrent threads at runtime.
Statistics like, Active Threads, Maximum Concurrent threads, Queued Objects Count and Running Status of Managed IOCP.
Below is the image showing both the demo applications after their first cycle of object processing:
Demo application results
As you can see in the above figure, Managed IOCP gives the same speed (slightly even better) as native Win32 IOCP. The goal of these two demo applications is _not_ to compare the speed or features of Win32 IOCP with that of Managed IOCP, but rather to highlight that Managed IOCP provides all the advantages of native Win32 IOCP (with additional features) but in a purely managed environment.
I tested these two demo applications on a single processor CPU and a dual processor CPU. The results are almost similar, in the sense the Managed IOCP is performing as good as (sometimes performing better than) native Win32 IOCP.
3.3. Source and demo application files
Below are the details of the files included in the article's Zip file:
Sonic.Net (folder) - I named this class library as Sonic.Net (Sonic stands for speed). The namespace is also specified as Sonic.Net. All the classes that I described in this article are defined within this namespace. The folder hierarchy is described below:
引用
The Assemblies folder contains the Sonic.Net.dll (contains the ObjectPool, Queue, ManagedIOCP, IOCPHandle, and ThreadPool classes), Sonic.Net Demo Application.exe (demo application showing the usage of ManagedIOCP and IOCPHandle classes), and Sonic.Net Console Demo.exe (console demo application showing the usage of ThreadPool and ObjectPool classes).
The Solution Files folder contains the VS.NET 2003 solution file for the Sonic.Net assembly project, Sonic.Net demo application WinForms project, and Sonic.Net console demo project.
The Sonic.Net folder contains the Sonic.Net assembly source code.
The Sonic.Net Console Demo folder contains the Sonic.Net console demo application source code. This demo shows the usage of the Managed IOCP ThreadPool, which is explained in my Managed I/O Completion Ports - Part 2 article. This demo uses a file that will be read by the ThreadPool threads. Please change the file path to a valid one on your system. The code below shows the portion in the code to change. This code is in the ManagedIOCPConsoleDemo.cs file.
The Sonic.Net Demo Application folder contains the Sonic.Net demo application source code.
Win32IOCPDemo (folder) - This folder contains the WinForms based demo application for demonstrating Win32 IOCP usage using PInvoke. When compiled, the Win32IOCPDemo.exe will be created in the Win32IOCPDemo/bin/debug or Win32IOCPDemo/bin/Release folder based on the current build configuration you selected. The default build configuration is set to Release mode.
4. Inside Managed IOCP
This section discusses the how and why part of the core logic that is used to implement Managed IOCP.
4.1. Waiting and retrieving objects in Managed IOCP
Managed IOCP provides a thread safe object dispatch and retrieval mechanism. This could have been achieved by a simple synchronized queue. But with synchronized queue, when a thread (thread-A) dispatches (enqueues) an object onto the queue, for another thread (thread-B) to retrieve that object, it has to continuously monitor the queue. This technique is inefficient as thread-B will be continuously monitoring the queue for arrival of objects, irrespective of whether the objects are present in the queue. This leads to heavy CPU utilization and thread switching in the application when multiple threads are monitoring the same queue, thus degrading the performance of the system.
Managed IOCP deals with this situation by attaching an auto reset event to each thread that wants to monitor the queue for objects and retrieve them. This is why any thread that wants to wait on a Managed IOCP queue and retrieve objects from it has to register with the Managed IOCP instance using its 'Register' method. The registered threads wait for the object arrival and retrieve them using the 'Wait' method of the IOCPHandle instance. The IOCPHandle instance contains an AutResetEvent that will be set by the Managed IOCP instance when any thread dispatches an object onto its queue. There is an interesting problem in this technique. Let us say that there are three threads, thread-A dispatching the objects, and thread-B and thread-C waiting on object arrival and retrieving them. Now, say if thread-A dispatches 10 objects in its slice of CPU time. Managed IOCP will set the AutoResetEvent of thread-B and thread-C, thus informing them of the new object arrival. Since it is an event, it does not have an indication of how many times it has been set. So if thread-B and thread-C just wake up on the event set and retrieve one object each from the queue and again waits on the event, there would be 8 more objects left over in the queue unattended. Also, this mechanism would waste the CPU slice given to thread-B and thread-C as they are trying to go into waiting mode after processing a single object from the Managed IOCP queue.
So in Managed IOCP, when thread-B and thread-C call the 'Wait' method on their respective IOCPHandle instances, the method first tries to retrieve an object from the Managed IOCP instance queue before waiting on its event. If it was able to successfully retrieve the object, it does not go into wait mode, rather it returns from the Wait object. This is efficient because there is no point for threads to wait on their event until there are objects to process in the queue. The beauty of this technique is that when there are no objects in the queue, the IOCPHandle instance Wait method will suspend the calling thread by waiting on its internal AutoResetEvent, which will be set again by the Managed IOCP instance 'Dispatch' method when thread-A dispatches more objects.
4.2. Compare-And-Swap (CAS) in Managed IOCP
CAS is a very familiar term in the software community, dealing with multi-threaded applications. It allows you to compare two values, and update one of them with a new value, all in a single atomic thread-safe operation. In Managed IOCP, when a thread successfully grabs an object from the IOCP queue, it is considered to be active. Before grabbing an available object from the queue, Managed IOCP checks if the number of currently active threads is less than the allowed maximum concurrent threads. In case the number of current active threads is equal to the maximum allowed concurrent threads, then Managed IOCP will block the thread, trying to receive the object from the IOCP queue. To do this, Managed IOCP has to follow the logical steps as mentioned below:
Get the new would-be value of active threads (current active threads + 1).
Compare it with the maximum allowed concurrent threads.
If new would-be value is <= the maximum number of allowed concurrent threads, then assign the would-be value to the active threads.
In the above logic, step-3 consists of two operations, comparison and assignment. If we perform these two operations separately in Managed IOCP, then for instance, thread-A and thread-B might both reach the conditional expression with the same would-be value for active threads. If this value is less than or equal to the maximum number of allowed concurrent threads, then the condition will pass for both the threads, and both of them will assign the same would-be value for the active threads. Though the active thread count may not increase in this scenario, the actual number of physically active threads will be more than the desired maximum number of concurrent threads, as in the above scenario both the threads think that they can be active.
So Managed IOCP performs this operation as shown below:
Gets the current value of active threads and stores it in a local variable.
Gets the new would-be value of active threads (current active threads + 1).
Compares it with the maximum number of allowed concurrent threads.
If new would-be value is <= the maximum number of allowed concurrent threads, then CAS(ref activethreads variable, would-be value of active threads, current value of active threads stored in a local variable in step 1). Come out of the method if the would-be value is greater than the maximum number of allowed concurrent threads.
If CAS returns false then go to step 1.
In the above logic, the CAS operation supported by the .NET framework (Interlocked.CompareExchange) is used to assign the new would-be value to active threads only if the original value of active threads has not been changed since the time we observed (stored in the local variable) it before proceeding to our compare and decide step. This way, though two threads might pass the decision in step-4, one of them will fail in the CAS operation thus not going into active mode. Below is the active threads increment method extracted from the ManagedIOCP class implementation:
I could have used a lock mechanism like Monitor for the entire duration of the active threads increment operation. But since this is a very frequent operation in Managed IOCP, it would lead to heavy lock contention, and will decrease the performance of the system/application in multi-CPU environments. This technique that I used in Managed IOCP is generally called lock-free technique, and is used heavily to build lock-free data structures in performance critical applications.
4.3. Concurrency management in Managed IOCP
Concurrency is one area that native Win32 IOCP excels in. It provides a mechanism where the maximum number of allowed concurrent threads can be set during its creation. It guarantees that at any given point of time, only the maximum allowed concurrent threads are running, and more importantly, it sees to it that _atleast_ the maximum allowed concurrent threads are _always_ notified/awakened to process completion events, if the number of threads using its IOCP handle is more than the maximum number of allowed concurrent threads.
Managed IOCP also provides the above two guarantees with more features like ability to modify the maximum number of allowed concurrent threads at runtime, which native Win32 IOCP does not provide. Managed IOCP provides this guarantee using the Compare-And-Swap (CAS) technique in its Wait mode, as described in the previous section (4.2). When a thread waits on its IOCPHandle instance to grab a Managed IOCP queue object, it first tries to become active by incrementing the active thread count using the CAS technique as mentioned in the previous section (4.2). It it fails to increment the number of active threads, it means that the number of current active threads is equal to the maximum number of allowed concurrent threads and the calling thread will go into Wait mode. You can see this in the code implementation of the IOCPHandle::Wait() method in ManagedIOCP.cs, in the attached source code ZIP file.
I could have used Win32 Semaphores to limit the maximum number of allowed concurrent threads. But it will defeat the whole purpose of Managed IOCP, being completely managed, as .NET 1.1 does not provide a Semaphore type. Also, I wanted this library to be as compatible as possible with the Mono .NET runtime. These are the reasons I did not explore the usage of semaphore for this feature. Maybe, I'll take a serious look at it if .NET 2.0 has a Semaphore object.
The second feature of IOCP as described in the beginning of this section is described in more detail in the next section (dispatching objects in Managed IOCP).
4.4. Dispatching objects in Managed IOCP
Managed IOCP maintains a queue of IOCPHandle objects that are waiting on it to receive objects. When an object is dispatched to it by any thread, it pops out the next item (a IOCPHandle object) in the queue. It then sets the AutoResetEvent of the IOCPHandle object that is popped out. Before doing that, Managed IOCP tries to evaluate whether the thread associated with the popped out IOCPHandle can be used to process the object. It does it by checking whether the thread is in waiting mode using its IOCPHandle instance's Wait method, or the thread is running. If so, it sets its AutoRestEvent so that the thread wakes up and processes the object if it is waiting on IOCP, or if it is running (which means it is not suspended for some reason).
If the thread is not waiting on IOCPHandle and is also not in the running state, Managed IOCP assumes that the thread is waiting on some external resources other than its IOCPHandle. It then simply decrements the active thread count, so that any other thread waiting on Managed IOCP or in running state could process objects dispatched to the Managed IOCP queue.
Below is the method that is used to choose a thread when an object is dispatched onto Managed IOCP:
Collapse
This technique provides the second aspect of native Win32 IOCP's concurrency management that guarantees _atleast_ the maximum number of allowed concurrent threads are _always_ notified/awakened to process queued objects, if the number of threads using the Managed IOCP instance is more than the maximum number of allowed concurrent threads.
More Info visit:
http://www.codeproject.com/cs/library/managediocp-part-2.asphttp://www.codeproject.com/cs/library/managediocp.asp