This was a blog post I wrote for my employer's blog back when I was a "consultant".

In my last post I introduced the Disruptor pattern, a way of writing efficient, processor-friendly code that is easy to parallelize while maintaining developer sanity. This post is a case study of a real-world application of this pattern in C#, along with a complete code example. Yes, names are changed, but this is a true story.

Say a client, CodeExampleCorp, the premier provider of todo lists, wants to move their Customer Relationship Management to the cloud. The boss has decided that all todo lists will be replicated to TodoListForce.com, allowing sales reps to better assist customers.

However, the QA engineers discovered that TodoListForce.com slows to a crawl if we send it too many requests at once. During peak hours, the number of todo list updates greatly exceeds what it can handle.

We needed a way of batching many individual requests into larger chunks while transforming their format, and it needed to be fast, both in terms of latency and throughput. We decided to apply the disruptor to this stream processing problem because the requirements were constantly changing, and our initial solution using queues wasn't flexible enough to keep up. We knew it would help us build our process quickly and keep it agile.

Two groups of EventProcessors advancing along the ring buffer to the right and mutating events as they go. Each processor can run as a separate task or thread; however, two processors must never read or write the same memory unless they are separated in the sequence.
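The "separated in the sequence" rule can be sketched in a few lines. This is a minimal illustration and not the library's implementation: `ProgressCounter`, `Gating`, `HighestSafeSequence`, and `SlotFor` are hypothetical names invented here, and the sketch assumes the ring size is a power of two.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for a processor's progress marker.
// This is NOT the Disruptor library's own sequence type.
public sealed class ProgressCounter
{
    private long _value = -1; // -1 means "nothing processed yet"

    public long Value => Interlocked.Read(ref _value);

    public void Set(long sequence) => Interlocked.Exchange(ref _value, sequence);
}

public static class Gating
{
    // A downstream processor may only touch events that every upstream
    // processor has already finished with, so its limit is the minimum
    // of the upstream counters.
    public static long HighestSafeSequence(params ProgressCounter[] upstream)
    {
        if (upstream.Length == 0)
            throw new ArgumentException("At least one upstream counter is required.");
        return upstream.Min(counter => counter.Value);
    }

    // The ring buffer reuses slots: sequence n lives in slot n mod ringSize.
    // Assumption: ringSize is a power of two, so the modulo becomes a bit mask.
    public static int SlotFor(long sequence, int ringSize) =>
        (int)(sequence & (ringSize - 1));
}
```

If two deserializers have reached sequences 7 and 4, a request builder gated on both may process events up to sequence 4 and no further.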

My code example shows a similar solution to the one we ended up with. Updates come in on the stream, and we have multiple processors working to deserialize them. After that, we split the updates up according to the request types that TodoListForce.com accepts, and batch them into larger requests. Finally, the requests are sent, and when finished, the results will be logged.

Each one of the colored boxes is an EventProcessor, and the arrows represent ordering. For example, all deserializers must advance past an Event before the RequestBuilders are allowed to process it. The process dependency graph is built with a fluent syntax:

```csharp
var deserialize = GetDeserializers(numberOfDeserializers);

var groupIntoRequests =
    GetRequestBuilders(listsPerRequest);

disruptor.HandleEventsWith(deserialize)
    .Then(groupIntoRequests);

// Since the Request Senders are
// AsyncEventProcessors instead of
// EventHandlers (synchronous),
// we have to manually create a sequence barrier
// and pass it in instead of just calling
// .Then() again.
var barrierUntilRequestsAreGrouped =
    disruptor.After(groupIntoRequests)
        .AsSequenceBarrier();

var sendRequests =
    GetRequestSenders(
        ringBuffer,
        barrierUntilRequestsAreGrouped,
        RequestSenderMode.Callback
    );

disruptor.HandleEventsWith(sendRequests);

var writeLog = GetFinalLoggingEventHandler();

disruptor.After(sendRequests)
    .Then(writeLog);

var configuredRingBuffer = disruptor.Start();
```

I think it's a good way of organizing a process. Each step in the process is encapsulated, and written to **one common, flexible interface**, the `EventProcessor`.

I think that **flexibility** is one of the disruptor pattern's best features. If I wanted to join all request senders into one processor, I easily could. If I wanted to fuse the RequestBuilders and RequestSenders together, I could do that too. If I needed to insert a new step in the process, it would be trivial. Multiple input streams with different priority? No sweat.

This flexibility also makes it easy to optimize the process. In my example project, there is one NUnit test which configures the disruptor based on parameters and runs a load test for each combination of settings. The flexibility makes it **easy to find big performance wins** through experimentation. It only took me a few minutes to improve performance by a factor of 10 after writing the test.  
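The shape of such a parameter sweep can be sketched without NUnit. This is a rough illustration under stated assumptions, not the test from the example project: `Settings`, `LoadTestSweep`, and the `runPipeline` delegate are hypothetical, and `RingSize` is a made-up third parameter alongside the deserializer count and lists-per-request values used above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical bundle of tunable settings for one load-test run.
public sealed class Settings
{
    public int Deserializers { get; set; }
    public int ListsPerRequest { get; set; }
    public int RingSize { get; set; }
}

public static class LoadTestSweep
{
    // Every combination of the candidate values (a Cartesian product).
    public static IEnumerable<Settings> Combinations(
        int[] deserializers, int[] listsPerRequest, int[] ringSizes) =>
        from d in deserializers
        from l in listsPerRequest
        from r in ringSizes
        select new Settings { Deserializers = d, ListsPerRequest = l, RingSize = r };

    // Runs the caller-supplied pipeline once and returns events per second.
    public static double MeasureThroughput(
        Settings settings, int eventCount, Action<Settings, int> runPipeline)
    {
        var timer = Stopwatch.StartNew();
        runPipeline(settings, eventCount);
        timer.Stop();
        // Guard against a zero elapsed time on very short runs.
        return eventCount / Math.Max(timer.Elapsed.TotalSeconds, 1e-9);
    }

    // Measures each candidate once and keeps the fastest.
    public static Settings FindFastest(
        IEnumerable<Settings> candidates, int eventCount, Action<Settings, int> runPipeline)
    {
        Settings best = null;
        double bestRate = double.NegativeInfinity;
        foreach (var candidate in candidates)
        {
            double rate = MeasureThroughput(candidate, eventCount, runPipeline);
            if (rate > bestRate)
            {
                best = candidate;
                bestRate = rate;
            }
        }
        return best; // null when there are no candidates
    }
}
```

A single measurement per combination is noisy, so a real load test would probably warm up first and average several runs.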

In the end, it reached about 90 MB/s throughput, processing about 6000 requests per second on my laptop. Maybe nothing to write home about, but it's approaching the speed required to handle something like the [Twitter Firehose](https://dev.twitter.com/streaming/reference/get/statuses/firehose) with a single node, and I think that's good enough for a couple hours of work.

