8th
Pipelined: Easy Event-Driven Programming for Ruby
One of the primary performance issues with Rails, and with Ruby applications in general, is their one-request-per-process architecture. This weekend I finally got around to publishing some code I’ve had sitting on my computer for a while to address this problem.
The Problem
Rails apps tend to be difficult to scale because each instance handles only one request at a time. Switching to a multi-threaded approach using a newer stack like Thin+Merb can help, but ultimately the overhead of threads limits the number of concurrent requests the server can handle at once.
Everyone knows that, really, the best way to solve this problem is through an event-driven programming model like the one provided by EventMachine. Thin uses this model to speed up its raw request dispatching logic, but once you get into the Rails or Merb applications themselves, event-driven programming rarely happens.
Ironically, this is where event-driven models are needed most. Whenever you write an action to handle some request, it will often spend a big chunk of its time waiting on other services, such as the database. If your code were written with an event-driven model, the server would be able to stop executing your action for a while and process other requests while it waits for the response from the database to arrive.
Event-driven models are very powerful but typically very difficult to program in. It’s hard to get your mind around this twisted model of development (in fact, Python’s event-driven framework is called Twisted for this very reason). It was hard, that is, until now…
Introducing Pipelined
Pipelined is a little bit of Ruby code I whipped together several months ago that turns event-driven programming in Ruby right side out. It patches EventMachine and Thin so that you can write code like this:
class MyController < Merb::Controller
  def myaction
    data = {}
    pipeline do
      # perform some long-running code
      data = { :new_data => :value }
    end
    return render(data)
  end
end
The pipeline call pauses the action and schedules its block to run later in the event loop. When the block finishes running, the action resumes where it left off, renders the data, and returns it.
In the meantime, between the moment the pipeline block is scheduled and the moment the action resumes, the server can process other incoming requests on the same thread.
If that doesn’t get you excited yet, how about some data? I wrote a simple test server that simulates an action that takes 100msec to complete, most of it spent waiting on a backend resource like a database. One version uses the typical one-request-at-a-time approach used by Rails; the other uses Pipelined. Here are the results:
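To make the pause-and-resume idea concrete, here is a minimal sketch of how a pipeline-style helper could behave. It is not the Pipelined code itself: it assumes Ruby fibers as the pausing mechanism, and TinyLoop, handle_request, and this pipeline method are hypothetical stand-ins for EventMachine, Thin's dispatcher, and the real helper.

```ruby
require "fiber" # needed for Fiber.current on older Rubies; harmless on newer ones

# A toy event loop: a queue of blocks that run one after another.
# Hypothetical stand-in for a real reactor such as EventMachine.
class TinyLoop
  def initialize
    @queue = []
  end

  def schedule(&block)
    @queue << block
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

LOOP = TinyLoop.new
LOG = []

# Hypothetical pipeline helper: defer the block to the loop, pause the
# calling fiber, and resume that fiber once the block has finished.
def pipeline(&block)
  fiber = Fiber.current
  LOOP.schedule do
    block.call
    fiber.resume
  end
  Fiber.yield
end

# Each "request" runs in its own fiber so it can be paused independently.
def handle_request(name)
  Fiber.new do
    LOG << "#{name}: start"
    data = {}
    pipeline { data = { :new_data => name } }
    LOG << "#{name}: render #{data[:new_data]}"
  end.resume
end

handle_request(:a)
handle_request(:b)
LOOP.run
puts LOG
```

Both requests log their start before either one renders, which is the interleaving described above: while request a is paused inside pipeline, the same thread is free to begin request b.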

As you might expect, the number of requests a single non-pipelined instance could respond to caps out at about 9.34 requests/sec (1sec/100msec = 10, less a little overhead). The pipelined version, on the other hand, is able to handle other requests while it waits on its imaginary “database” to complete, and thus tops out at about 220 requests/sec. Yep, that’s a better than 20x gain in server capacity.
To put it another way, let’s look at the average time it took my test server to respond to each request (this is on a MacBook Air):
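The arithmetic behind those figures can be checked in a few lines. This sketch only uses the numbers quoted above (100msec per request, 9.34 and 220 requests/sec); the in-flight estimate applies Little's law and assumes each pipelined request still takes roughly 100msec, which the post does not state explicitly.

```ruby
SERVICE_TIME   = 0.100  # seconds per request in the test server
BLOCKING_RATE  = 9.34   # measured requests/sec, one request at a time
PIPELINED_RATE = 220.0  # measured requests/sec with pipelining

# Upper bound for a server that handles one request at a time.
CEILING = 1.0 / SERVICE_TIME

# Capacity gain from pipelining, based on the measured rates.
GAIN = PIPELINED_RATE / BLOCKING_RATE

# Little's law: requests in flight = arrival rate * time in system.
# Assumes each pipelined request still takes about SERVICE_TIME.
IN_FLIGHT = PIPELINED_RATE * SERVICE_TIME

puts format("blocking ceiling: %.1f requests/sec", CEILING)
puts format("measured gain:    %.1fx", GAIN)
puts format("requests in flight at peak: about %.0f", IN_FLIGHT)
```

Under that assumption, the pipelined server is juggling roughly 22 requests at once on a single thread at its peak.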

To me, this is the most exciting part of pipelining. When you exceed the maximum number of requests per second a non-pipelined server can handle, its average response time actually starts to go up because all other requests go into a queue. Pipelining will do the same thing, but it will do so at a much slower rate, yielding a far better overall performance profile.
Of course, real-world code does more than wait on backend resources, and many times you can reduce the response time for requests using other techniques such as caching, but the point remains that pipelining is a simple way to significantly increase the capacity of many typical Rails applications.
Where to from here?
Pipelined is not really a library as much as it is some base code that needs to be included in a variety of other libraries. I have posted the sample project I used for the demo along with a file that patches EventMachine and Thin as appropriate. I will be submitting these as patches to those projects as well.
In addition, a simple way to incorporate this technology would be to pipeline ActiveRecord or DataMapper so that they release control of the event loop while waiting on database calls to complete. Pipelined makes this easy, and doing so could instantly improve the performance of many Rails applications with little or no code changes of their own.
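As a rough illustration of what pipelining a data layer might look like, the sketch below hides an asynchronous query behind an ordinary-looking method. Everything in it is hypothetical: FakeAsyncDriver and PipelinedModel are made-up names, fibers are an assumed mechanism, and nothing here reflects the actual ActiveRecord or DataMapper APIs.

```ruby
require "fiber" # needed for Fiber.current on older Rubies; harmless on newer ones

# Hypothetical non-blocking driver: takes a callback and "answers" later.
class FakeAsyncDriver
  def initialize
    @pending = []
  end

  def query_async(sql, &callback)
    @pending << [sql, callback]
  end

  # Stand-in for the event loop noticing that responses have arrived.
  def deliver_all
    until @pending.empty?
      sql, callback = @pending.shift
      callback.call(["row for #{sql}"])
    end
  end
end

# Hypothetical model layer. Callers see a blocking-style method, but the
# calling fiber is parked until the driver invokes the callback.
class PipelinedModel
  def initialize(driver)
    @driver = driver
  end

  def find(sql)
    fiber = Fiber.current
    @driver.query_async(sql) { |rows| fiber.resume(rows) }
    Fiber.yield # evaluates to whatever is passed to fiber.resume
  end
end

DRIVER = FakeAsyncDriver.new
MODEL = PipelinedModel.new(DRIVER)
RESULTS = []

# Two "requests", each in its own fiber, both issue a query and pause.
%w[users posts].each do |table|
  Fiber.new { RESULTS << MODEL.find("SELECT * FROM #{table}") }.resume
end

DONE_BEFORE_DELIVERY = RESULTS.size # nothing has finished yet
DRIVER.deliver_all
puts RESULTS.inspect
```

The calling code never mentions fibers or callbacks, which is the sense in which an application could benefit with little or no change: only the library underneath has to know how to pause and resume.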