This post is a useful rundown on where threads stand in Ruby now, but it hangs on a bit of a straw man argument. The argument for evented concurrency assumes threads work. Async design isn't a reaction to MRI's crappy green threads.
It's also not a very compelling argument for threading to say "web programs don't share a lot of state, so you don't have to worry about synchronization". If all you do are CRUD apps, you can indeed punt scaling to the database. That doesn't mean threads are more effective than events; it means you made concurrency and synchronization someone else's problem. There's nothing wrong with that, but it's not a convincing demonstration of threading.
As I tried to make clear throughout the post, I'm not really making an argument for a huge number of threads, or programming that involves a lot of exposed shared state.
I'm making an argument about how threads are used (in real life) in web development, an area where it's trivial to make concurrency and synchronization someone else's problem. Despite this, I've heard a number of hypesters throw around the idea that this scenario is an example of where threading fails and moving to async is required.
I agree with you that this is a weak argument, and I hope to see people understand better the difference between:
a) an application that NEEDS to handle huge amounts of concurrent users (because most of them are idle for most of their lives), and
b) an application that spends a non-trivial amount of time using the CPU, and therefore does not need more than a few threads to fully utilize the CPU
These are different cases, and while those of us with a good grasp of the subject understand the difference, a lot of people have conflated the two ideas, and then further conflated the problems of thread synchronization in these cases as well.
I think you and I probably agree that it'd be a bit of a fool's errand to try to make Rails apps more asynchronous. Rails works fine, and as a synchronous design it's straightforward to thread.
It was things like the comparison to Node (which I don't use) and the comment about how well async had worked for you in browser js --- which implicitly somewhat demerited serverside async --- that made me think you might have been reaching for something more ambitious with this post.
I explicitly referenced a chat server (lots of concurrent, mostly-idle requests) as a case where I'd personally use an async solution.
There are certainly middle-ground cases where the question is muddier (and more religious, likely), and I just wanted to set the record straight that Rails itself is not really in the middle. I'm glad that you agree :)
> The argument for evented concurrency assumes threads work. Async design isn't a reaction to MRI's crappy green threads.
Truly? To my understanding, event loops require inversion of control (and likely callbacks and broken exception handling). This is a large cost that requires a benefit to be worth it. I understand that benefit to be: you don't have to deal with threads (or bad implementations of such).
Not being a patterns guy or a Fowlerite, I can't tell you whether async "requires inversion of control". Apropos nothing, I'm also unlikely to recognize "dependency injection" when I see it, as my first language was C, not Java.
Async code isn't "likely" to require callbacks; it will almost certainly involve callbacks, those being a central design feature of async code. You should play with the idea for a bit before forming an opinion about it.
I don't know what you mean by "broken exception handling". This may be a Rails-ism I'm unfamiliar with. I'm very familiar with Rails (we ship a fairly large product built on it), but I've never tried to shoehorn async I/O into it; like Twitter and presumably every other large site, we do Rails on the front-end/UI and fast things on the backend, in our case with EventMachine.
Like I said: I'm not arguing that Rails threading is bad, or even that Rails should have better async support. If I cared that much about the performance of my front end, I probably wouldn't be serving requests directly off Rails. Rails developers may very well be better off with threads. But that fact has little to do with the merits of threading and async as concepts.
'I don't know what you mean by "broken exception handling".'
Take the following Python-esque (but not Python) pseudo-code in a threaded language:
header = read(socket, header_size)
log("lost socket while processing headers in packet")
The evented equivalent of this code will have to be chopped into pieces at each of the read calls, and it is impossible to wrap the second read call in the same exception handler like this. You can manually route exceptions around if you're careful, but that is definitely a pain in the ass and is sometimes very hard to test (hard to test exception handling for an exception you can't really fake for some reason). (Of course you can't always test it in threaded code either, but it's radically simpler there and therefore less likely to break; you don't have to test the plumbing.)
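To make the "chopped into pieces" point concrete, here is a minimal Ruby sketch. It is not the commenter's code: `FakeAsyncSocket`, `read_async`, and `read_packet` are hypothetical names invented for illustration, and the "socket" is just an in-memory list of chunks. It shows the failure path being handed along manually at each read instead of being covered by one enclosing handler.

```ruby
# Hypothetical stand-in for an evented socket. read_async never raises to its
# caller; it reports success or failure through the callbacks it is given.
class FakeAsyncSocket
  def initialize(chunks)
    @chunks = chunks.dup
  end

  def read_async(on_data:, on_error:)
    chunk = @chunks.shift
    if chunk.nil?
      on_error.call(IOError.new("lost socket"))
    else
      on_data.call(chunk)
    end
  end
end

# In blocking code, one begin/rescue could surround both reads. Here each read
# is its own callback, so on_error has to be routed by hand at every step.
def read_packet(socket, on_packet:, on_error:)
  read_body = ->(header) {
    socket.read_async(
      on_data: ->(body) { on_packet.call(header, body) },
      on_error: on_error
    )
  }
  socket.read_async(on_data: read_body, on_error: on_error)
end

log = []
on_packet = ->(header, body) { log << [:ok, header, body] }
on_error  = ->(err) { log << [:error, err.message] }

read_packet(FakeAsyncSocket.new(["HDR", "BODY"]), on_packet: on_packet, on_error: on_error)
read_packet(FakeAsyncSocket.new(["HDR"]), on_packet: on_packet, on_error: on_error)
```

Forgetting to pass `on_error` to the inner read is exactly the kind of plumbing mistake that blocking code with a single handler cannot make.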
This is one of the reasons I've spent the last few years fleeing event-based code towards things like Erlang, rather than running towards it; I've been doing event-based code for non-trivial work and things well beyond "demos" and the plumbing just explodes in your face if you want to build actually-robust software where simply crashing in the middle of a request isn't acceptable. Despite my best application of good coding practices and refactoring you still can't get close to the simplicity of something like Erlang code.
(By the way, if you are stuck in evented land, one of the things I have learned the hard way is that anywhere your evented API has an error callback, you absolutely must provide one that does something sensible. I now always wrap such APIs in another layer of API that does nothing but crash as soon as possible if no error callback is provided. If you can't figure out what your error callback for a given such call should be, that's your design trying to tell you something's wrong.)
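A minimal sketch of that wrapper idea in Ruby, under stated assumptions: `fetch_async` is a made-up evented API that silently drops errors when no error callback is given, and `fetch_async_strict` is a hypothetical wrapper that fails fast instead. Neither name comes from a real library.

```ruby
# Hypothetical evented API: the error callback is optional, so a failure can
# vanish without a trace when the caller forgets to supply one.
def fetch_async(key, on_success:, on_error: nil)
  if key == :missing
    on_error&.call(KeyError.new("no such key: #{key}"))
  else
    on_success.call("value-for-#{key}")
  end
end

# Wrapper layer that does nothing except refuse to start the operation when
# no error callback was provided.
def fetch_async_strict(key, on_success:, on_error: nil)
  raise ArgumentError, "on_error callback is required" if on_error.nil?
  fetch_async(key, on_success: on_success, on_error: on_error)
end
```

Calling `fetch_async_strict(:a, on_success: ->(v) { puts v })` raises `ArgumentError` immediately, at the call site, rather than losing an error later inside the event loop.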
If you're an Erlang advocate, I'm not going to convince you of the utility of eventing in any other language. I respect your decision but am uninterested in choosing a programming language based on a concurrency model.
That said, your (admittedly hypothetical) example is hideously broken. In EventMachine, and written properly:
@buf << buf
rescue => e
Notice the single exception handler bracketing the entire request. (This code is wordier than I'd like because I hoisted the exception handler out of handle_next_request to illustrate it).
The mistake you made (and it's common to a lot of evented code) is in driving the entire system off raw events. Don't do that. Buffer, to create natural functional decomposition.
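As an illustration of the buffering advice (not a reconstruction of the snippet above), here is a plain-Ruby sketch that runs without EventMachine. `receive_data` is the callback name EventMachine connections use; the class, the newline-delimited framing, and the `ECHO` verb are all assumptions made up for this example.

```ruby
# Plain-Ruby stand-in for an EventMachine-style connection. Everything except
# the receive_data callback name is hypothetical.
class BufferedConnection
  attr_reader :responses, :errors

  def initialize
    @buf = +""
    @responses = []
    @errors = []
  end

  # Called by the event loop with whatever bytes happened to arrive.
  def receive_data(data)
    @buf << data
    # Drain every complete line; a partial request stays in the buffer.
    while (line = @buf.slice!(/\A[^\n]*\n/))
      begin
        # One handler brackets the processing of a whole request.
        @responses << handle_request(line.chomp)
      rescue => e
        @errors << e.message
      end
    end
  end

  # Ordinary synchronous code: it only ever sees a complete request.
  def handle_request(request)
    verb, arg = request.split(" ", 2)
    raise ArgumentError, "unknown verb: #{verb}" unless verb == "ECHO"
    arg.to_s
  end
end

conn = BufferedConnection.new
conn.receive_data("ECHO he")
conn.receive_data("llo\nBOGUS x\nECHO wo")
conn.receive_data("rld\n")
```

The request logic never sees raw events, only whole requests, which is what lets a single `rescue` cover it regardless of how the bytes were split on the wire.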
Evented code is rarely as simple as synchronous code, but there's no reason it has to be needlessly choppy.
That said, I think this design is overvalued. Yes, it's true, you (probably) can't always wrap an entire request's processing in a single exception handler in any evented Ruby library I know about. But I wouldn't wrap an entire request handler in a single exception handler in any case! If I was preparing to deal with a database exception, I'd wrap the code making the database call in a handler for the database exception. If I was preparing for a filesystem exception, &c &c &c.
Incidentally, I've been doing very, very, very large scale evented systems for going on about 10 years now (large scale: every connection traversing tier 1 ISP backbones), and, sorry, this stuff has never blown up on me. I may have been shielded from exception drama by working in C/C++, where exceptions are not the norm. I was a thread guy before that. Threads definitely did blow up on me, a lot.
Under the hood, everything's event-based (with optional preemptive multitasking); there are just varying levels of compiler optimization that affect how much you have to do manually and how much you have to worry about it. The inner event loop of Erlang and the inner event loop of Node.js and in fact the inner event loop of just about anything nowadays looks pretty much the same.
That's not the way in which I say evented code blows up. If you can write like that, it doesn't blow up, because you don't have to sit there and basically manually implement your own stack handling if you want anything like that sort of sane exception handling; it all just works.
Since this is a terminology issue there are, as always, grey areas, but since I mostly use the term evented in the context of the Node.js hype I tend to use it that way. I've been doing stuff like your snippet for a while too and it hasn't blown up on me either, which is why I'm so down on the style of coding Node.js entails, which does.
The point of my snippet is not that that is a brilliant choice intrinsically, the point is that you don't have the choice and end up implementing anything like that manually.
Nope. Coroutines cut that Gordian knot. Inversion of control is for when you want to make code verbose and hard to reason about!
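A rough sketch of the coroutine claim using Ruby's built-in `Fiber`. The `await_read` helper and the toy "event loop" (a plain array of events) are hypothetical, invented for this example; a real system would resume the fiber from actual I/O readiness callbacks. The point is that the request code reads top to bottom with one handler around both reads.

```ruby
# Hypothetical helper: suspend the current fiber until the loop resumes it
# with either data or an exception object, and raise the latter in place.
def await_read
  result = Fiber.yield(:want_read)
  raise result if result.is_a?(Exception)
  result
end

log = []

# Straight-line request code: one begin/rescue covers both reads, even though
# control returns to the event loop between them.
handler = Fiber.new do
  begin
    header = await_read
    body = await_read
    log << "got #{header}/#{body}"
  rescue IOError => e
    log << "lost socket: #{e.message}"
  end
  :done
end

# Toy event loop: the first resume starts the fiber, and each later resume
# delivers one event (a chunk of data, or an error) to the pending read.
events = ["HDR", IOError.new("connection reset")]
state = handler.resume
events.each { |event| state = handler.resume(event) }
```

Here the second read fails, and the error surfaces inside the same `rescue` that guards the first read, with no callbacks in the request code.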