Hacker News

This post is a useful rundown on where threads stand in Ruby now, but it hangs on a bit of a straw man argument. The argument for evented concurrency assumes threads work. Async design isn't a reaction to MRI's crappy green threads.

It's also not a very compelling argument for threading to say "web programs don't share a lot of state, so you don't have to worry about synchronization". If all you do are CRUD apps, you can indeed punt scaling to the database. That doesn't mean threads are more effective than events; it means you made concurrency and synchronization someone else's problem. There's nothing wrong with that, but it's not a convincing demonstration of threading.

As I tried to make clear throughout the post, I'm not really making an argument for a huge number of threads, or for programming that involves a lot of exposed shared state.

I'm making an argument about how threads are used (in real life) in web development, an area where it's trivial to make concurrency and synchronization someone else's problem. Despite this, I've heard a number of hypesters throw around the idea that this scenario is an example of where threading fails and moving to async is required.

I agree with you that this is a weak argument, and I hope to see people understand better the difference between:

a) an application that NEEDS to handle huge amounts of concurrent users (because most of them are idle for most of their lives), and

b) an application that spends a non-trivial amount of time using the CPU, and therefore does not need more than a few threads to fully utilize the CPU

These are different cases, and while those of us with a good grasp of the subject understand the difference, a lot of people have conflated the two ideas, and then further conflated the problems of thread synchronization in these cases as well.


I think you and I probably agree that it'd be a bit of a fool's errand to try to make Rails apps more asynchronous. Rails works fine, and as a synchronous design it's straightforward to thread.

It was things like the comparison to Node (which I don't use) and the comment about how well async had worked for you in browser js --- which implicitly somewhat demerited serverside async --- that made me think you might have been reaching for something more ambitious with this post.


Sorry about that :)

I explicitly referenced a chat server (lots of concurrent, mostly-idle requests) as a case where I'd personally use an async solution.

There are certainly middle-ground cases where the question is muddier (and more religious, likely), and I just wanted to set the record straight that Rails itself is not really in the middle. I'm glad that you agree :)


> The argument for evented concurrency assumes threads work. Async design isn't a reaction to MRI's crappy green threads.

Truly? To my understanding, event loops require inversion of control (and likely callbacks and broken exception handling). This is a large cost that requires a benefit to be worth it. I understand that benefit to be: you don't have to deal with threads (or bad implementations of such).

This blog post comes to mind: http://www.unlimitednovelty.com/2010/08/multithreaded-rails-...


Not being a patterns guy or a Fowlerite, I can't tell you whether async "requires inversion of control". Apropos nothing, I'm also unlikely to recognize "dependency injection" when I see it, as my first language was C, not Java.

Async code isn't "likely" to require callbacks; it will almost certainly involve callbacks, those being a central design feature of async code. You should play with the idea for a bit before forming an opinion about it.

I don't know what you mean by "broken exception handling". This may be a Rails-ism I'm unfamiliar with. I'm very familiar with Rails (we ship a fairly large product built on it), but I've never tried to shoehorn async I/O into it; like Twitter and presumably every other large site, we do Rails on the front-end/UI and fast things on the backend, in our case with EventMachine.

Like I said: I'm not arguing that Rails threading is bad, or even that Rails should have better async support. If I cared that much about the performance of my front end, I probably wouldn't be serving requests directly off Rails. Rails developers may very well be better off with threads. But that fact has little to do with the merits of threading and async as concepts.


'I don't know what you mean by "broken exception handling".'

Take the following Python-esque (but not Python) pseudo-code in a threaded language:

    try:
        header = read(socket, header_size)
        if has_short_flag(header):
            return read_until_closed(socket)
        handle_message(header, socket)
    except SocketClosedException:
        log("lost socket while processing headers in packet")
The evented equivalent of this code will have to be chopped into pieces at each of the read calls, and it is impossible to wrap the second read call in the same exception handler like this. You can manually route exceptions around if you're careful, but that is definitely a pain in the ass and is sometimes very hard to test (it's hard to test exception handling for an exception you can't really fake for some reason). (Of course you can't always test it in threaded code either, but it's radically simpler there and therefore less likely to break; you don't have to test the plumbing.)
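To make the failure mode concrete, here is a minimal sketch (a toy event loop with illustrative names, not any real library) of why the outer try/except stops working once a read moves into a callback: by the time the callback fires, the frame that set up the handler has already returned.

```python
pending = []

def schedule(callback):
    # in a real reactor this would register interest in a socket
    pending.append(callback)

def run_loop():
    while pending:
        pending.pop(0)()

def on_header():
    raise IOError("lost socket")       # the "second read" failing, later

caught = []
try:
    schedule(on_header)                # returns immediately; nothing raised yet
except IOError:
    caught.append("request handler")   # never reached

try:
    run_loop()                         # the exception surfaces here instead
except IOError:
    caught.append("event loop")

print(caught)  # -> ['event loop']
```

The request-level handler never fires; the error escapes to whoever runs the loop, which is exactly the "manually route exceptions around" problem described above.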

This is one of the reasons I've spent the last few years fleeing event-based code towards things like Erlang, rather than running towards it; I've been doing event-based code for non-trivial work, well beyond "demos", and the plumbing just explodes in your face if you want to build actually-robust software where simply crashing in the middle of a request isn't acceptable. Despite my best application of good coding practices and refactoring, you still can't get close to the simplicity of something like Erlang code.

(By the way, if you are stuck in evented land, one of the things I have learned the hard way is that anywhere your evented API has an error callback, you absolutely must provide one that does something sensible. I now always wrap such APIs in another layer of API that does nothing but crash as soon as possible if no error callback is provided. If you can't figure out what your error callback for a given such call should be, that's your design trying to tell you something's wrong.)
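The "wrap the API in another layer" advice above can be sketched in a few lines. This is a hypothetical helper, not any real library's API: it refuses to register an async operation unless the caller supplied an error callback, crashing as soon as possible otherwise.

```python
def require_error_callback(fn):
    # decorator enforcing that every call passes a real on_error callback
    def wrapper(*args, on_error=None, **kwargs):
        if on_error is None:
            # crash immediately rather than swallow errors later
            raise RuntimeError(f"{fn.__name__} called with no error callback")
        return fn(*args, on_error=on_error, **kwargs)
    return wrapper

@require_error_callback
def async_read(sock, on_data, on_error):
    # a real system would hand both callbacks to the event loop here;
    # returning them is enough for the demo
    return (on_data, on_error)
```

Calling `async_read(sock, on_data=handler)` without `on_error` now fails loudly at registration time, which is the point: if you can't say what the error callback should do, the design is telling you something.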


If you're an Erlang advocate, I'm not going to convince you of the utility of eventing in any other language. I respect your decision but am uninterested in choosing a programming language based on a concurrency model.

That said, your (admittedly hypothetical) example is hideously broken. In EventMachine, and written properly:

  def receive_data(buf)
    @buf << buf
    while have_complete_request?
      begin
        handle_next_request
      rescue => e
        # log/report the error for this request
      end
    end
  end
Notice the single exception handler bracketing the entire request. (This code is wordier than I'd like because I hoisted the exception handler out of handle_next_request to illustrate it).

The mistake you made (and it's common to a lot of evented code) is in driving the entire system off raw events. Don't do that. Buffer, to create natural functional decomposition.
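The same buffering idiom, sketched in plain Python (newline-delimited framing is an assumption for the demo): raw data events only append to a buffer, and complete requests are peeled off and handled whole, so one exception handler brackets each request.

```python
class BufferedProtocol:
    def __init__(self):
        self.buf = b""
        self.handled, self.errors = [], []

    def receive_data(self, data):
        # raw event: just accumulate bytes
        self.buf += data
        # dispatch only complete, newline-terminated requests
        while b"\n" in self.buf:
            request, self.buf = self.buf.split(b"\n", 1)
            try:
                self.handle_request(request)
            except Exception as e:
                self.errors.append(str(e))

    def handle_request(self, request):
        if request == b"bad":
            raise ValueError("bad request")
        self.handled.append(request)

p = BufferedProtocol()
p.receive_data(b"he")            # partial data: buffered, nothing dispatched
p.receive_data(b"llo\nbad\nwo")  # completes two requests, buffers the rest
print(p.handled)  # -> [b'hello']
```

A failing request lands in the one `except` without touching the buffering plumbing, which is the decomposition being argued for.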

Evented code is rarely as simple as synchronous code, but there's no reason it has to be needlessly choppy.

That said, I think this design is overvalued. Yes, it's true, you (probably) can't always wrap an entire request's processing in a single exception handler in any evented Ruby library I know about. But I wouldn't wrap an entire request handler in a single exception handler in any case! If I was preparing to deal with a database exception, I'd wrap the code making the database call in a handler for the database exception. If I was preparing for a filesystem exception, &c &c &c.

Incidentally, I've been doing very, very, very large scale evented systems for going on about 10 years now (large scale: every connection traversing tier 1 ISP backbones), and, sorry, this stuff has never blown up on me. I may have been shielded from exception drama by working in C/C++, where exceptions are not the norm. I was a thread guy before that. Threads definitely did blow up on me, a lot.


I don't think you can do that with Javascript or Node.js. When you have a callback, you lose all stack above it.

See, I wouldn't even call that "evented code" in the way people are using the term, regardless of what it says on the tin, precisely because you aren't losing the stack frame here and can still catch exceptions and such. Evented, to my mind, is something like Node.js, where you have to chop up your code manually. At least, I've never seen anybody demo Node.js code that isn't chopped up manually, and I'm at a loss as to what features of Javascript would let you translate that Ruby snippet directly without losing something fundamental about the stack.

Under the hood, everything's event-based (with optional preemptive multitasking), there's just varying levels of compiler optimization that affects how much you have to do manually and how much you have to worry about it. The inner event loop of Erlang and the inner event loop of Node.js and in fact the inner event loop of just about anything nowadays looks pretty much the same.

That's not the way in which I say evented code blows up. If you can write like that, it doesn't blow up, because you don't have to sit there and basically implement your own stack handling by hand to get that sort of sane exception handling; it all just works.

Since this is a terminology issue there are, as always, grey areas, but since I mostly use the term "evented" in the context of the Node.js hype I tend to use it that way. I've been doing stuff like your snippet for a while too and it hasn't blown up on me either, which is why I'm so down on the style of coding Node.js entails, which does.

The point of my snippet is not that it's a brilliant choice intrinsically; the point is that you don't have the choice, and end up having to implement anything like that manually.


You've lost me. Nothing in this code depends on having a single consistent stack from main() to handle_next_request().


You need a "consistent" stack in the whole of everything executing under "handle_request" if you want that "one single exception handler bracketing the entire request". You don't get that in Javascript; you need a language with more power. There are several viable options: threading, continuations, some hacked-up thing like Python's yield (which is neat, but limited). But something.

Javascript's a very nice, dynamic language in most ways but its functions are very, very weak. They'll get better in the next version of ECMAScript, though. (Node.js will probably benefit from it.)


I'm pretty sure you're wrong about this. The exact same idiom I used to buffer up requests and feed them whole to a single function that could catch all possible exceptions works just fine in Javascript. Is this an issue with JS exceptions that I'm unaware of? I may just be talking past you.


You're talking about the function of what I posted, I'm talking about the form.

    try:
        something = some_io_that_needs_an_event_callback()
        if do_something_with(something):
            ...
    except AnyException:
        ...
is not a structure available to you in Node.js, again, pending somebody proving otherwise in Javascript (though I've made this complaint in other places where somebody really should have shown me the code). That expands to tons of code in Javascript, code which is very resistant to abstraction.

Your code block says it is available to you in Ruby. There's no reason why it shouldn't be; it's not a hard compiler transform to manage that. Taking code that looks like a "thread" but compiling it into events and using something-like-continuations to keep the stack together is on the order of a homework problem. Based on what I know about Ruby, the responsibility of those pieces is split up differently but all the pieces must be there. But it's not something Javascript can do (quite yet, and based on what I've seen in Python the 'yield' approach has its own issues).


In Python, you could write your snippet as a generator and not need to break up the function. It is also possible to throw exceptions into generators.
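A minimal sketch of what this comment describes (illustrative names, with string "read requests" standing in for real I/O): run the handler as a generator so it keeps its own stack, and deliver socket errors with gen.throw(), so one try/except inside still brackets both reads.

```python
log_lines = []

def handler():
    try:
        header = yield "need header"   # first read: suspended here
        body = yield "need body"       # second read, same handler covers it
    except IOError:
        log_lines.append("lost socket while processing headers")

gen = handler()
next(gen)                        # advance to the first yield
gen.send(b"HDR")                 # header arrives; parked at the second yield
try:
    gen.throw(IOError("closed")) # socket died mid-request: thrown *into* the generator
except StopIteration:
    pass                         # generator finished after handling the error

print(log_lines)  # -> ['lost socket while processing headers']
```

The exception is raised at the suspended yield, inside the generator's own try/except, so the handler's structure survives even though the event loop is driving it.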


> event loops require inversion of control

Nope. Coroutines cut that gordian knot. Inversion of control is for when you want to make code verbose and hard to reason about!

I have a partially complete node.js-esque system in Lua (Javascript's smart brother, which has efficient and well-integrated coroutines), but I may not finish it for a while at this point - I've gotten pulled into Erlang instead (and life, in general). I kept feeling like I was reimplementing Erlang anyway. (My LuaSocket.select-based backend works, but my high-scalability™ libev backend is on hiatus, and the usual select vs. epoll/kqueue trade-offs apply.)
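The coroutine claim in one runnable bite, with Python's async/await standing in for Lua coroutines (the names here are illustrative): the handler reads top to bottom, no callbacks, and an ordinary try/except brackets both awaits.

```python
import asyncio

async def read_op():
    # simulate the socket dying on a read
    raise ConnectionError("closed")

async def handle():
    try:
        header = await read_op()   # first read
        body = await read_op()     # second read, covered by the same handler
        return (header, body)
    except ConnectionError:
        return "logged: lost socket"

result = asyncio.run(handle())
print(result)  # -> logged: lost socket
```

The event loop is still there underneath; the coroutine machinery just keeps the "stack" together so control never has to be inverted into callbacks.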


> It's also not a very compelling argument for threading to say "web programs don't share a lot of state, so you don't have to worry about synchronization".

Isn't this an issue with both models? Shared state is shared state, regardless of whether you use threads or an evented model. Unless you're only running on 1 CPU.


No, because evented code (usually) is scheduled cooperatively; (most) conflicts are precluded.

This goes out the window when you start forking processes and using shared memory, but at least then you're default-private instead of default-shared.
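Why cooperative scheduling precludes most conflicts, in miniature: callbacks on a single-threaded loop run to completion before the next one starts, so even an unprotected read-modify-write is safe across many "concurrent" events. (A toy illustration; with preemptive threads the same pattern would need a lock.)

```python
counter = 0

def on_event():
    global counter
    value = counter      # no other callback can interleave between
    counter = value + 1  # this read and this write

event_queue = [on_event] * 1000
for callback in event_queue:   # the "event loop": one callback at a time
    callback()

print(counter)  # -> 1000
```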

