
From the announcement: "Performance on par, or better, than pure event-driven Web servers."

Is there any more information on how they arrived at this conclusion?




The second "Core Enhancement" is: "The Event MPM is no longer experimental but is now fully supported.". I would imagine it is because that module's inherent design offers the same underlying system overhead benefits that people are claiming with their event-driven servers.

The idea is that Apache features "multi-processing modules", which control how incoming connections are eventually translated into the state of the running program. These are not just simple tweaks to a fixed pipeline: an MPM can completely change the mechanism. As an example, mpm_winnt is drastically different from the Unix implementations.

The original implementation was mpm_prefork, which maintains a pool of fork()'d processes to handle incoming connections. However, the more reasonable installations have used mpm_worker for years now, which keeps a pool of threads in each process (developed back when thread pools were all the rage).
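
To make the distinction concrete, here's a minimal sketch (in Python, purely for illustration; it is not Apache code, and the pool size, port, and response are made up) of the worker shape: a fixed pool of threads all blocking in accept() on one shared listening socket, each handling one connection at a time.

  # Illustrative only: the "worker" shape, a fixed pool of threads all
  # blocking in accept() on one shared listening socket. Not Apache code;
  # the pool size and port are arbitrary.
  import socket
  import threading

  NUM_THREADS = 4

  def worker(listener):
      while True:
          conn, _addr = listener.accept()   # each idle thread blocks here
          with conn:
              if conn.recv(4096):           # one blocking read per request
                  conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\r\n")

  listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  listener.bind(("127.0.0.1", 8080))
  listener.listen(128)

  for _ in range(NUM_THREADS):
      threading.Thread(target=worker, args=(listener,), daemon=True).start()
  threading.Event().wait()  # park the main thread forever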

However, a while back someone built mpm_event, which uses the same underlying system calls that people use to build their "purely event-driven servers" to handle the sockets. This has been marked "experimental" for a while, initially due to bad interactions with other modules that weren't quite prepared to be evented (such as mod_ssl), but honestly it has been stable for years.
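
And here is the same toy server reshaped the way the event-driven crowd builds things: non-blocking sockets plus a readiness API (epoll/kqueue, wrapped here by Python's selectors module), all driven from a single loop. Again, this is only a sketch of the general pattern, not what mpm_event actually does internally.

  # Illustrative only: one loop, non-blocking sockets, readiness callbacks.
  import selectors
  import socket

  sel = selectors.DefaultSelector()   # epoll/kqueue/etc. under the hood

  def on_accept(listener):
      conn, _addr = listener.accept()
      conn.setblocking(False)
      sel.register(conn, selectors.EVENT_READ, on_read)

  def on_read(conn):
      data = conn.recv(4096)
      if data:
          # a real server would also wait for write readiness here
          conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\r\n")
      sel.unregister(conn)
      conn.close()

  listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  listener.bind(("127.0.0.1", 8080))
  listener.listen(128)
  listener.setblocking(False)
  sel.register(listener, selectors.EVENT_READ, on_accept)

  while True:
      for key, _events in sel.select():
          key.data(key.fileobj)   # dispatch to the callback registered above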

(I've been using it in production on a site with tens of millions of users, with SSL, and if anything mpm_event made things run better by working around a really irritating issue in Python: specifically, that until very very very recently the Python libraries simply failed to handle EINTR in a reasonable way; mpm_event seems to set up the signal and thread environment in a way that stopped entire outgoing HTTP API requests from failing with EINTR ;P.)


Can you go into more detail about the right way to handle EINTR? When I wrote a server I wound up using the "if at first you get EINTR'd, try, try again" approach.


I'm not entirely certain at what level you are asking, but if I understand your question: yes, "try, try again". There are some legitimate reasons why you might want to "handle" an EINTR (such as for canceling threads), but honestly they are super-advanced usage (and normally something abstracted by your threading library): 99.9999% of the time you called a blocking function, knew it might block, and intended it to block until it did something.
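
Concretely, "try, try again" is just a loop around the blocking call; something like this sketch (read_retrying is a made-up name, and recent Python 3 now does this retry for you internally, per PEP 475, but the shape of the fix is the same):

  # A made-up wrapper showing the idiom: retry the interrupted call,
  # re-raise every other error.
  import errno
  import os

  def read_retrying(fd, n):
      while True:
          try:
              return os.read(fd, n)
          except OSError as e:
              if e.errno == errno.EINTR:
                  continue   # interrupted before anything happened: just retry
              raise          # any other errno is a real failure

The important part is where that loop lives: down at the lowest-level read() bridge, so callers several layers up never see EINTR at all.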

The problem with Python was that it was not retrying: it would just throw an EINTR I/O exception from its lowest-level read() bridge, and most of the code calling that would not catch the exception. The result would be that you'd make an HTTP request and the top-level GET call would just throw "EINTR": it is not reasonable (or even semantically valid) to have to retry the entire HTTP request because a call to read() happened to fail while streaming the response.

This was sufficiently ludicrous (and sufficiently simple to fix) that I had a patch I'd apply to Python's core libraries every time I installed or upgraded my system. ... and, before anyone tries to call me on "patch or GTFO": someone filed a bug, with a working patch (although people tried to quibble with it; I believe incorrectly, and it certainly didn't change the result), in January of 2007, and it wasn't resolved until mid-2010. I didn't come up with the fix: I just pulled it out of the bug tracker.

http://bugs.python.org/issue1628205


Thanks for the history lesson and the thorough response. I'm going to sleep a little better tonight.



