
Understanding the code inside Tornado, the asynchronous web server - dhotson
http://golubenco.org/?p=16
======
ZoFreX
> Lets say you have 20 threads. You improved performance 20 times, so the rate
> is now 4 request per second. Still, way too small. You can keep throwing
> threads at the problem, but threads are expensive in terms of memory usage
> and scheduling. I doubt you’ll ever reach hundreds of requests per second
> this way.

Let's say I'm using Java instead of Python. Let's say I use a lot more threads
than 20. I will reach _thousands_ of requests per second this way. There are
real problems with that approach, and event-driven approaches have genuine
advantages, but can people please stop straight-up _lying_ and saying that you
can't just throw a few thousand threads at a problem, because you can, and
people do.
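
For concreteness, here is roughly what the thread-per-connection model looks
like with Python's standard library (a minimal sketch, not from the article;
the handler name and port choice are mine):

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # A blocking recv() is fine here: only this connection's thread
        # waits, while the OS schedules the other threads independently.
        while True:
            data = self.request.recv(4096)
            if not data:
                break
            self.request.sendall(data)

# Port 0 asks the OS for any free port.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
server.daemon_threads = True
threading.Thread(target=server.serve_forever, daemon=True).start()

# One client round-trip to show it works.
client = socket.create_connection(server.server_address)
client.sendall(b"hello")
reply = client.recv(4096)
client.close()
server.shutdown()
```

The real costs conceded above (per-thread stack memory, scheduling) don't show
up at this scale; they only start to matter in the thousands-of-connections
regime the comment is arguing about.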

~~~
ankrgyl
Even with Python (CGI backed by Apache), you can scale with threads. The issue
of scaling with threads vs. events is a pretty hot debate, and I think the
author sets up this kind of criticism by addressing it poorly. Personally, I
fall into the event-driven camp, because it involves the operating system as
little as possible (only file descriptors). The C10K link has a great overview
of the threaded approach (<http://www.kegel.com/c10k.html#threaded>). Read
through the rest of the page for a discussion on the pros/cons of other
approaches.
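
For contrast, the event-driven shape described here can be sketched with the
stdlib selectors module (which uses epoll or kqueue where available). This is
illustrative, not Tornado's actual code:

```python
import selectors
import socket
import threading

sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS

def accept(listener):
    conn, _ = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)
    else:
        sel.unregister(conn)
        conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(128)
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, accept)

def loop_once(timeout):
    # One turn of the event loop: ask the kernel which descriptors are
    # ready, then run the callback registered for each.
    for key, _ in sel.select(timeout):
        key.data(key.fileobj)

# Drive the loop from a helper thread and do one round-trip.
threading.Thread(target=lambda: [loop_once(0.5) for _ in range(10)],
                 daemon=True).start()
client = socket.create_connection(listener.getsockname())
client.sendall(b"ping")
reply = client.recv(4096)
client.close()
```

Note that each connection is just one registered file descriptor: no
per-connection thread, stack, or scheduler entry.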

~~~
cdavid
Another issue with evented frameworks in Python is that you have to ensure
_nothing_ blocks anywhere in your code, which becomes harder as your
application grows more complex. People often mention IO, databases, etc.,
forgetting that it is also an issue if your request handler occupies the CPU
for N ms (e.g. encoding a relatively large payload as JSON). _Everything_
needs to be written with async in mind.

Languages/frameworks where async IO is an implementation detail (like some web
frameworks in haskell) don't have this issue.
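
The CPU-bound case is easy to demonstrate outside any framework (an
illustrative sketch; the payload size is arbitrary): a single-threaded event
loop runs each callback to completion, so everything queued behind a heavy
handler waits for its full duration.

```python
import json
import time

def heavy_handler():
    # Encoding a large payload holds the (hypothetical) loop the whole time.
    payload = {"rows": [[i, str(i)] for i in range(200_000)]}
    json.dumps(payload)

def quick_handler():
    pass  # would finish instantly, but must wait its turn

start = time.monotonic()
heavy_handler()
quick_handler()
elapsed = time.monotonic() - start  # quick_handler paid heavy_handler's bill
```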

~~~
ankrgyl
I don't think of this as a problem. Tornado is just a small and lightweight
non-blocking HTTP server with a couple of convenient libraries. There are some
pretty awesome, elegant solutions to the problem you point out, and I like to
think of these as extensions to the small Tornado framework.

One really cool one is <http://thomas.pelletier.im/2010/08/websocket-tornado-redis/>,
which demonstrates how to use threading to support Redis pub/sub.

For heavy computational tasks that aren't time-critical, you could have an
accessory worker thread that chugs queued computations (yay first-class
functions) in computational downtime.
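
That accessory-worker idea might look like this (a sketch under my own naming,
not an actual Tornado API): handlers enqueue callables, and a background
thread drains the queue.

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    while True:
        fn, args = jobs.get()
        results.append(fn(*args))   # the heavy lifting happens off-loop
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# A request handler just enqueues work (first-class functions, as noted).
jobs.put((sum, ([1, 2, 3],)))
jobs.join()                         # wait for the queue to drain
```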

------
jparise
Some of this is a bit dated. It looks like it was written in late 2009,
judging by the first comment. For example, IO loop timers are now stored using
a heapq[1] instead of a sorted list, and many of the other issues have been
similarly addressed over the last year and a half.

Still, this is a good general overview of Tornado's internals.

[1]
[https://github.com/facebook/tornado/commit/b6c4d6d20196fa4fe...](https://github.com/facebook/tornado/commit/b6c4d6d20196fa4fec936ddc6fb252698c1933b9)
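
The heapq approach boils down to something like this (a condensed sketch, not
the actual commit; real Tornado wraps each timeout in an object): pushing is
O(log n) versus re-sorting a whole list, and the soonest deadline is always
peekable at timeouts[0].

```python
import heapq
import itertools
import time

timeouts = []                # min-heap of (deadline, tiebreak, callback)
counter = itertools.count()  # keeps entries with equal deadlines comparable

def add_timeout(delay, callback):
    heapq.heappush(timeouts, (time.time() + delay, next(counter), callback))

def run_due_timeouts():
    # Pop and fire every timeout whose deadline has passed.
    now = time.time()
    while timeouts and timeouts[0][0] <= now:
        _, _, callback = heapq.heappop(timeouts)
        callback()

fired = []
add_timeout(-1, lambda: fired.append("past"))    # already due
add_timeout(60, lambda: fired.append("future"))  # stays queued
run_due_timeouts()
```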

~~~
bdarnell
The heapq change was just last week and has not yet been in any release, so
that's a perfectly reasonable omission. :) But yes, this is about a pre-1.0
release, so I would encourage anyone interested in exploring tornado internals
to check out a more recent release.

------
dad
There was a very good talk about this at #nodeconf this week by Tom from
Joyent. The basic summary was that threaded network servers, like Apache, say,
approach the speed problem by "pre-allocating" resources, a chunk for each
thread/instance. The issue is that there are a finite number of resources on
each machine (RAM mostly), and blocked threads are eating up a slice of those
resources even when doing _no work_. This sets the upper bound on the number
of simultaneous connections that can be handled.

In an event-based system the overhead for each connection is at least three
orders of magnitude lower, sometimes four or five (hope I'm remembering this
right). This translates into a _considerable_ increase in the number of
connections that can be handled simultaneously; not quite the equivalent
number of orders of magnitude, since other aspects of the system become the
bottleneck, but still dramatic.

This was a data-driven observation, not conjecture or assertion. I listened in
during a break while the nodejs guys argued about approaches to getting TLS
working better - they were worrying about the 1MB overhead for a TLS
connection because that's a significant percentage of a connection handler.
Think about that for a minute in the context of an apache threaded instance.
1MB matters? wow!

I'm new to this area, and this was very interesting stuff that seemed related
to the various discussions below about whether a threaded system can do what
an event-based system can do.

------
rdtsc
> I also feel obliged to talk a bit on nonblocking IO or asynchronous IO (AIO)

Wait, that is not correct. His IO is blocking. He gets a notification from
select/poll/epoll when data is ready (the asynchronous part), but when the IO
is actually performed (read, write, recv, etc.), the operation is still
happening in the main thread in user space, and it blocks that thread.

Currently only the file system (I think) has truly asynchronous, non-blocking
IO. It is provided by the aio_* set of system calls and has been sort of an
exotic beast (it is not that popular).

Here is a good chart of the possible IO types and their combinations:

<http://www.ibm.com/developerworks/linux/library/l-async/>

And of course:

<http://www.kegel.com/c10k.html>

~~~
justincormack
No, the network sockets are set to non-blocking and will return EAGAIN once
you have emptied the kernel buffer. That's still non-blocking.

I don't think he is using aio_*, which Linux does not really implement
usefully.

~~~
rdtsc
No. They would return EAGAIN after the data has been read or written in
blocking mode. If select comes back and says "read is ready on socket 4",
then user code does a read on socket 4 and gets (say) 4K of data followed by
EAGAIN. Getting that 4K of data happens in blocking mode. To make it
non-blocking you would provide a pointer to your user-space buffer to the
kernel, and it would determine when IO is ready _and_ proceed to put the data
in that buffer.
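
The sequence being described, in Python for concreteness (a sketch on a local
socket pair): select() only reports readiness, and the copy out of the kernel
buffer happens inline in the calling thread when recv() runs.

```python
import select
import socket

a, b = socket.socketpair()
b.sendall(b"4K of data, say")

# select: "read is ready on this socket" -- the notification part.
readable, _, _ = select.select([a], [], [], 1.0)
assert a in readable

# The copy from the kernel buffer into user space happens here, inline.
data = a.recv(4096)
a.close()
b.close()
```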

~~~
haberman
The way you are using the term "blocking" is non-standard and not that useful.
You are calling the syscall "blocking" just because it performs its work
inline (in this case, copying a kernel buffer into a user-space buffer). By
this definition, every syscall is blocking, even aio_read(), because even
aio_read() performs _some_ work inline (namely enqueuing a read request).

In common usage, an I/O operation is considered "blocking" if the system call
does not return until data is available. If on the other hand you set
O_NONBLOCK on a fd, you will get EAGAIN when no data is available, so clearly
that is the accepted meaning of "nonblocking."
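
That accepted meaning, in one snippet (a sketch; Python surfaces EAGAIN as
BlockingIOError):

```python
import errno
import socket

a, b = socket.socketpair()
a.setblocking(False)   # sets O_NONBLOCK on the descriptor

try:
    a.recv(4096)       # nothing has been sent, so no data is available
    got_eagain = False
except BlockingIOError as e:
    # EAGAIN and EWOULDBLOCK are the same value on Linux.
    got_eagain = e.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
a.close()
b.close()
```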

~~~
rdtsc
> You are calling the syscall "blocking" just because it performs its work
> inline.

Yes, because there is _another_ mode of IO where it doesn't have to do that.
The reason that is called "blocking" is that there is another mode of
operation (albeit an exotic one) where the process does not block while
performing IO -- the kernel reads/writes the data on behalf of the process.
If this second mode of performing IO had never existed, you could interchange
"asynchronous" and "non-blocking" arbitrarily as synonyms.

Since you didn't bother reading the link I posted, here is the basic matrix
of IO operations:

    
    
                        Synchronous            Asynchronous
    
        Blocking        read/write             select/[e]poll
        Non-Blocking    O_NONBLOCK+EAGAIN      aio_*
    
    
    

There has been talk, and various attempts, at implementing aio_*-style IO for
network sockets on Linux, but nothing good so far.

> Even aio_read() performs some work inline (namely enqueuing a read request).

Enqueuing the read request is not the same as actually copying gigabytes or
terabytes of data from disk where the actual work is performed.

~~~
tptacek
I think the DeveloperWorks article you've cited is simply wrong, and that's
sent you into a tailspin. Select and poll aren't "ways to implement
asynchronous blocking I/O".

I think you'll find these links more helpful than the misleading one you've
been using.

[http://www.circlemud.org/~jelson/software/fusd/docs/node36.h...](http://www.circlemud.org/~jelson/software/fusd/docs/node36.html)

<http://people.freebsd.org/~hmp/stuff/docs/freebsd_kse.pdf>

I could always be wrong about this stuff; maybe I'm the one with the broken
semantics. But I'm pretty sure I'm not.

------
ZeSmith
Reminds me of this: [http://teddziuba.com/2009/09/twisted-vs-tornado-youre-both.h...](http://teddziuba.com/2009/09/twisted-vs-tornado-youre-both.html)

------
maurycy
Personally, I would be more interested in understanding the code inside
Tornado, a column of air. :-)

