Tornado: FriendFeed's non-blocking Python web server is now open source (bret.appspot.com)
291 points by finiteloop 2848 days ago | 73 comments

This is more of a combination of a web server and a web framework, which is what I find fascinating about it. Twisted has had a web server forever, but despite a significant amount of experience in Twisted Land, I'd never write a full app using it! Add to that the (apparently) standalone low-level modules and we've got some seriously awesome tech that FriendFeed/Facebook have supplied for free.

Thanks a lot, fellas. Asynchronous networking programming in Python is somewhat of a bear so it's really nice to see a tool get released that improves it!

The chat demo seems to have turned into a tech discussion about the framework. Love it.

Awesome to see more Python webservers coming to light, and fast ones at that. One observation, though: anyone else notice that the bar chart is a bit misleading? It compares Tornado running as 4 processes behind nginx to Django behind Apache and CherryPy as a single Python process. As a coworker put it, "With 4 extra processes running, of course it will be 4 times as fast." I'm not opposed to this kind of comparison when the intent is to show how configurations can increase req/sec, but if the intent is to compare one Pythonic webserver to another, it seems a bit unfair. That said, assuming the "Tornado (1 single-threaded frontend)" measure is simply Tornado running as a single Python process, it is still plenty faster than CherryPy.

CherryPy is multi-threaded, so I am not sure what you are saying is correct. There are two types of servers: multi-threaded and multi-process. Tornado is multi-process. If your server is multi-threaded, it uses all of the cores without additional processes.

CherryPy did max out the CPU on all of the cores in the load test, so I think it was a fair test.

If you ran a single CherryPy instance, I don't see how that is possible. The Python global interpreter lock prevents a single process from making use of multiple cores.

No, it doesn't. I/O bound tasks can heavily use multiple cores just fine using threads.

If you're CPU bound in pure-python computations then the GIL (Global Interpreter Lock) will cause you to only make use of one core. This is intentional: threads are hard, processes are simple.

I know that, but most web application servers spend most of their time in I/O - socket, database, etc. Threads are not hard: wanton abuse of shared state, and concurrent modification of that shared state, is what's hard to get right. Treat a threaded app like a message-passing app and you have a much simpler life.
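The message-passing style described above takes nothing but the standard library: give each thread its own queue and never touch shared mutable state directly. A generic sketch, not anyone's production code (modern Python 3 shown; in 2009 the module was spelled Queue, and the squaring workload is invented for illustration):

```python
import threading
import queue

def worker(inbox, outbox):
    # The thread owns all of its state; the only sharing is via queues.
    while True:
        item = inbox.get()
        if item is None:        # sentinel: shut down cleanly
            break
        outbox.put(item * item)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(5):
    inbox.put(n)
inbox.put(None)
t.join()

results = sorted(outbox.get() for _ in range(5))
print(results)  # [0, 1, 4, 9, 16]
```

Because all communication goes through thread-safe queues, there is no lock discipline to get wrong - which is the whole point of the message-passing argument.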

I'm intimately familiar with the limitations of python threads - and right now I'm the maintainer of the multiprocessing module, which is the process-based "reply" to the stdlib threading module.

I wasn't trying to claim you don't understand python, simply that the only way to be CPU bound on multiple cores with a single python process (as is claimed) is to be running non-pure python code that unlocks the GIL before doing CPU intensive work.

<3 multiprocessing, <3 you.

Can Python use multiple CPUs/cores with threads? Short answer: no. Long answer: yes, for some things.

Even CPU-bound tasks can release the GIL to do processing (for non-Python-API tasks, such as C extensions).

Likewise, 4 extra processes do not mean 4 times as fast if you only have one or two cores on your machine. Check out the details at http://www.tornadoweb.org/documentation#performance for how we ran the test.

Apache/mod_wsgi creates multiple processes (or can). The issue is that Facebook didn't tell us what configuration settings they used. Did they limit the number of processes? Did they not have it start up enough at the start?

It's clear that Facebook wasn't using a single Apache/mod_wsgi process because that wouldn't get close to 2,000 requests per second. I'm sure they gave it the number of processes appropriate for the amount of RAM on the box - the issue is that every Apache process (of which I'm sure they have dozens if not hundreds) uses up memory.

It's hard to set that up properly, and that's what makes benchmarking so hard. Still, it's not hard to believe that with the weight of Apache in the process, there's going to be at least 5MB of overhead for every request. If you were saving that 5MB of RAM per client times the 2,223 requests per second, that's about 11GB worth of freed RAM per second. Let's say Django uses 10MB of RAM for a request: that's an extra 50% increase in capacity right there just by eliminating the 5MB Apache process overhead. And Apache tends to be on the heavier side (5MB is a somewhat low estimate), and if you can do the work asynchronously you're not tying up RAM while you wait on clients, so you do well.

Now, put Apache/mod_wsgi behind nginx (as a proxy) and you'll also see an increase in requests per second, because you won't be tying up all that RAM while a client is downloading; you only use the RAM while actually doing work in Python, freeing that memory to be used elsewhere. However, I'm guessing they ran this benchmark locally, so the network tying up Apache instances wasn't a factor.

Clearly, benchmarks are benchmarks, but it isn't as if Facebook and FriendFeed lack smart engineers, and they wouldn't be using their own server if Apache were significantly better. They run a site of massive scale and know how to benchmark things, since you need objective data to base decisions on rather than the religious arguments some get into.

I'd guess that, if one could get Django running under Tornado, it would perform similarly, and that it's more a case of Apache being the bottleneck/RAM hog.

But if you're going to put nginx in front of it, why bother with Apache at all?

Apache's only saving grace is that you can handle everything monolithically (albeit with mediocrity).

And the way they handle requests in a non-blocking way still has many benefits over Django in a real web application, especially if you have to make many calls to "other" HTTP servers to fetch data, which happens a lot in today's world.

The templating system they developed looks pretty sweet http://github.com/facebook/tornado/blob/master/tornado/templ....

Just played around with it a bit and it's basically what I want:

1) Simple and clear syntax (e.g. they use 'end' not endfor, endblock, etc)

2) Assign template variables to anything (including functions)

3) Don't over-restrict the author (e.g. they allow list comprehensions in if tags)

4) Block and extends statements.

Error handling seems to be a little clearer than in other template languages, though still not great:


Clearly it's not going to be for everyone but after slogging around with Mako for the last year (<%namespace:def /> tags anyone?) this is like a breath of fresh air.

Thanks! Yah, the reason we rolled our own is that all the rest were so restrictive to authors. We didn't want a template system telling us what we should and shouldn't do in templates, so ours is a very thin layer over something that basically translates directly to Python. It is actually one of my favorite parts of the system.
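The "thin layer that translates directly to Python" idea fits in a few lines. This is a toy sketch, not Tornado's actual template.py: it only handles {{ expr }}, {% for ... %}/{% if ... %}, and the single {% end %} keyword mentioned above, and every name in it is invented for illustration:

```python
import re

# {{ expression }} or {% statement %}
TOKEN = re.compile(r"{{\s*(.*?)\s*}}|{%\s*(.*?)\s*%}")

def compile_template(source):
    """Translate template source into Python, then exec it per render."""
    lines, indent, pos = [], 0, 0
    def emit(stmt):
        lines.append("    " * indent + stmt)
    for m in TOKEN.finditer(source):
        if source[pos:m.start()]:               # literal text between tags
            emit("_out.append(%r)" % source[pos:m.start()])
        expr, stmt = m.group(1), m.group(2)
        if expr is not None:
            emit("_out.append(str(%s))" % expr)  # {{ expr }}
        elif stmt == "end":
            indent -= 1      # one keyword closes for/if alike - no endfor/endblock
        else:
            emit(stmt + ":")                     # {% for ... %} / {% if ... %}
            indent += 1
        pos = m.end()
    if source[pos:]:
        emit("_out.append(%r)" % source[pos:])
    code = compile("\n".join(lines), "<template>", "exec")
    def render(**namespace):
        namespace["_out"] = []
        exec(code, namespace)   # the template *is* Python at this point
        return "".join(namespace["_out"])
    return render

render = compile_template("{% for x in items %}{{ x }},{% end %}")
print(render(items=[1, 2, 3]))  # 1,2,3,
```

Because the expressions are handed straight to Python, list comprehensions in {% if %} tags and callable template variables fall out for free - nothing in the compiler has to allow them.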

It seems like blocking calls to data sources (the database, memcache, etc) would screw with a lot of the benefits of running the framework asynchronously. If your Tornado processes freeze while making synchronous backend requests, you're gonna need a lot of them, which probably kills a lot of the value.

Now all we need are async data clients that tie into the Tornado event loop, and a clean way to yield control back and forth (perhaps with coroutines - http://www.python.org/dev/peps/pep-0342/). Unfortunately, I don't think there are any async Python mysql clients (PHP has one now - http://www.scribd.com/doc/7588165/mysqlnd-Asynchronous-Queri...)

A web app written like that would run like it's on fire. The event loop would just roll through everything asynchronously. You could even automatically batch together data calls across http requests to make things easier on the data tier.
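One way the PEP 342 coroutine idea could look, sketched with plain generators. Everything here - the handler, the dict-backed "backend", the verbs - is invented for illustration; a real version would suspend on non-blocking sockets instead of answering from a dict:

```python
# Fake backend standing in for an async database/memcache client.
BACKEND = {"user:1": "bret", "user:2": "paul"}

def handler():
    # A "web handler" written as a coroutine: each yield hands a request
    # descriptor to the event loop, which resumes us with the result.
    user = yield ("get", "user:1")
    friend = yield ("get", "user:2")
    yield ("respond", "%s follows %s" % (user, friend))

def run(coro):
    # A toy trampoline / event loop. In reality the loop would park the
    # coroutine until the socket for its pending request became readable.
    responses = []
    try:
        op = next(coro)
        while True:
            verb, arg = op
            if verb == "get":
                op = coro.send(BACKEND[arg])   # resume with the "I/O" result
            else:
                responses.append(arg)
                op = coro.send(None)
    except StopIteration:
        pass
    return responses

print(run(handler()))  # ['bret follows paul']
```

The handler reads top-to-bottom like blocking code, but control returns to the loop at every yield - which is exactly the "clean way to yield control back and forth" being asked for.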

The browser-webserver delay resulting from long polling is much larger than the frontend-backend delay. I think there is at least an order of magnitude difference there, hence an order of magnitude reduction in the amount of idle state kept on the server.

Yeah, I agree. For long-polling apps it seems like Tornado gives you a huge advantage.

For other web apps though, the frontend-backend delay is comparatively more important, because usually there's a load balancer or reverse proxy in charge of buffering up data arriving from distant/laggy browser connections and hitting the web application server with it all at once.

If you go with an evented architecture like this, you have to use it for pretty much all I/O or the whole thing falls apart. Lack of drivers and protocols has been a large barrier to adoption.

Who wants an (unofficial for now) Fedora RPM? [0] I just hacked this together in half an hour. Time to sleep now.

[0] http://mapleoin.fedorapeople.org/pkgs/tornado/tornado-0.1-1....

Definitely rough around the edges, but looks like a great foundation.

A few notes:

- template.py: "Error-reporting is currently... uh, interesting."
- URL mapping looks primitive
- no form helpers
- database.py looks decent, but is MySQL-only
- very nice to see the security considerations in signed cookies, auth, and CSRF protection

Overall, it looks like they've done the trickier parts of building a web framework and left it to the user/community to add the parts web developers use most frequently (form & url helpers, code organization, and orm).

I love Facebook's enlightened approach to open source. If only one of their open source projects takes off, it will benefit them tremendously.

I implemented the same idea in Ruby event machine: http://gist.github.com/184760

Obviously missing features (auth, nicks, scrolling) but that can all be added in a few mins.

How about turning this into a full framework? Or is there one already?

I'll brainstorm the idea. I have some ideas of how to turn this evented programming into a very natural flow using ruby 1.9.1's fibers.

I've been tinkering with continuation-passing web frameworks in Ruby, but haven't had time to really dive into making one. There is plenty of inspirational material out there, e.g. Seaside. If you get anywhere, I and probably others in the community would be interested to hear about it.


This would be extremely interesting, as this area is one major weak point for available Ruby frameworks.

What exactly does 'non-blocking' mean in the context of a webserver?

Holding the TCP connection open does not tie up all the resources of the request handling thread. That way a large number of inactive connections can stay open.

Thanks. So to clarify, this is just the same as what lighttpd phrases as a 'select()-/poll()-/epoll() based web server'?
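That is the readiness-notification idea lighttpd is describing: one loop asks the kernel which sockets are ready and only touches those. A bare-bones sketch of the select() half (the socketpairs stand in for accepted client connections; this is not Tornado code):

```python
import select
import socket

# One event loop watching two connections at once, no threads.
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()

b2.sendall(b"hello")  # only the second "client" has sent anything

# select() blocks until at least one watched socket is readable, and
# returns only the ready ones; the idle connection costs nothing beyond
# its entry in the watch list.
readable, _, _ = select.select([a1, a2], [], [])
data = readable[0].recv(1024)
print(readable == [a2], data)

for s in (a1, b1, a2, b2):
    s.close()
```

epoll (and kqueue on BSD) are the same pattern with a kernel-side interest list, which is what makes tens of thousands of idle connections cheap.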

It seems that the main advantage of this is that you have one thread managing many sockets. I am a bit surprised that blocking kernel threads would be so much slower, relatively. What causes the slowness? Context switching? Additional stack memory usage?

You guessed right. Context switching is expensive, and the default stack size is in the megabyte range. So if you want to have 10K connections open, that takes about 10GB of memory. You can of course shrink the stack size, but you have to measure your program's stack usage before doing that, and it is quite cumbersome.

With an event-driven architecture you only have to hold the per-connection session information in memory, which can be as low as 4K, therefore you are able to maintain (depending on the complexity of the protocol) n*100K connections.
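Spelling out the arithmetic above (the figures are the comment's ballpark estimates, not measurements):

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

connections = 10_000
thread_stack = 1 * MB    # default per-thread stack, megabyte range
event_state = 4 * KB     # per-connection session state in an event loop

threaded = connections * thread_stack   # stack reservations alone
evented = connections * event_state     # session state alone

print(round(threaded / GB, 1), "GB vs", evented // MB, "MB")
# roughly "10GB" of stacks vs tens of MB of session state
```

The factor between the two is simply thread_stack / event_state = 256, which is where the "n*100K connections" claim comes from.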

The only drawback is that you have to write and think about your whole program in an event-driven way; you write callbacks for each I/O and timer operation, and the control flow won't be clear when you read the program.

I believe this is one of the advantages of 64-bit architectures. That meg of stack size is just a reservation; it's not actually used unless your thread really needs it. On a 64-bit machine there is plenty of address space to allocate.

Yeah, but why would you waste it on stack space, when you can do an event-driven design and allocate the resources only when needed?

Anyway, I was talking about C, where the stack is allocated when the thread is created and you cannot resize it later, nor will it grow automatically.

With dynamic languages it is different, but don't expect to handle that many connections with them either.

It doesn't solve the context _switching_ cost.

This just isn't a substantial cost any more. With the latest kernels, thread per connection servers are competitive with event driven servers. This was not true a few years ago.

You're discounting the output buffer that you will need on a per-thread basis (unless you are serving from cache, in which case you can serve the data directly from the cache memory).

Typically an output buffer should be able to hold the complete response for a client; if you limit it to, say, 4K, you will be unable to process a request in one go.

I don't think threads are slower or less scalable anymore. The OS threading systems have improved a lot over the last four years.

At least if you use a language like Erlang, I suppose. (Although I think that wouldn't be OS threading, but still).

Erlang threads (processes, actually) are very different from OS threads. They are extremely lightweight and let you do this kind of event-driven networking with threaded code.

Non-blocking sockets. Calls to socket.read() can return no data. It's non-blocking because the program is not stuck waiting for data to be returned.

It also implies you don't need threads. Blocking IO means you need a thread per client, and threads have overhead.
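Concretely, here is what a non-blocking read looks like (modern Python shown; BlockingIOError is today's spelling of the old EWOULDBLOCK socket.error):

```python
import select
import socket

a, b = socket.socketpair()
a.setblocking(False)        # reads on 'a' now never wait

try:
    a.recv(1024)            # nothing has been sent yet
except BlockingIOError:
    status = "would block"  # no data, but we are free to do other work

b.sendall(b"ping")
select.select([a], [], [])  # wait until 'a' actually has data
data = a.recv(1024)
print(status, data)

a.close(); b.close()
```

The "would block" branch is the whole trick: instead of a thread sleeping inside recv(), the single-threaded server goes off and services other connections until the kernel says this one is ready.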

This looks great. I'm likely to look at it a little more in the future. I've been using Second Life's asynchronous coroutine library, eventlet. I've implemented most of my code with a good lightweight framework that fits: restish. On top of that, I use spawning to manage my server processes. I've seen these kinds of numbers with my own app tests.

I've been looking for a good framework to begin serious web development on, as someone coming from a decade-long career in desktop development (see the discussion at http://news.ycombinator.com/item?id=808191), and decided this release coincides nicely with my newly started quest, so I gave it a shot.

I realize Tornado isn't any different from the other Python frameworks with regards to coding style, etc., but the fact that this framework comes complete with a web server means I don't have to worry about that part of the equation, making developing Python-based webapps almost identical to developing a C++ library with one of the many HTML-based UI frontends :)

Having a great time playing around with it... almost done with a basic forum system built on Tornado + Storm (https://storm.canonical.com/).. I think I'm getting the hang of this whole web-development thing! :D

For a really different coding style, more like desktop development, take a look at Nagare (http://www.nagare.org), a continuation- and component-based web framework. And it also comes with an integrated HTTP server (and a FastCGI one) ;)

Does this implement all of the HTTP 1.0 and 1.1 protocol specs? I'm kinda confused by the code :S

It implements a lot of HTTP/1.1, but see http://www.tornadoweb.org/documentation#caveats-and-support. In practice, we run behind an nginx reverse proxy, so we assume there are missing areas. We recommend people run in a similar fashion in production. We did not optimize for protocol completeness given our production setup.

This is great for Python: a proven web framework that was stress-tested at FriendFeed. Thanks, FF team! I had been torn for my next Python server project between web2py, CherryPy and Django, and I think I just decided.

For the record: they tested web.py, not web2py. It is not the same thing; they are completely unrelated.

Oops, that is what I get for commenting while in crunch mode at work. *hangs head in shame*

This is great. Any idea if they will do the same for the FriendFeed datastore?

As I recall, FriendFeed uses MySQL as the "datastore" backend, but with a very different table layout. Check out this article: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

It's just MySQL.

I could use this in apps today if it used twisted instead of reinventing it.

How many deployed web apps really use Twisted? There are like 3 web packages in Twisted, most of which are really buggy, and as far as I could tell, barely used (even they acknowledge this, see http://twistedmatrix.com/trac/wiki/WebDevelopmentWithTwisted).

When we were developing this, we found that Twisted introduced as many problems as it solved in terms of incomplete features and bugs. The other protocols seem to have more attention than HTTP from what I could tell.

"The other protocols seem to have more attention than HTTP from what I could tell."

I'm not sure even that is true. When I used twisted to write the justin.tv chat server (which is essentially an irc server) I gave up on twisted's irc protocol implementation and wound up just using twisted as an I/O layer.

Yah, that seems to be Twisted's problem. I think projects that don't have a real site using them day-to-day end up in this state often - lots of incomplete implementations of lots of features.

"How many deployed web apps really use Twisted?"

This is sort of the point. twisted isn't a web framework. It's a networking framework. I work on a lot of really awesome networking projects that either don't have web interfaces (various xmpp services) or have ones that need work (buildbot).

Instead of filling an obviously missing hole in an existing framework, a new one was created that is missing all of the stuff that the other project has.

The twisted http client stuff is quite good. I used it along with a friend to build a tool that is a realtime (in the web sense) gateway between friendfeed and xmpp clients a couple days after the realtime API was released (before friendfeed launched their own). It works very well and I still use it today.

Unfortunately, if I wanted to build a tool that did similar stuff with a Tornado front-end, I'd have to write a new HTTP client that's compatible with Tornado's event loop (unless I'm mistaken).

Everything comes at a cost. Tornado demonstrably solved FriendFeed's problems quite well, but, as I stated above, I can't use it to solve my problems, because even if I rewrote my apps I'd lose all of the rest of Twisted upon which I'm relying. E.g. I can't just bolt it onto buildbot (which would be so ideal, as our web interface is suffering from lack of a framework, but the backend is really good).

Tornado ships with an async HTTP client as part of the core framework.

It does, but from reading the code, it looks like it's closer to Twisted's getPage functionality. That's OK for things like FriendFeed's realtime API, where you do long polls that return chunks of data and then process the data online and loop.

It's not OK for things like twitter's realtime APIs that provide infinite streams of data.

However, even in the case of friendfeed's API, I process the data incrementally. It's not uncommon for users of my service to receive data via xmpp before the http request has even completed.

I also use this technique for twitter's non-realtime APIs -- I use a pull-based SAX parser (that comes with twisted) to incrementally process the stream and collect and pre-sort the interesting parts. Then one of the callbacks I attach to the completion of the request (note one of: I have several that are reusable components) delivers the pre-sorted, pre-filtered results out via xmpp. I did this to reduce memory used in my daemon. It really helped.

So while it has one, it doesn't appear to be full-featured enough to work in my applications. And this is the sort of thing causing confusion here. Twisted has a great network stack and a ton of really awesome protocol implementations sitting on the shelf (my apps mix HTTP clients and servers, XMPP, CouchDB, DNS, finger, etc...). What it doesn't have is a decent web framework.

One of the twisted guys, however, said it should be possible to transplant the web framework part of tornado onto twisted, so that would be great for the rest of us.

What would be the advantage of using this over, say, lighttpd with one of the many existing python frameworks?

Does this web server offer any specific gains? Or is it just a question of personal taste?

At a first glance, this looks SERIOUSLY awesome!

Bret, quick question: does the framework support ESI-type tags to abstract out personalized info? Thanks!

I am not sure what ESI-type tags are. Mind sending me a link?

Suffice it to say, no, we do not support that :)

Edge Side Includes. Proxies like Squid and Varnish support it for inclusion of dynamic data in otherwise cacheable pages.


It seems like this would not be mandatory in Tornado, since it is an app server; from my reading I'm assuming all pages are dynamic.


ESI: http://www.akamai.com/html/support/esi.html

e.g. When you and I log into FF, simplifying greatly, that page would say "Welcome, Bret" or "Welcome, Prakash". If you could abstract away "Bret" and "Prakash", that page would be cacheable, and the DB hit is only for that particular fragment.

You would need to re-code the pages with ESI tags to do it. The other way would be to use JavaScript to do the same.

What do you think?

Is this similar to EventMachine in Ruby?

EM is a general-purpose library for nonblocking IO. Tornado is an HTTP server. In as much as they're both about nonblocking IO, they are similar.

Tornado has its own EventMachine-like equivalent as part of the stack they released. So you could use it to do non-blocking TCP IO, if I grokked the code correctly.

I'd say it's a little more similar to Thin, which is a web server based on EM.
