Switched from Django to Tornado a few months ago, and I've never looked back. In Tornado everything just seems to make sense and nothing gets in your way. I worked with Django for a couple of years and never felt this way.
It's not the non-blocking part that pulled me in, but the simple APIs that make sense. In Django, I always feel that I'm spending my precious time learning "the Django way"; in Tornado, once you learn the few basic things, you can focus on getting things done.
It's also wonderful that Tornado's codebase is still compact enough that you can easily understand and modify it, if it doesn't provide the functionality you need.
There's also Cyclone, which I've used with great success for some intranet sites. I like it very much, and it's definitely worth checking out IMHO:
Cyclone is a low-level network toolkit, which provides support for HTTP 1.1 in an API very similar to the one implemented by the Tornado web server - which was developed by FriendFeed and later released as open source / free software by Facebook.
Key differences between Cyclone and Tornado
* Cyclone is based on Twisted, hence it may be used as a webservice protocol for interconnection with any other protocol implemented in Twisted.
* Localization is based upon the standard Gettext instead of the CSV implementation in the original Tornado. Moreover, it supports pluralization exactly like Tornado does.
* It ships with an asynchronous HTTP client based on TwistedWeb; however, it's fully compatible with the one provided by Tornado, which is based on PyCurl. (The HTTP server code is NOT based on TwistedWeb, for several reasons.)
* Native support for XMLRPC and JsonRPC. (see the rpc demo)
* WebSocket protocol class is just like any other Twisted Protocol (i.e.: LineReceiver; see the websocket demo)
* Support for sending e-mail based on Twisted Mail, with authentication and TLS, plus an easy way to create plain text or HTML messages, and attachments. (see the e-mail demo)
* Built-in support for Redis, based on txredisapi. We usually need an in-memory caching server like memcache for web applications. However, we prefer redis over memcache because it supports more operations like pubsub, various data types like sets, hashes (python dict), and persistent storage. See the redis demo for details.
* Support for HTTP Authentication. See the authentication demo for details.
Advantages of being a Twisted Protocol
* Easy deployment of applications, using twistd.
* RDBMS support via twisted.enterprise.adbapi.
* NoSQL support for MongoDB (TxMongo) and Redis (TxRedisAPI).
* May combine much more functionality within the web server: sending e-mails, communicating with message brokers, etc.
Lots of differences. node.js is an async development framework. Tornado is a synchronous development framework that is slowly building up asynchronous alternatives where the need is great enough, until it eventually becomes Twisted or people just live with the things they can't do.
node.js is asynchronous from the ground up; Tornado is generally synchronous, with the exception of the HTTP client and server. So Tornado is less pure, but you can use any of the Python libraries as they are. node.js is pure, even disk I/O is asynchronous, but there are far fewer libraries.
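To make the "use any Python library as it is" trade-off concrete, here's a rough sketch. It uses the standard library's asyncio as a stand-in event loop (an assumption on my part, it is not Tornado's own API), and `blocking_library_call` is a made-up placeholder for any ordinary synchronous library. Called directly inside the loop, such a function would stall every other connection; handing it to a thread pool is one common workaround.

```python
import asyncio
import time


def blocking_library_call(seconds):
    """Hypothetical stand-in for an ordinary synchronous library call."""
    time.sleep(seconds)  # blocks the calling thread, like a sync DB driver would
    return seconds


async def main():
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    # Run three blocking calls in the default thread pool so the event
    # loop itself stays free to service other connections meanwhile.
    results = await asyncio.gather(
        *(loop.run_in_executor(None, blocking_library_call, 0.2) for _ in range(3))
    )
    return results, time.perf_counter() - start


def run_demo():
    return asyncio.run(main())
```

The three calls overlap because each one sleeps in its own worker thread, so the total wait is roughly one call's worth rather than three. The price is that you're back to paying for threads, just fewer of them.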
Tornado also wraps the Facebook API and ships authentication helpers for Facebook, Google, Twitter and others.
I see references to fixing problems on Windows, but there's no mention of Windows support or installation, nor does the hello-world example run (I get the same failed import on the fcntl module that I always have, even after upgrading).
Tornado is a high-performance non-blocking web server. Because it's non-blocking, it can handle thousands of simultaneous connections (see the C10k problem) and is well suited to long-polling-style applications (in fact, it includes a chat application in the demos).
Cool. I think what threw me a little was the term "Non-blocking".
After some basic research, it seems that means communication between client and server happens asynchronously? Apache and other blocking web servers must have a start and end for every file or stream, and have to create a new instance (or fork the process?) to serve another client. Tornado does not require a connection with one client to complete before it can connect with a new client.
Traditional web servers like Apache 2 create processes to handle web applications that are developed in, say, Ruby or Python (e.g. Phusion Passenger, mod_wsgi). From what I understand (correct me if I'm wrong), these processes incur a large overhead from having to load large numbers of libraries, so memory usage is quite wasteful.
Shared libraries are just that: shared. They get memory-mapped, and the code is shared between processes. When a program has many threads open, something similar happens: the code is shared between the threads, but they get different stacks. Those stacks are a problem if you're spawning tens of thousands of threads, one for each connection you have open concurrently. You may find yourself running out of memory, or at least using way too much.
What Tornado does instead is serve a large number of connections from a single thread, using epoll to block on a group of sockets until one or more are ready for I/O. This has much lower memory overhead, and does not require switching among threads.
This kind of I/O model is at its best when you have a huge number of connections, idle most of the time, and there are very few CPU-intensive operations. Long polling usually fits that description.
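Here's a minimal sketch of that single-threaded model, using only the standard library's `selectors` module (which picks epoll on Linux). It is not Tornado's actual IOLoop, just the same idea in miniature; `serve_echo` and the upper-casing "protocol" are invented for illustration, and socket pairs stand in for real client connections.

```python
import selectors
import socket


def serve_echo(num_clients=50):
    """Serve num_clients connections from one thread with one selector."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    clients = []
    for i in range(num_clients):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ, data=i)
        clients.append(client_side)

    # Every client sends one message; none of them gets its own thread.
    for i, client in enumerate(clients):
        client.sendall(b"ping %d" % i)

    served = 0
    while served < num_clients:
        # Block until at least one registered socket is readable.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())  # reply, then finish this connection
            sel.unregister(conn)
            conn.close()
            served += 1

    replies = [client.recv(1024) for client in clients]
    for client in clients:
        client.close()
    sel.close()
    return replies
```

Calling `serve_echo(3)` returns one upper-cased reply per connection. The only per-connection cost is a socket and a selector registration, with no stack per client, which is why idle long-polling connections are cheap in this model.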
IIRC, another thing which keeps processes open is database access.
I'm not sure, but I think that with Apache + Django, you get something like: Apache gets request, gives it to Django, Django asks MySQL for a value, waits, waits, waits (taking up lots of memory, but not too much CPU, but your host charges for RAM), waits a little more, gets a value from MySQL, puts it in a template, hands the response to Apache, which then sends the response to the client.
Event driven development is the worst kind of programming because of the extensive use of callbacks and state maintenance/tracking. You never know the control flow by reading the source.
You have to maintain state all the time and keep track of inconsistent/impossible states.
That to me is not cleaner or easier.
I work in one of the big corps, and it never fails: event-driven servers tend to be way more buggy.
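For anyone who hasn't written this style of code, here's a small sketch of what the state-tracking complaint looks like in practice. `CallbackLineParser` and `parse_chunks` are hypothetical names, not from any library; the point is only that the half-received line has to live in an object field between callbacks, because no function's local variables survive from one chunk to the next.

```python
class CallbackLineParser:
    """Callback-driven line parser: state must survive between callbacks."""

    def __init__(self, on_line):
        self._buffer = b""       # partial line carried over between chunks
        self._on_line = on_line  # user callback, fired once per complete line

    def data_received(self, chunk):
        # Called by the event loop whenever some bytes arrive; a chunk may
        # hold half a line, exactly one line, or several lines at once.
        self._buffer += chunk
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            self._on_line(line)


def parse_chunks(chunks):
    """Feed chunks through the parser and collect the completed lines."""
    lines = []
    parser = CallbackLineParser(lines.append)
    for chunk in chunks:
        parser.data_received(chunk)
    return lines
```

`parse_chunks([b"hel", b"lo\nwor", b"ld\n"])` gives `[b"hello", b"world"]`. Reading `data_received` alone doesn't tell you when it runs or in what order relative to other callbacks, which is the control-flow problem described above.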
Python out of the box doesn't have coroutines but you can always use gevent for that.
Python threads are real pthreads. They are not optimal and should not be used in high-load servers, because pthreads are not that lightweight when you are running 10k of them. Servers built on Python threads are as clean as coroutine-based ones, but slower. Hence, coroutines are always preferable to them.
That's interesting. The conventional wisdom I've seen and read is that multi-threaded servers are generally more buggy and harder to reason about than using a single select loop with callbacks. Callbacks are more difficult to maintain state with, but they don't have race conditions and locking issues.
I agree that if you can get coroutines in your language, then perhaps that's the ideal path to take.
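As a rough illustration of the coroutine route: newer Python versions (3.5 and later) ship `async`/`await` coroutines in the standard library, so the sketch below uses asyncio rather than gevent. That is my substitution, not something from this thread. `handle_connection` is a hypothetical handler, and the sleep stands in for waiting on a slow client.

```python
import asyncio
import threading


async def handle_connection(conn_id, seen_threads):
    """Hypothetical per-connection handler written as straight-line code."""
    seen_threads.add(threading.get_ident())  # record which OS thread ran us
    await asyncio.sleep(0.01)  # stand-in for waiting on a slow client
    return conn_id


async def serve_many(n):
    seen_threads = set()
    results = await asyncio.gather(
        *(handle_connection(i, seen_threads) for i in range(n))
    )
    return len(results), len(seen_threads)


def run_many(n=10_000):
    """Returns (connections handled, number of distinct OS threads used)."""
    return asyncio.run(serve_many(n))
```

`run_many(2000)` returns `(2000, 1)`: every handler ran on the same OS thread. The handler reads top to bottom like threaded code, yet only one coroutine runs at a time and switches happen only at `await`, so there is no per-connection thread stack and far less need for locking.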