
Real-time applications and will Django adapt? - pramodliv1
http://arunrocks.com/real-time-applications-and-will-django-adapt-to-it/
======
nostrademons
(My background: I work for Google, I did a real-time web prototype using the
client libraries for GChat back in 2009 when real-time search was all the
rage, my Noogler mentor at Google was the frontend tech lead for the eventual
real-time search product we launched, and before Google I'd worked in
financial software, where real-time responsiveness really is required.)

I think that the folks currently building prototypes in Meteor dramatically
underestimate the difficulty of scaling up real-time software to production-
grade quality.

The problem is that if a _single_ component in your stack blocks, you are no
longer real-time. Any time one client writes into the database and another
reads it, you have to poll, since the DB won't give you notifications.
(Exceptions: Postgres gives you PQnotifies, Oracle gives you the User Messaging
Service, with MySQL it's theoretically possible using triggers and user-defined
stored procedures that make a network call, and with MongoDB you can break the
DB abstraction and tail the oplog. Good luck plumbing any of these up through
your language's DB driver and ORM, though.) If you have business logic in a
middle-tier server that's request-response only, then that logic becomes a
synchronization bottleneck, and you have to constantly update that server and
poll it with requests. If your algorithms require complete state snapshots,
you're out of luck unless you build a service to manage and update that state
consistently while triggering the algorithms whenever it changes. If your
algorithms can't run in soft-realtime time guarantees (dozens to hundreds of
milliseconds, usually), you're still out of luck. You need to figure out
sharding of state and message notifications yourself. You need to figure out
message recovery protocols - most real-time systems have odd consistency
problems when messages get dropped due to overload, network failures, or
software errors.
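
As a rough illustration of the message recovery point, here is a minimal sketch that assumes every message carries a per-channel sequence number and that the producer can replay a range on request. The names `Receiver`, `on_message`, and `fetch_missing` are hypothetical, not from any real library.

```python
from typing import Callable, Dict, List


class Receiver:
    """Delivers messages in order and detects gaps by sequence number.

    Hypothetical illustration: `fetch_missing(first, last)` stands in for
    whatever replay API the producer exposes.
    """

    def __init__(self, fetch_missing: Callable[[int, int], Dict[int, str]]):
        self.fetch_missing = fetch_missing
        self.next_seq = 1          # next sequence number we expect
        self.delivered: List[str] = []

    def on_message(self, seq: int, payload: str) -> None:
        if seq < self.next_seq:
            return                 # duplicate or already-recovered message
        if seq > self.next_seq:
            # Gap: one or more messages were dropped; ask for a replay.
            missing = self.fetch_missing(self.next_seq, seq - 1)
            for s in range(self.next_seq, seq):
                self.delivered.append(missing[s])
        self.delivered.append(payload)
        self.next_seq = seq + 1


# Usage: the producer keeps a log it can replay from.
log = {1: "a", 2: "b", 3: "c", 4: "d"}
receiver = Receiver(lambda first, last: {s: log[s] for s in range(first, last + 1)})
receiver.on_message(1, "a")
receiver.on_message(4, "d")   # 2 and 3 were dropped in transit
receiver.on_message(3, "c")   # late duplicate, ignored
```

Even this toy version needs the producer to retain a replayable log, which is exactly the kind of state you end up having to shard and manage yourself.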

Google's real-time search ended up polling every 15 seconds with simple AJAX
calls, because when the lag for a post to go through the indexing & serving
pipeline is a minute or two (itself a major accomplishment), an additional 15
seconds isn't going to be noticeable to the user.

People on HN love to hate on Twitter engineering, but one thing they've done
really well is scale a system that actually is soft real-time and has a lot of
potential producers and consumers. This is far from the trivial exercise that
someone who picked up Meteor in a weekend might think it is.

~~~
siliconc0w
Django's event system lets you avoid polling the DB.

Here is a plugin that offers some real-time capability (through a separate
process of course): [http://telegraphy.readthedocs.org/en/latest/django-
telegraph...](http://telegraphy.readthedocs.org/en/latest/django-
telegraphy.html)

Seems to be the best thing on offer that is off the shelf, but you may be
better off rolling your own node, websocket, redis pub/sub, django integration
yourself (which actually isn't that hard to do and may give you better
flexibility).

~~~
est
> Django's event system lets you avoid polling the DB.

Django's signal system must be run on the _same instance_, synchronously,
before returning any response to client. This means if a signal blocks, the
response will never reach the client.

It's technically impossible to handle post-response events in WSGI.

[http://dirtsimple.org/2011/07/wsgi-is-dead-long-live-wsgi-
li...](http://dirtsimple.org/2011/07/wsgi-is-dead-long-live-wsgi-lite.html)

For example, user requests a blog post, you can't first return response then
increase your view counter by 1. You just can't. Unless you use some container
specific hooks (like uWSGI, Tornado, etc.)

~~~
natrius
Django's signals API is synchronous, but writing a wrapper to process signals
as asynchronous tasks is relatively straightforward.

[https://github.com/nyergler/async-signals](https://github.com/nyergler/async-
signals)
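
A minimal sketch of the idea using only the standard library. This is not Django's signals API or the async-signals package; `AsyncSignal`, `connect`, `send`, and `wait` are hypothetical names. `send` only enqueues, and a worker thread runs the receivers, so a slow receiver no longer delays the response.

```python
import queue
import threading


class AsyncSignal:
    """Toy signal whose receivers run on a background worker thread."""

    def __init__(self):
        self._receivers = []
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, **kwargs):
        # Returns immediately; receivers are invoked later by the worker.
        self._queue.put(kwargs)

    def _run(self):
        while True:
            kwargs = self._queue.get()
            for receiver in list(self._receivers):
                try:
                    receiver(**kwargs)
                except Exception:
                    pass  # a real system would log and possibly retry
            self._queue.task_done()

    def wait(self):
        """Block until all queued events are processed (useful in tests)."""
        self._queue.join()


view_counts = {}


def count_view(post_id):
    view_counts[post_id] = view_counts.get(post_id, 0) + 1


post_viewed = AsyncSignal()
post_viewed.connect(count_view)

post_viewed.send(post_id=42)   # a request handler would return right after this
post_viewed.send(post_id=42)
post_viewed.wait()
```

Queued events here are lost if the process dies, which is why you'd hand them to an out-of-process task queue such as Celery in practice.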

~~~
est
> Async Signals uses Celery_ for signal routing

Since Celery is just a bunch of worker processes, why can't we build a web
framework which natively supports both web worker and bg task worker?

~~~
natrius
Sounds good to me.

------
programminggeek
There seems to be a meme going around that things like Rails or Django need to
somehow change and react to single page javascript web apps.

Maybe it's just me, but trying to modify your favorite web app framework to
accommodate something they were never designed to do in the first place is
foolish and will end up ruining what was originally great about tools like
Django in the first place.

Just because a hammer is a popular tool that you really like doesn't mean it
needs to change into a ladder when you decide you need to climb onto a roof.

~~~
thatthatis
Rails/django are built to build websites. Websites are changing towards being
JavaScript in the client single page apps. Thus either django/rails changes or
gets removed.

I'm currently architecting a new app, and my django layer is still crucial
for: API access to the data, Auth & Auth, background processing.

What we are trying to do as website builders has changed, and thus we are at a
turning point. It isn't obvious yet what the go-to stack of the future is going
to look like - is it django + tastypie + angular, or rails + ember, or meteor
or something else?

Django was great for the old way of doing things (static or Ajax enhanced
web). But it's not clear what its role should be in the future.

To use your analogy: this is people trying to figure out if they still need
hammers now that we're starting to use screws as fasteners.

~~~
Grue3
Another option is that websites that rely entirely on Javascript get removed.
I prefer this one.

~~~
thatthatis
I just don't see that as a likely future.

------
technel
I don't see the value proposition of making (most) web apps/sites real-time.
Sure, it makes sense for a chat app or a stock ticker, but blogging? A news
site? E-commerce?

Maybe it's important that eBay is "real time" in the last 5 minutes of an
auction, but the rest of the time, the vast majority of the content is
relatively static. A seller might update the description of a listing a couple
times over a two week auction, for example. And while it sounds great to
immediately update my search results when a new listing goes live, in reality,
I already have 40 pages of results to look through, and that listing that just
went live 5 seconds ago probably isn't much more relevant than any of the
others I'm sifting through.

I'm not opposed to client-heavy apps where it makes sense. When done well, it
can create a really responsive user experience. Gmail is great at this; I have
no desire for it to be "real time" -- not any more than it already is.

Do we really believe that one day cnn.com will be "real-time", with article
updates and errata popping up inline as we read?

~~~
gojomo
It's not that everything must be real-time. But, the stuff that doesn't need
it has already been well-done for over a decade. The frontier of new
possibilities (including as incremental enhancement to the old categories)
tends to involve what's enabled by real-time.

For example, sprinkling in a little real-time surprise – like a notification
that others have already responded to your recent work – can accelerate
valuable interactions.

For example, in 'blogging' and 'news', both the original authors and active
commenters appreciate no-reload indications of fresh comments, mentions, and
inlinks. You can do a site without that – but you'll be missing out on
features that users increasingly expect, and work to create new interesting
content and engagement.

In 'e-commerce', a client-pulled site works and is well-understood, but adding
live sales help, or indicators of limited deals being exhausted, can help
close sales... so why not try it?

Even where the major cores of these markets work fine without real-time, the
frontier of exploration and optimization uses greater game-like liveliness.

------
al2o3cr
Maybe it's just me, but I find the simultaneous popularity of "only check your
email 4 times a day" and "OMG ALL WEB APPZ MUST BE REALTIME" slightly
peculiar.

~~~
marcosdumay
To tell you the truth, I don't normally use real time web apps at all. But I
have an urge to turn what I write into real time apps, and no good
explanation why; it just feels like they become much easier to use.

Maybe I (and everybody else) only have the wrong impression. It happens, and I
don't have enough data to conclude anything.

------
RussianCow
Python in general doesn't really have a good solution for this, so it's not
something specific to Django. I run a Python web app that has certain real-
time needs, and I had to forgo a popular web framework like Django so that I
could use Twisted. The problem with solutions like this is that since the
language doesn't have built-in support for asynchronous IO, everything has to
be compatible with the library of your choice (whether that's Twisted, Gevent,
or other), and at that point, you'd be better off just using a different
language/runtime like Node.js or Erlang.

I think the current solution is to have Django serve the main app and have a
separate "API server" that runs Node or whatever, but as the article points
out, you're not really even using Django at that point because all it's doing
is serving up a single HTML page--the rest is handled by the browser and the
API server.

~~~
nostrademons
Python 3.4 may help a lot with the language mechanisms (with asyncio,
pluggable event loops, and composable generators everywhere), but there's
still the issue of getting library support to use all of that.

Node.js isn't actually better - it uses the callback model of async
programming, which should be familiar to any C++ programmer who's been writing
servers since the 80s, both because it's the current best solution for writing
scalable event-driven servers and because it sucks.

For ease of programming a CSP-based language like Go or Erlang is really the
way to go, but then you're back to the "lack of library support" problem that
you'd get with Python 3.4, except worse because Python at least has libraries
for the synchronous part of the computation.
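
To make the asyncio point concrete, here is a small sketch. It assumes the `async`/`await` syntax and `asyncio.run` from later Python versions (3.7+); Python 3.4 itself spelled this with `yield from` generators. `fetch` is a hypothetical stand-in for a network call.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    """Hypothetical stand-in for a non-blocking network call."""
    await asyncio.sleep(delay)
    return f"{name}-result"


async def handler() -> list:
    # Sequential-looking code, no nested callbacks: each await yields
    # control to the event loop while waiting.
    user = await fetch("user", 0.01)
    # Independent calls can still run concurrently.
    posts, likes = await asyncio.gather(
        fetch("posts", 0.01),
        fetch("likes", 0.01),
    )
    return [user, posts, likes]


results = asyncio.run(handler())
```

The catch is the one already mentioned: every library called inside `handler` has to cooperate with the event loop, or a single blocking call stalls everything.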

~~~
Offler
Good news: you can now use ES6 generators in Node.js, and when combined with
another ES6 feature, Promises, they can result in much nicer async code, e.g.
[http://taskjs.org/](http://taskjs.org/)

ES6 is looking very nice indeed, great thing is that with Node you won't have
to worry about old browser support.

~~~
est
I never understand the `function *(){}` syntax. Why can't we simply introduce
a new keyword like `generator (){}`?

~~~
netghost
Or just recognize that the function contains the yield keyword. That said,
`function*` isn't all that bad given what you get in return.

------
Kiro
I think very few sites actually need to be SPA at all. Just because an
e-commerce site has a real-time component doesn't mean it must be built in
Meteor.

E-commerce sites are in fact a prime example of something that I think should
be built using traditional technologies. Do you want price updates? Just poll
them with AJAX and let the rest of the site remain static. It's far from a
multiplayer game we're talking about.
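
A sketch of what the server side of such a polling endpoint might look like, assuming the client remembers the last version it saw and sends it back on each poll. `PriceFeed`, `set_price`, and `poll` are hypothetical names, and the in-memory dicts stand in for a real data store.

```python
class PriceFeed:
    """Server-side state for a cheap polling endpoint (hypothetical)."""

    def __init__(self):
        self.version = 0
        self.prices = {}
        self._changed_at = {}   # product id -> version of its last change

    def set_price(self, product_id: str, price: float) -> None:
        self.version += 1
        self.prices[product_id] = price
        self._changed_at[product_id] = self.version

    def poll(self, since: int) -> dict:
        """Return only prices changed after the client's last seen version."""
        changed = {
            pid: self.prices[pid]
            for pid, v in self._changed_at.items()
            if v > since
        }
        return {"version": self.version, "prices": changed}


feed = PriceFeed()
feed.set_price("book", 12.0)
feed.set_price("lamp", 30.0)
first = feed.poll(since=0)                   # initial load gets everything
feed.set_price("lamp", 25.0)
second = feed.poll(since=first["version"])   # later poll gets only the change
idle = feed.poll(since=second["version"])    # nothing changed: empty payload
```

Most polls come back empty, so the responses stay tiny and the rest of the page never has to know about it.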

------
secstate
I don't think I really understand the limitations we're talking about. No you
wouldn't ever want to write an app that had real-time elements in pure Django,
but isn't that what Celery is for? I bet with a solid messaging queue and good
architecture you could write a pretty convincing real-time app using Django as
not much more than a REST api to celery tasks and the database (and really,
that abstraction is what a framework is for anyway).

Besides, this sky-is-falling nonsense around frameworks is getting old. A
framework either lives or dies. Django has a very healthy community around it
and they are doing a great job right now of keeping the framework stable so
folks who "just need to get work done" can get work done. There haven't been a
lot of revolutions, and that's fine for me. Believe it or not, there's still a
market for content-heavy, traditional MVC websites. And when you need to add
real-time elements, Django, Celery and Django REST Framework are up to the
task a vast majority of the time.

------
sheng
Another real time application issue that rarely gets any attention is WebRTC.
I wish people would start tackling these issues for python/django, too. As of
writing this I don't know about any library that would allow me to write a
server application in python that would serve as a peer in a WebRTC session.
The benefit would be unreliable real time data channels to the server. This
can be of great use for games. Of course there are many different use cases.

------
falcolas
Aside from an inability to run websockets on Django, I've been running "real
time" websites for quite some time. AJAX calls are dirt simple to handle with
your typical Django setup.

Scaling and blocking are handled pretty easily by running Django on FCGI using
Flup and a Nginx frontend. No blocking problems since they're running in
processes and threads, redis for caching and pub/sub, and a database for the
backend. Works a charm.

Now then, this isn't a high volume site, getting only in the medium hundreds
of requests per minute, but it's been working without problems on a small AWS
instance. DB backups take more CPU than Django ever has.

Websockets, on the other hand, took me over to Go. Certainly not giving up
Django for the rest of the site, however, until it really can't handle the
load anymore.

~~~
rbanffy
Wouldn't a websocket middleware solve the issue? Client starts a socket and
passes the id through HTTP to the Django app. When something happens in
Django, the event is piped through the previously created socket and the
problem is neatly solved. Could even have some sophisticated publish/subscribe
mechanics in here.

~~~
falcolas
There is a bit of middleware out there which enables websockets, but:

1) Doesn't currently work with Django 1.6+

2) Websockets and WSGI don't mix well. It's possible to capture the socket and
use it further up the stack, but it requires some really nasty hacks.

3) Requires you to use a custom version of runserver, which allows the raw
socket to be passed up into the handler code.

Not worth it to me. I'm sure the code could be made to work again, but then
you lose a lot of the benefits of running it behind something like fcgi
(since you need access to the socket for the persistent two way
communication).

~~~
rbanffy
I wasn't thinking about Django middleware but a more generic message routing
engine running as a separate process that speaks websockets on one end and has
a bi-directional REST interface on the other exchanging messages with the
Django backend.

edit: karneges points out Hookbox and Pushpin can fill this role.

~~~
falcolas
Interesting idea, but you still end up with something performing polling
against the backend, if you want the ability to send messages to the client
without receiving a request first (the real strength of websockets).

~~~
jkarneges
Well, you always need the client to reach out first, since it is not otherwise
addressable. ;) But this doesn't mean it has to poll. With a separate gateway
that supports Websockets or Server Sent Events, it should be enough for the
client to make an initial request to bootstrap the connection, and then the
server can send as many messages as it wants downstream.
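
For Server-Sent Events, the downstream side is just a long-lived HTTP response with content type `text/event-stream`, where each message is a few text fields followed by a blank line. A sketch of the framing, with `sse_frame` and `event_stream` as hypothetical helper names rather than part of any gateway:

```python
import json
from typing import Iterable, Iterator, Optional


def sse_frame(data: str, event: Optional[str] = None,
              event_id: Optional[str] = None) -> str:
    """Format one Server-Sent Events message."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads need one "data:" field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"   # blank line terminates the message


def event_stream(messages: Iterable[dict]) -> Iterator[str]:
    """Yield frames for a streaming response body."""
    for i, message in enumerate(messages, start=1):
        yield sse_frame(json.dumps(message), event="update", event_id=str(i))


frames = list(event_stream([{"price": 10}, {"price": 11}]))
```

The `id` field lets a reconnecting browser send a `Last-Event-ID` header, so the server can resume from where the client left off.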

~~~
falcolas
> it should be enough for the client to make an initial request to bootstrap
> the connection, and then the server can send as many messages as it wants
> downstream.

Yes, that's the point of websockets; but what was discussed here is a program
acting as a bridge between django and websockets using http to speak to
django. http does not support such asynchronous communication. You get one
response for one request. How can a backend send a new response if there is no
open request from the intermediary?

Sure, you can hack http by refusing to close the stream of a response and
sending data intermittently (well, if you're using a django->apache/nginx
protocol that supports streaming responses), but then you're no longer
speaking http; you're speaking your own protocol over http.

Sure, you can have your intermediary poll django, which reduces some of the
overhead since you're bypassing the external network stack, but you're still
relying on only sending messages every $poll_interval.

Sure, you can create some secondary process through manage.py that runs and
communicates with the intermediary directly, but then you're no longer
speaking http to Django.

If you want async communication between your client and your server using
websockets, you can't rely on speaking http to anybody; it just isn't
compatible with truly asynchronous websocket communication.

~~~
rbanffy
The HTTP side of the middleware server should be able to make and receive HTTP
requests. If something causes an event in Django, it can make a request to the
HTTP server side of the middleware and send a message to all listening
websocket clients.

~~~
falcolas
Aaah, I think my confusion was from your misuse of the term middleware. When
used in the context of Django, middleware is a layer in a stack of WSGI calls,
not a standalone daemon which accepts and receives http posts and websockets.

I could see such a standalone daemon working, but it would seem more
straightforward to just write a daemon which handles websockets and your
application logic on its own.

------
rartichoke
I don't know about Django but rails has the idea of "live controllers".

Sure it uses polling but didn't you watch DHH's railsconf presentation? They
have 5-6 workers and a single redis server which sustains 100k+ reqs per
minute.

It also only took DHH 4 hours to convert the entire basecamp project to be
live (i.e. live-updating comments as they come in).

Sure it's not really live since the polling is only happening every few
seconds but who cares? Even for most chat systems it's completely reasonable
to do polling, most certainly if it's 1:1 chat.

Also look at Disqus. They are mostly all django, they even use postgres with a
schema. Their "real time comment system pusher" was written in Go in a week
with almost no prior knowledge of Go. I see nothing wrong with that and IMO
it's exactly what we should be doing.

Use Django/Rails for the bulk of your app, CRUD interfaces, etc. and then
create optimized services with Go or some other language for real-time
aspects.

[*] Everything I mentioned is documented online through talks, engineering
blogs, etc..

------
bayesianhorse
I have written "realtime" web applications in Django, using Tornado for
websockets (or their emulations). While a pure realtime non-blocking solution
might be able to squeeze out a lot more performance, it's certainly
possible.

Realtime web applications require a choreography of communication between
server and client, with an unpredictable user and network messing stuff up all
the time. Like much of web development it comes down to not going crazy.
Otherwise we would be writing web applications in C++ or Java, wouldn't we?

------
FZambia
I don't see anything wrong with a separate asynchronous server which handles
real-time for your Django site.

When a user generates an event on your site, you just handle it in the
traditional manner, i.e. POST via AJAX, validate, save if necessary, and then
publish to the asynchronous server, which broadcasts the event to all connected
clients. In this way you have a graceful fallback in case of async server
downtime, so your user doesn't even notice something went wrong. You are not
mixing things which were not developed to be mixed. In this case you are just
writing your site as usual and then add real-time elements where necessary.
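
Roughly, as a sketch: an in-memory `Hub` stands in for the separate async server and `save_comment` for the normal view logic. All names here are hypothetical, and a real setup would publish over HTTP or a message broker instead of a method call.

```python
class Hub:
    """In-memory stand-in for the separate real-time server."""

    def __init__(self):
        self.subscribers = []   # callables invoked for each broadcast
        self.available = True   # flip to False to simulate downtime

    def publish(self, event: dict) -> None:
        if not self.available:
            raise ConnectionError("real-time server is down")
        for deliver in list(self.subscribers):
            deliver(event)


database = []   # stand-in for the real database


def save_comment(hub: Hub, text: str) -> dict:
    """Handle a POST the traditional way, then publish best-effort."""
    if not text.strip():
        return {"ok": False, "error": "empty comment"}
    database.append(text)                      # the write always happens
    try:
        hub.publish({"type": "comment", "text": text})
        pushed = True
    except ConnectionError:
        pushed = False                         # clients see it on next page load
    return {"ok": True, "pushed": pushed}


hub = Hub()
received = []
hub.subscribers.append(received.append)

first = save_comment(hub, "hello")
hub.available = False
second = save_comment(hub, "still saved")
```

The second comment is still saved even though the push failed, which is the graceful fallback: the site keeps working and only the live update is lost.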

Using Gevent together with Django seems like monkey patching entire web site
to me.

I really respect the work of the guys developing uWSGI. But at the moment it
does not seem to be usable in a simple, obvious way. Maybe in future their real-time
support will become mature and convenient enough.

Of course, the Meteor/Derby-style approach solves the problem at another
level. But in the context of Django I don't think we should consider them as
examples. We use python, not javascript - we have no native solution for
the browser environment and I personally think we do not even need it.

------
Choronzon
The best way I found to do this for python is by using Tornado: you have an
excellent websocket implementation baked in and a scheduler within the
webserver itself, so it's simple to poll for changes and update only when
necessary, or interleave with a callback if you want "true" real time. Plug in
a front end with angular/knockout etc., pass around json objects and you are
good.

As far as meteor/node goes, having the same language on the server and client is
great. Having javascript as that language is not so great. Web apps are
generally a front end to something bigger and I never want to do any serious
data wrangling in javascript if I can avoid it.

------
jkarneges
You can use Pushpin in front of Django (or any web framework, whether event-
driven or not) to implement realtime features.

[http://blog.fanout.io/2013/04/09/an-http-reverse-proxy-
for-r...](http://blog.fanout.io/2013/04/09/an-http-reverse-proxy-for-
realtime/)

The thesis behind this architecture is that most realtime web applications can
be reduced to request/response and publish/subscribe messaging patterns.
Instead of looking at Django as a legacy framework, look at it as 50% of the
solution (read: request/response). Pushpin provides the rest.

~~~
leephillips
This looks quite interesting; thanks for sharing.

------
clubhi
I disagree that server side templates are no longer needed. Templates are
often reused for things like sending emails or exporting to PDF. Sure, you
could use a JavaScript server side template to do this.

------
glynjackson
The only issue I see with Django is websockets. Apart from that I have been
using Django to build 'real time' web apps for years (AJAX). Django does
server side very well, AngularJS does client side well, mix in django-angular
and I have most of what I need. For websockets: django-websocket-redis.

------
d0m
I had the same problems.. I love Django and I want to use it for my real-time
application but I just couldn't find a way to make it work. I've chosen to use
node/angular/firebase instead and I'm very happy with my choice so far.

~~~
scardine
I use Angular with Django-Rest-Framework and I love it.

~~~
d0m
Hey, I wanted to pick your brain about it but I can't find your e-mail or a
way to reach you. Would be awesome if you can contact me (phzbox at gmail)

------
dobbsbob
localbitcoins.com is using django. Start a trade and messages are real time
without needing to reload a page

~~~
mcantelon
Probably ajax, which is more resource intensive/laggy than WebSocket-y and
isn't bi-directional (you've got to poll with ajax if you want to push changes
from the server).

~~~
wcummings
Ajax is Good Enough for this use case (and probably a lot of others), imo.
Even polling every 10s is plenty for messaging on a site like localbitcoins.

