
Guido van Rossum Deconstructing Twisted's Deferreds - thezilch
https://groups.google.com/d/topic/python-tulip/ut4vTG-08k8/discussion
======
parennoob
I don't understand any of the ideas alluded to therein, but the tone of the
discourse in there and the thread that led to this
([https://groups.google.com/forum/#!msg/python-tulip/EgpBV5-sI...](https://groups.google.com/forum/#!msg/python-tulip/EgpBV5-sIQ4/hcTQwMmKFOUJ))
is very respectful, and people apologize if
their tone was too snarky or contemptuous.

I like this a lot about python, and it makes me excited and eager to perhaps
contribute a little to its development some day (I'm currently at the level
where I have a decent idea about the OOP aspects of python, and know some
tricks that I think are nifty, such as overloading __setattr__ and the like.
So quite a long way to go, but hopefully I will get there some day).

~~~
takluyver
You probably don't have as far to go as you think - the standard library is
largely written in straightforward Python. The limiting factor is more finding
a bug that no-one else is already fixing.

~~~
chrismonsanto
Could fix this one I submitted 3 years ago:
[http://bugs.python.org/issue9226](http://bugs.python.org/issue9226)

------
aidos
I've only looked at Twisted briefly before but I feel like I learnt a lot from
this run-through.

Guido writes so well; he is able to thoughtfully deconstruct ideas with a
critical eye while still praising the bits he likes. I think we could all
learn something from the way he has approached this. It's definitely a model I
will aspire to follow while appraising unfamiliar code.

------
thezilch
If you happen to not make it through the thread to the following link or would
prefer to join Guido and others in discussing further edits to the post, before he
makes it into a blog or a contribution to Twisted docs, the following is a
Google Doc with several comment threads about quotes throughout the article:
[https://docs.google.com/document/d/10WOZgLQaYNpOrag-eTbUm-JU...](https://docs.google.com/document/d/10WOZgLQaYNpOrag-eTbUm-JUCCfdyfravZ4qSOQPg1M/edit)

------
thristian
I first came across Twisted Python and Deferreds years ago, and it was my
first introduction to the world of asynchronous programming. It was a little
bit difficult to wrap my head around the intricate braid of callbacks and
errbacks, but once I realised they mapped directly onto sequential statements
and try/except in synchronous code, I got a lot done.
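
To make that mapping concrete, here is a minimal sketch. `MiniDeferred` is a hypothetical toy class written for this illustration, not Twisted's implementation, and the helper names (`parse`, `double`, `recover`) are made up. Each callback plays the role of one sequential statement and the errback plays the role of the `except` clause.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred (hypothetical, not Twisted's code)."""

    def __init__(self):
        self._pairs = []       # (callback, errback) pairs, run in order
        self._fired = False
        self.result = None     # current result, or an Exception instance

    def add_callbacks(self, callback, errback=None):
        self._pairs.append((callback, errback))
        if self._fired:
            self._run()
        return self

    def fire(self, result):
        self._fired = True
        self.result = result
        self._run()

    def _run(self):
        while self._pairs:
            callback, errback = self._pairs.pop(0)
            failed = isinstance(self.result, Exception)
            handler = errback if failed else callback
            if handler is None:
                continue       # nothing registered for this state: pass it along
            try:
                self.result = handler(self.result)
            except Exception as exc:
                self.result = exc   # switch to the error path


def parse(text):
    return int(text)

def double(n):
    return n * 2

def recover(exc):
    return -1


def sync_version(text):
    try:
        value = parse(text)
        value = double(value)
    except ValueError:
        value = -1
    return value


def deferred_version(text):
    d = MiniDeferred()
    d.add_callbacks(parse)            # first statement in the try block
    d.add_callbacks(double)           # second statement
    d.add_callbacks(None, recover)    # the except clause (catches any Exception here)
    d.fire(text)
    return d.result


print(sync_version("21"), deferred_version("21"))      # 42 42
print(sync_version("oops"), deferred_version("oops"))  # -1 -1
```

Note that each callback receives the previous callback's return value, which is the composability being praised here.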

In the intervening years, I've seen a lot of projects re-implementing
asynchronous programming (A perfectly valid idea, Twisted certainly has enough
backwards-compatibility warts to warrant an occasional rebuilding), but the
early attempts used bare callbacks for everything, which I found frustrating
because I felt Deferreds were _clearly_ a better system. Later attempts
introduced "Futures" or "Promises" which were _slightly_ better than bare
callbacks but did silly things like pass the same initial result to every
registered callback instead of being composable. These almost-but-not-quite-
entirely-unlike-Deferreds were even more infuriating, in a "somebody is wrong
on the Internet" kind of way.

I'm really glad to see Guido picking the best ideas out of Twisted's Deferreds
and trying to make them accessible to a professional-developer audience. If
your language/runtime doesn't support some kind of co-routines, then your only
hope for dealing with asynchronous code is to tame callbacks, and Deferreds are
the best model I've come across for doing that.

(Guido mentions he won't be using Deferreds or Deferred-analogues in his Tulip
async framework, for undisclosed reasons. I'm betting those reasons are that
modern Python _does_ have workable co-routines, so callbacks and callback-
taming aren't necessary.)

~~~
lazyfunctor
I am new to this. Can you give me any pointers to the differences between promises
and deferreds? I have some idea about promises in JS:
[http://promises-aplus.github.io/promises-spec/](http://promises-aplus.github.io/promises-spec/)
How is Twisted's Deferred different?

~~~
noelwelsh
I think this distinction should be framed in terms of Python's futures, not
futures per se. Python's futures
([http://docs.python.org/dev/library/concurrent.futures.html#c...](http://docs.python.org/dev/library/concurrent.futures.html#concurrent.futures.Future))
are very limited. They lack the equivalent of the _then_ method in Javascript
promises. This means there is very limited ability to compose futures in
Python.
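
As a rough illustration of what is missing, a `then`-style helper can be layered on top of `concurrent.futures.Future` using its `add_done_callback` method. The `then` function below is a hypothetical helper written for this sketch, not part of the standard library.

```python
from concurrent.futures import Future


def then(future, fn):
    """Return a new Future that will hold fn(result of `future`).

    Hypothetical helper; errors from `future` or from `fn` are
    propagated to the returned Future.
    """
    chained = Future()

    def _on_done(done):
        try:
            chained.set_result(fn(done.result()))
        except BaseException as exc:
            chained.set_exception(exc)

    future.add_done_callback(_on_done)
    return chained


source = Future()
final = then(then(source, lambda n: n + 1), lambda n: n * 10)
source.set_result(4)
print(final.result())  # 50
```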

Twisted's Deferreds allow some kind of composition, though IMHO it is rather
messy.

You'll note in Guido's emails he says Python's futures are based on Java's.
Java's futures (I can say from experience) are a completely useless crock. All
you can do is poll them, which makes the whole point of asynchronous operation
moot
([http://docs.oracle.com/javase/6/docs/api/java/util/concurren...](http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/Future.html))

Javascript's promises are a better implementation, as are those in Scala and
Haskell.

------
pkinsky
I'm probably missing something, but isn't _Idea 4: Chaining Deferreds_ just
flatMap over Futures? Scala may have spoiled me somewhat.
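
For readers who have not met flatMap, here is a sketch of the idea using the standard library's `concurrent.futures.Future` as a stand-in for a Deferred. `flat_map` and `lookup_name` are hypothetical names for this illustration, and cancellation is deliberately not handled.

```python
from concurrent.futures import Future


def flat_map(future, fn):
    """fn maps a result to another Future; the returned Future is flattened."""
    outer = Future()

    def _copy(inner_done):
        exc = inner_done.exception()
        if exc is not None:
            outer.set_exception(exc)
        else:
            outer.set_result(inner_done.result())

    def _on_done(done):
        try:
            inner = fn(done.result())
        except Exception as exc:
            outer.set_exception(exc)
            return
        inner.add_done_callback(_copy)   # wait for the inner Future too

    future.add_done_callback(_on_done)
    return outer


pending = {}

def lookup_name(user_id):
    # Hypothetical async step: returns a Future that completes later.
    f = Future()
    pending[user_id] = f
    return f


start = Future()
result = flat_map(start, lookup_name)
start.set_result(7)
still_waiting = not result.done()   # the inner Future has not completed yet
pending[7].set_result("guido")
print(still_waiting, result.result())  # True guido
```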

~~~
zeckalpha
And errbacks are really just the Maybe monad.

~~~
seanmcdirmid
None of these future/async systems are very novel, but there is demand for
them. Eventually most of them will fail and a few of the good ones will
survive. And ya, they will resemble functional programming language X and Y in
some of their features, because they aren't meant to be novel.

------
annnnd
This bit sums up my experience with Twisted nicely: "...[the available docs] are
either aimed at absolute beginners or at experienced Twisted users." I was
highly frustrated trying to find any useful docs on Twisted.

This is where I dumped Twisted for Tornado (and I'm quite happy about it too
;). Looking back Twisted seems over-engineered too, but that is strictly my
personal opinion and could be due to lack of understanding.

------
YZF
I made lots of micro-edits while thinking about this, let me try and combine
to something more cohesive... Also of note is that I use Twisted in my day to
day.

In common Twisted programming a deferred _can not_ fire before you add the
callback since the only way a deferred can fire (for typical async work!) is
through the proactor (which Twisted calls the reactor) and you haven't yet
returned control to the proactor. The bug comment seems to imply a more
fundamental issue but if Twisted had such a fundamental bug it just wouldn't
work.

I think Guido's point is that you can still potentially add callbacks to a
Deferred that has fired which opens an opportunity for bugs. He has a point
there but you don't usually use Deferred that way. Generally when you add
callbacks you know the Deferred hasn't fired yet.
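
The late-callback case can be illustrated with the standard library's `concurrent.futures.Future`, used here only as a stand-in for a Deferred. A callback registered after completion is invoked immediately, inside the registration call itself.

```python
from concurrent.futures import Future

calls = []

f = Future()
f.add_done_callback(lambda done: calls.append(("early", done.result())))
f.set_result("ready")

# Registered after the future has completed: runs right away,
# before add_done_callback returns.
f.add_done_callback(lambda done: calls.append(("late", done.result())))

print(calls)  # [('early', 'ready'), ('late', 'ready')]
```

Code that assumes a callback always runs "later" can be surprised by this ordering, which seems to be the kind of bug being discussed.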

I think Twisted isn't that bad but there are a few things I dislike about it:
It's harder than I'm used to to combine synchronous and asynchronous. Twisted
is "contagious".

Because the Twisted application "wants" to be single threaded, everything goes
through the reactor and therefore everything has to be pure async. You can
deviate from that but it has some amount of built in inflexibility. If I
compare with C++ Boost::Asio there's much better multi-threading support in
Asio with its worker threads (which is important for scaling across cores) and
you can do more interesting combinations of async/sync IMO.

Twisted's @inlineCallbacks decorator allows you to write linear-looking code,
but it also encourages a too-serialized way of doing things. The power of
asynchronous is to do things in parallel but in Twisted you often either end
up with very hard to follow "pure" asynch code or less efficient "chained"
asynch code. At least that's my experience.
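
The serialized-versus-parallel trade-off can be sketched with asyncio, the standard-library descendant of Tulip, rather than with Twisted itself. The `fetch` coroutine and its delays are made up for this illustration. Awaiting each step in turn reads linearly but runs the steps one after another, while `asyncio.gather` lets them overlap.

```python
import asyncio


async def fetch(name, delay, log):
    # Hypothetical I/O-bound step; the sleep stands in for network latency.
    log.append(f"start {name}")
    await asyncio.sleep(delay)
    log.append(f"end {name}")
    return name.upper()


async def serialized(log):
    # Linear-looking code: b does not start until a has finished.
    a = await fetch("a", 0.03, log)
    b = await fetch("b", 0.01, log)
    return [a, b]


async def overlapped(log):
    # Both operations are in flight at the same time.
    return await asyncio.gather(fetch("a", 0.03, log), fetch("b", 0.01, log))


serial_log, overlapped_log = [], []
serial_result = asyncio.run(serialized(serial_log))
overlapped_result = asyncio.run(overlapped(overlapped_log))

print(serial_log)      # ['start a', 'end a', 'start b', 'end b']
print(overlapped_log)  # ['start a', 'start b', 'end b', 'end a']
```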

I personally prefer having callbacks to the Deferred returns because very
often the only thing you do with a Deferred is add callbacks and it makes your
code messier, more clutter/boilerplate. Returning Deferred allows for the
"generator" style @inlineCallbacks which actually can be an anti-pattern from
a performance perspective and also many people find it confusing to read...
Explicitly chaining callbacks is what I prefer. To elaborate on callback vs.
Deferred you can relatively easily build more complex structures of any kind
you wish over the callback when you need it. In a sense it's the most basic
way of doing things. I like basic :-)

Related to combining async and sync styles you end up seeing things like
isinstance(ret, Deferred). There's also maybeDeferred for dealing with async
vs. sync. Still not great.

~~~
rdtsc
> it's like Twisted is contagious. I can't quite put my finger on it.

I can. Twisted is viral (not in a good way always). Once you start using
Twisted you are doomed and stuck in a parallel reality of Twisted-only
libraries forever. I have used Twisted professionally for 4+ years, this is
not something I just read on a blog.

You first say "Oh, I know, look at the cute example of parsing a line protocol
with it, let me try that. That was easy, I should adopt it". And then it got
you. You want to fetch data from a particular database. That's easy, github
has a project in Python that does it. Oh but you need it to return a deferred
now. You can't simply make a connection and block because it blocks the
reactor. This library is no good, you either have to pray someone who knows
Twisted wrote a Twisted version of it, or you have to write your own.

As you pointed out, you can use code to combine threaded and deferred code but
it is not fun or easy. Make sure you think twice before adopting it.

BTW the best way to learn it is to use it. Don't read about it. Reading about
it won't help much. It is one of those subjects, like riding a bike if you
don't know how. Once you get it you get it and it becomes easy. What is not
easy is library fragmentation.

You have been down-voted and perhaps I'll be too for my negativity. But I am
worried about infatuation with Twisted and Promises in recent (last year and
on) Python design emails and moving them more to be part of the standard
library.

I think this is not a healthy move for Python, nor a step forward. I'll speak
my mind and if you feel this is too negative, just downvote, I'll understand,
but I think it has to be said. They are copying what they are seeing in other
languages; hint: node.js (Javascript). I feel there is a bit of jealousy, Python
is maturing, its last exciting thing was 3.0. PyPy has been relegated to the
magic academic research corner (unfairly I think). But, node.js and Javascript
on the server is not successful because it has a callback style of
concurrency. There are other reasons. So copying that aspect won't make Python
more appealing. Heck, Python had an awesome callback library for ages --
Twisted, when Javascript was still used to make buttons flash and Java applets
were the hot thing. Did it take off? Somewhat, but not really. It was famously
hard to grok. The freshest breath of air so far in practical Python use on the
server is using greenlet. But that also has been pushed to sort of "ah it is
dirty, we don't like, how will you ever know if your code switches co-
routines". Well it is not perfect but it is most practical and performant to
date. I wish that became part of core Python. I wish they copied more from Go,
Rust or Erlang rather than a callback style concurrency with some extra sauce
on top.

Anyway, ending my rant here, and hoping I didn't upset too many people
already.

~~~
ekimekim
Agreed. I make heavy use of greenlet/gevent, and one of the primary advantages
is that you can import and use naive sequential libraries, and they won't
block your process (assuming they're pure python so the monkey patching works
properly).

As for "not knowing when you might switch coroutine" \- I contend that if it
matters, you're doing it wrong. Note that, unlike proper threading, things
can't happen at the same time / interleaved with simple statements, so it's
still very easy to do atomic actions without special locking - eg. there's no
way self.x += 1 could switch in the middle (unless someone is badly abusing
the __add__ method).

~~~
lvh
The problem is still that other coroutines can mess with your shared state,
and you have no queues as to where they get to do that. Neither Twisted nor
tulip has that problem. The fact that it can't happen in __iadd__ is just one
place where it can't happen. That's a far cry from not having to care about
this problem.

~~~
rdtsc
> The problem is still that other coroutines can mess with your shared state,

I think you mis-understood his point. The point is if you treat your green
threads like regular threads and you properly release and acquire locks
("with" contexts help here too) then it won't happen. The other way is to use
queues for example.

In most small examples, say updating a shared dictionary or list, many won't
bother serializing green thread access to it. That should work fine.

> The problem is still that other co-routines can mess with your shared state,
> and you have no queues as to where they get to do that. Neither Twisted nor
> tulip has that problem.

greenlet based green threads dispatch underneath just like Twisted's reactor
does based on a select/poll/epoll/kqueue system call. For example, it is possible to
update half the items in a dictionary, then call a function that does IO from
a green thread, and another green thread gets dispatched and now updates the
dictionary, so, now one thread sees a logically inconsistent shared piece of
data.

Now replace the above threads with callbacks. You have a TCP listening socket
(is it called a factory or protocol, I forgot my terminology already) that
dispatches callbacks when data arrives. In those callbacks you update some
part of the dictionary then realize you need data from a database so you make
a database call, get a deferred and add a callback to continue updating when
data arrives back from database. In the meantime your original socket fires
and another callback goes to your shared dictionary and now sees it in an
inconsistent state.
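
The scenario can be reproduced in a few lines. This sketch uses asyncio coroutines as a stand-in for Twisted callbacks (an assumption made to keep it self-contained), and `fake_db_call`, `update` and `observer` are hypothetical names. The point is only that any place where control returns to the event loop is a place where someone else can see half-updated state.

```python
import asyncio


async def fake_db_call():
    # Yields to the event loop once, like waiting on a database Deferred.
    await asyncio.sleep(0)


async def update(state, value):
    state["first"] = value
    await fake_db_call()       # other callbacks/coroutines may run here
    state["second"] = value


async def observer(state, seen):
    # Stands in for the second socket callback looking at the shared dict.
    seen.append(dict(state))


async def main():
    state = {"first": 0, "second": 0}
    seen = []
    await asyncio.gather(update(state, 1), observer(state, seen))
    return state, seen


state, seen = asyncio.run(main())
print(seen)   # [{'first': 1, 'second': 0}]  (half-updated snapshot)
print(state)  # {'first': 1, 'second': 1}
```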

This thing exists, I have used it a few times:

[https://twistedmatrix.com/documents/8.2.0/api/twisted.intern...](https://twistedmatrix.com/documents/8.2.0/api/twisted.internet.defer.DeferredSemaphore.html)

It is a semaphore in Twisted. It is also used to reduce concurrency but it can
be used to protect shared data if needed. And I did need it in a couple of
places.

This myth that callback based programming frees one from the need to use locks
and somehow it magically infuses code with unicorns and easy shortcuts is very
common. The only way to dispel it is to use the source. Just read the source of
twisted and see how it works. Look at greenlet and eventlet's hub's source (or
gevent) and see how it works.

Yes, greenlet + monkey patching libraries can hide IO dispatching points in your
code. It is a valid downside. So to be safe you can just treat them as regular
threads. But it allows you to use libraries -- one of the biggest reasons one
would pick Python.

Now that we are on the subject, what is one sane way to drastically reduce the
need for locks and semaphores? Use actors like Erlang does. A class with a
thread and a queue attached to it. Then avoid sharing data; instead, send
messages. You'll be hit with a serious performance issue because of memory
copying, because well, there is no free lunch. Twisted and other callback
mechanisms are certainly not a free lunch either.

> ... and you have no queues as to where they get to do that.

Heh heh, an accidental funny typo. As actually if you did have queues, you
might not have to worry about shared data access ;-)

~~~
lvh
I don't see the parent suggest that you treat coroutines like threads at
all. In fact, he's explicitly suggesting that the same issues that exist with
threads don't happen, or are at least significantly diminished.

I'll happily admit that if you treat coroutines like threads and lock where
you're supposed to, the failure modes are the same. However, since nothing
_appears_ to break if you get it wrong (synchronization-related failures that
only show up under load, and when they do, they show up by silently clobbering
some data), I suggest that there's way too much room for people to get it
wrong there. (My personal anecdotal evidence appears to corroborate this.)

Furthermore, in many cases there are simple ways you can make a synchronous
library look async in twisted: deferToThread is the most common/obvious one.
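
The same idea exists outside Twisted. As a sketch, asyncio's `loop.run_in_executor` (used here as a stand-in for `deferToThread`) pushes a blocking call onto a worker thread so the event loop keeps running. `blocking_query` and `heartbeat` are hypothetical names for this illustration.

```python
import asyncio
import time


def blocking_query(n):
    # Hypothetical synchronous library call that would block the loop.
    time.sleep(0.05)
    return n * n


async def main():
    loop = asyncio.get_running_loop()
    ticks = []

    async def heartbeat():
        # Other work that should keep running while the query is in flight.
        for _ in range(3):
            ticks.append("tick")
            await asyncio.sleep(0.01)

    # None selects the loop's default thread pool executor.
    query = loop.run_in_executor(None, blocking_query, 6)
    result, _ = await asyncio.gather(query, heartbeat())
    return result, ticks


result, ticks = asyncio.run(main())
print(result, len(ticks))  # 36 3
```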

So, yes, you get all of that wonderful manual CSP stuff! I guess I just don't
agree with you that that's a benefit.

Can you give a practical example of where you used a semaphore to synchronize
shared mutable state access? I've only used it for, say, concurrency limiting,
I think.

I'm actually very interested in actor-based concurrency, and I think many of
my fellow Twisted developers with me (we have a good contingent of actor +
obj-capa fanboys). I have a few systems where the reactor is pretty much that
queue for all intents and purposes :)

~~~
rdtsc
> I don't see the parent suggest that you treat coroutines like threads at
> all. In fact, he's explicitly suggesting that the same issues that exist with
> threads don't happen, or are at least significantly diminished.

Right, but by implication if you don't want them to happen and are afraid
someone will override __iadd__, then by all means just use a lock. Both
eventlet and gevent I believe will monkeypatch the threading module so that it
becomes a "green" threading module.

> However, since nothing appears to break if you get it wrong

But nothing would appear to break in the deferreds case I suggested either. It
would still be accessing shared data in an inconsistent way.

> I suggest that there's way too much room for people to get it wrong there.

That is why a private queue with a green thread attached to it (an actor!) is a
better way to build large concurrent systems.

> Furthermore, in many cases there are simple ways you can make a synchronous
> library look async in twisted: deferToThread is the most common/obvious one.

Yes it is there and it works but it is clunky. After switching to gevent we,
for example, got about a third to a half as much code as before. A lot
of it was boilerplate code: inlineCallbacks and yields, plus our own counterparts
to already existing libraries, written to have a Twisted alternative.

> Can you give a practical example of where you used a semaphore to
> synchronize shared mutable state access? I've only used it for, say,
> concurrency limiting, I think

Pretty much the exact problem I described above. A server was processing
requests from the user. It was listening on a TCP socket and reading commands
delimited by a new line. One of the commands initiated a business operation
consisting of multiple steps. That operation involved going to some databases,
calling a number of sub-processes all of which was handled via callbacks. As
the operation was taking place, a local piece of state data was updated, tracking
the progress etc.

Now suppose that before that request finishes, a new command comes in and starts
firing the same sequence of callbacks, while the previous one hasn't finished yet.

So that's it. You are accessing a shared piece of data which is now in a
logically inconsistent state in respect to your internal business logic and
you have no idea it happened (just like with threads).

So you need to acquire a deferred lock and then release it when done.
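
Here is a sketch of that pattern, with `asyncio.Lock` standing in for Twisted's deferred lock (an assumption made so the example runs with only the standard library). `handle_command` and the step names are hypothetical. Without the lock the steps of two commands interleave; with it, the second command waits until the first has finished.

```python
import asyncio

STEPS = ("start", "db", "done")


async def run_steps(name, progress):
    for step in STEPS:
        progress.append((name, step))
        await asyncio.sleep(0)    # yield, as a callback chain would between steps


async def handle_command(name, progress, lock=None):
    if lock is None:
        await run_steps(name, progress)
    else:
        async with lock:          # acquire, run every step, then release
            await run_steps(name, progress)


async def main(use_lock):
    lock = asyncio.Lock() if use_lock else None
    progress = []
    await asyncio.gather(
        handle_command("A", progress, lock),
        handle_command("B", progress, lock),
    )
    return [name for name, _ in progress]


unlocked = asyncio.run(main(use_lock=False))
locked = asyncio.run(main(use_lock=True))
print(unlocked)  # ['A', 'B', 'A', 'B', 'A', 'B']
print(locked)    # ['A', 'A', 'A', 'B', 'B', 'B']
```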

> I'm actually very interested in actor-based concurrency, and I think many of
> my fellow Twisted developers with me (we have a good contingent of actor +
> obj-capa fanboys). I have a few systems where the reactor is pretty much
> that queue for all intents and purposes :)

Yeah we switched to that. Actors are basically an object attached to a green
thread that runs a main method. Actor has a queue. Outside processes can send it
messages by putting them in the queue. Actor in main method can do
self.receive() for example to get messages off the queue. Basically what
Erlang does. You have a choice if you want to duplicate data (make copies before
putting in queues) or use objects with locks to serialize access to them (say
if they are large and copies won't work).
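
A minimal sketch of such an actor follows. It uses an OS thread and `queue.Queue` from the standard library instead of a green thread, and the class name, message names and `reply_to` convention are all hypothetical choices for this illustration.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one thread, one inbox queue."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0   # only ever touched by the actor's own thread
        self._thread = threading.Thread(target=self._main, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        # Outside code never touches the state, it only enqueues messages.
        self._inbox.put((message, reply_to))

    def receive(self):
        return self._inbox.get()

    def _main(self):
        while True:
            message, reply_to = self.receive()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)

    def join(self):
        self._thread.join()


actor = CounterActor()
for _ in range(100):
    actor.send("increment")

reply = queue.Queue()
actor.send("get", reply_to=reply)
count = reply.get(timeout=5)
actor.send("stop")
actor.join()
print(count)  # 100
```

Because messages are handled one at a time in arrival order, no lock is needed around `_count`.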

------
hcarvalhoalves
OT: Isn't passing an object around that defines a series of callbacks to be
applied essentially a... monad?

~~~
jerf
No. There's too many differences in all sorts of directions for this to be a
useful way of approaching them.

~~~
hcarvalhoalves
Right, I'm stretching. For a moment I had a glimpse of it being the same
design pattern, since Deferreds are composable.

------
rkangel
What interested me is the discussion about naming (re-use of callback,
"errback", "Failure" etc.).

That sort of discussion quite often smacks of bike-shedding, but Guido's
points here made me realise how clear and simple a lot of the naming in Python
is, and how much that adds to readability and ease of understanding of code.
Clearly a lot of thought along the lines of "this should be a verb" has gone
into the standard library and choice of language keywords, and this has spread
to code built on top. Without realising it, this is one of the things that has
drawn me to Python.

I'll stop taking it for granted now, pay attention to how it's done, and maybe
my own code will benefit.

------
dustingetz
Here is a much better explanation, in Javascript:
[http://domenic.me/2012/10/14/youre-missing-the-point-of-prom...](http://domenic.me/2012/10/14/youre-missing-the-point-of-promises/)

------
robert-zaremba
An interesting thing is that when using greenlet + a proper event framework (gevent
/ eventlet) you don't have all these issues about different callback styles /
methods / objects and, more importantly, about IO library support (like DB
drivers). I know there are ways to use twisted drivers in tornado, or use
`deferToThread` - but they don't seem as pythonic as monkey patching +
python drivers (like with eventlet + <your pure python DB driver>)

It's worth noting that PyPy has great support for greenlets with its JIT.

------
eagsalazar2
Looks a lot like promises. Cancellation is missing from Q.js anyway.

One thing that always surprises me though is how people rightly are disgusted
by callbacks but think promises/deferreds/futures/etc are nice. They are
_nicer_ but doing a lot of async stuff still sucks big time even using these
tools. You will write way more code and it will be way trickier than an
otherwise comparable synchronous app.

Obviously sometimes you have no choice but I think one should always strive to
minimize the async parts of your app as much as possible.

~~~
spartango
Promises, Deferreds, and Futures are all names for the same general idea, one
which uses a monad to stand in for the result of an operation that may or may not
have been completed.

There are implementations of this concept for many languages, from Java's
futures to C# Tasks to Twisted Deferred to various JS Promise/Future
implementations.

One thing that I find amusing in the callbacks+/-futures discussion is Google
Guava's ListenableFutures
([http://code.google.com/p/guava-libraries/wiki/ListenableFutu...](http://code.google.com/p/guava-libraries/wiki/ListenableFutureExplained)):

The JDK ships with Futures under java.util.concurrent, and they do what you
would expect. What Guava adds is the ability to use Futures in the "pure"
sense (get/check result) _and_ with callbacks or delegates. The neat thing
here is that you can use whichever paradigm is appropriate for the design of
different modules of your system, and safely mix and match as needed.

Guava isn't the answer to all the questions in the Futures world, but I do
think it brings a neat idea to the table here.

------
Pxtl
The bit with passing the result along in the chain of deferrals reminds me a
bit of some of the fun stuff you can do with Events in C#.

~~~
Locke1689
Yes, although that's considered bad practice with async.

I previously used Twisted to implement a speculative model of new Internet
architectures for a graduate class when I was in my distributed systems phase
in college.

One thing that Guido mentions is that Twisted Deferred computations generally
rely on adding callbacks, such that you say add_this, add_that, etc. and all
of these callbacks are meant to be executed asynchronously, but sequentially.

My feeling about Twisted was that Deferreds are almost _too_ easy -- to the
point where I found myself returning a deferred to another function, adding
more callbacks, passing it to someone else to add more callbacks, etc.

At the end of the computation I felt like I had inadvertently brought back a
lot of the pain of callbacks -- it's often difficult to figure out what gets
called when and you have to look at a lot of places in the code to figure it
out.

In contrast, my recent experience with async is that it encourages having
asynchronous code that _looks_ synchronous. That alone feels like a benefit to
me, because it encourages the same grouping that one would normally do in
their code, which just seems harder to lose control of.

C# async is certainly capable of the same "spaghetti" nature of Deferred (or
callbacks in general, actually), but my experience has been that the structure
encourages the opposite.

------
jasonmoo
I think the entire go spec would fit into the same number of pages as Guido's
explanation of deferreds. _zing!_

~~~
iopq
I can write what's new and exciting about Go on the back of a napkin.

