
Async Views in Django 3.1 - rohitsohlot
https://testdriven.io/blog/django-async-views/
======
throwaway13337
I recently moved some code base from py/django to c#/aspdotnetcore.

C# probably has some of the best async primitives of any language. I still
managed to be bitten by the very common issue of error logging getting lost in
a hidden callback that doesn't look like a callback because of the magic of
the async keyword (a common issue in node as well).
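Python's asyncio has the same failure mode: an exception raised in a fire-and-forget task is silently held until the task object is garbage-collected, far from the call site. A minimal sketch (the function names are made up for illustration):

```python
import asyncio

async def send_email():
    # Stands in for any background operation that can fail.
    raise RuntimeError("SMTP connection refused")

async def handler():
    # Fire-and-forget: nothing awaits the task, so its exception is
    # swallowed; asyncio only logs "Task exception was never retrieved"
    # much later, when the task object is garbage-collected.
    asyncio.create_task(send_email())
    await asyncio.sleep(0)  # yield once so the task actually runs
    return "202 Accepted"  # the "request" still succeeds

print(asyncio.run(handler()))
```

Keeping a reference to the task and awaiting it (or attaching a done-callback that logs) is the usual fix.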

I'm convinced that the async style of web servers is largely a step in the
wrong direction. A leaky abstraction that we don't need.

A better model already exists - thread per connection - for the majority of
what we want as web developers. There are some good things about allowing us
to manage our own waits but those things are rare.

It reminds me of the choice of when to manage your own memory - usually we
don't need to, but sometimes we do. Most languages programmers use today
decided, rightly, that memory management was an exception and not the norm. We
are more productive today for the reduction in cognitive load for most
applications as we decide not to manage our own memory.

Thread-per-connection has gotten a bad name for performance, which is pretty
unfounded. You can create the exact same underlying abstraction with green
threads for more concurrency when you absolutely need it (but mostly you
don't). The code we get from not using async is substantially reduced in
complexity.

I'm disappointed that django is on the async bandwagon. It means that I may
have to make another framework choice in the future.

~~~
nemothekid
Anything that spawns an unbounded number of POSIX threads will be bad for
performance, so you need an M:N scheduler to execute them. I feel like this
is pretty hard to retrofit onto an existing language (e.g. I don't think you
can write a pre-empting scheduler in pure Python as a library either).

AFAIK, only Go does this well (with great results) as a result of it being
pretty much baked into the language. I don't think it's an async bandwagon, I
think async/await is the best way to get concurrency without requiring the
whole language to opt-in (as the Rust devs discovered).

~~~
YorickPeterse
I think it's not necessarily POSIX-threads related. In theory there is nothing
wrong with spawning tens/hundreds of thousands of OS threads, and Linux does a
decent job at this. But in practice you'll run into issues such as:

* Different OSes handle things differently. macOS limits the number of threads per process (somewhere around 1500, I believe). Other OSes may be slower, or impose other limits

* Because of this, sometimes spawning threads is pretty fast (e.g. 10 µsec). Other times it's super slow (I've seen threads take over 500 ms to spawn)

* Context switches are still pretty slow across OSes. There's finally some work being done towards lightweight thread support in Linux, but I suspect it will take a few years to become useful

I _really_ hope that one day we _can_ "just" spawn 100 000 OS threads without
issues, as it would make many concurrency problems easier to solve. Sadly, I
think it will take another 10-15 years.
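The spawn-time variance is easy to observe; a quick (unscientific) measurement in Python:

```python
import threading
import time

def measure_spawn(n=1000):
    # Time how long Thread.start() takes for each of n no-op threads.
    # The numbers vary wildly by OS, limits, and system load, which is
    # exactly the variance described above.
    times = []
    for _ in range(n):
        t = threading.Thread(target=lambda: None)
        start = time.perf_counter()
        t.start()
        times.append(time.perf_counter() - start)
        t.join()
    return min(times), max(times)
```

On a lightly loaded Linux box the minimum is typically in the tens of microseconds, while the maximum can be orders of magnitude larger.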

~~~
nemothekid
_In theory there is nothing wrong with spawning tens/hundreds of thousands of
OS threads, and Linux does a decent job at this. But in practice you'll run
into issues such as:_

The other big problem on Linux is stack sizes - "hundreds" or "thousands" of
threads isn't a problem - the C10K problem, which proved the dominance of
nginx's evented model over Apache's thread-per-connection model, was solved 20
years ago. Where it matters is when you reach hundreds of thousands or
millions of connections on a single node. The first issue you run into is the
thread stack size - Linux has a default of 8MB; at 100,000 threads you start
to hit OOM issues first. Go (and BEAM) have done a ton of work to make this
use case possible within the runtime.

What I'm unsure about is whether Linux even cares about solving this problem -
the threading model is pretty general and I'm not sure how important high-
performance connection handling is to it. At the end of the day, if you are
really I/O bound, an async-style, co-operative event queue will always be more
performant than a generic threaded scheduler like the one Linux has; meaning
if you care about performance (like Django probably does) you would probably
go with an async-style scheduler even if Linux could support a million
threads.

~~~
YorickPeterse
> The first issue you run into is the thread stack size - Linux has a hard
> default of 8MB; at 100,000 you start to hit OOM issues first. Go (and BEAM)
> has a ton of work to make this use case possible within the runtime.

This is not how thread stack sizes work. Thread stack sizes allocate a certain
amount of _virtual_ memory. This means that with a stack size of 8MB you don't
actually immediately use 8MB of physical memory. Only when you start filling
up that memory do you start using physical memory. We can easily demonstrate
this using a simple Rust program:

    
    
        use std::thread;
        use std::time::Duration;
    
        fn main() {
            let mut handles = Vec::with_capacity(10_000);
    
            for _ in 0..handles.capacity() {
                let handle = thread::spawn(|| thread::sleep(Duration::from_secs(1)));
    
                handles.push(handle);
            }
    
            for handle in handles {
                handle.join().unwrap();
            }
        }
    

When we compile this in release mode, and run it as follows:

    
    
        /usr/bin/time -f '%M KB RSS' program-name
    

On my Linux desktop this prints out:

    
    
        88492 KB RSS
    

This is roughly 86 MB of RSS memory being used. If the thread stacks used
physical memory, this would translate to just under 80 GB, and the program
would have been OOM killed because I only have 16 GB of RAM available.

The virtual memory limits in turn are not much of an issue. I think on most
64-bit systems the virtual memory limit is 256 TB. A 256 TB limit would allow
up to 33,554,432 OS threads with a stack size of 8MB. And since you can
configure the stack size when spawning threads, you can increase this number
by spawning threads with a smaller stack (e.g. 4MB).
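The arithmetic checks out:

```python
# 256 TB of virtual address space divided by an 8 MB stack
# reservation per thread.
TB = 2 ** 40
MB = 2 ** 20
max_threads = (256 * TB) // (8 * MB)
print(max_threads)  # → 33554432
```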

------
holler
For anyone looking for an alternative python async web framework, I highly
recommend Starlette
([https://github.com/encode/starlette](https://github.com/encode/starlette)).

It was created by the same guy who wrote the excellent & highly popular
Django-Rest-Framework. Very lightweight, clean, well-documented, and a breeze
to get running with. It also supports async tasks natively.

I spent years working with Django and now I'm currently using it to power the
api for [https://sqwok.im](https://sqwok.im).

~~~
jordic
We are also using Starlette and it works quite well.. in combination with
asyncpg..

~~~
toinbis
Do you use any db connection wrappers around asyncpg? Like encode/databases or
gino? Also important to note that the SQLAlchemy 1.4 alpha with asyncpg
support is released and can be tried out (both SQLAlchemy core & ORM modes
support it). I myself evaluated all the options for async postgres connection
wrappers in a FastAPI project but reverted to... just using sync psycopg2
through SQLAlchemy core. There were no very specific gains from an async db
connection for my use cases, and the SQLAlchemy creator's comments on the
immaturity of async postgres drivers made me stick with what's battle-tested.

~~~
Sholmesy
We use
[https://github.com/samuelcolvin/buildpg](https://github.com/samuelcolvin/buildpg),
which is built by the same dude that built pydantic, which is also used in
FastAPI.

It's neat, it's a query builder, not an ORM or any nonsense, and it makes
queries a lot more manageable (bringing SQL logic into python primitives).

~~~
jordic
Nice one, but.. I prefer SQL that you can debug without rendering.. (it's not
so manageable when you deal with CTEs, window functions, joins across
different schemas, custom functions...)

------
zests
I've done asynchronous programming with the reactive paradigm where I specify
a chain of function calls I want applied to data and it happens in a thread
independent way. I have not done much programming with the new fancy
"async/await" keywords. It seems like a worse abstraction because it mixes
asynchronous code with synchronous code and is probably harder to reason
about. Not to mention you can't do something like application wide
backpressure unless everything is asynchronous.

Can anyone who has worked in both types of applications comment on their
differences?

~~~
dtech
I've done both, but 80% in reactive streams. I like async/await better: it has
a lot less hard stuff to think about - hot and cold streams take months if not
years for devs to grok - and it creates code a lot more similar to non-async
code. In contrast, Reactive Streams requires you to change your complete
program.

I like to compare it to Typescript/Kotlin-style null-safety v.s. the Java
Optional or Scala Option. If you want to make code null-safe with Option you
have to rewrite basically your complete code from "procedural" to chains of
(flat)maps. In contrast, in Typescript you just change T to T? in a few places
and add an if where you want to handle the optionality.

async/await versus reactive streams has this same problem: reactive streams is
highly impactful, much more so than async/await. In practice I've seen this
have the effect of turning applications into a small reactive-stream outer
shell with the app itself being mostly synchronous/Future, while async/await
is used a lot more liberally because the impact is smaller.

> It seems like a worse abstraction because it mixes asynchronous code with
> synchronous code

It doesn't, at least in Kotlin only an async function can call an async
function. This isn't different from using `map` in a stream.

------
stickyricky
> Writing asynchronous code gives you the ability to speed up your application
> [...]

Citation needed... It will be slower by definition, but it's probably
advertised for higher throughput anyway. I wonder for how many people the
throughput question is actually true. You’d have to be pretty hardware
constrained.

They even suggest asyncio could be a replacement for celery... It really is
the Wild West out here.

~~~
nickjj
> They even suggest asyncio could be a replacement for celery... It really is
> the Wild West out here.

Yeah this is pretty questionable.

You'd be holding state in your app's process and then pretty much re-inventing
queues, job cancellation, uniqueness guarantees, scheduled jobs, periodic jobs
on an interval, queue draining, and retries with exponential backoff.

I mean, yeah, you could code all of that with enough time, but why would you
when Celery exists? If anything, having the queue's state held in Redis or
anywhere outside of your app's process is enough of a reason to pretty much
never replace Celery with asyncio. Even in the post's example of sending an
email, I would still want a lot of what Celery has to offer and wouldn't
consider replacing it.

IMO Celery isn't going anywhere soon and I would still use it in every Python
project that needs background processing.
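To make the point concrete, here is just one item from that list - retries with exponential backoff - sketched in plain asyncio. The helper name is made up, and note it still loses everything if the process dies, unlike a Redis-backed queue:

```python
import asyncio
import random

async def run_with_retries(fn, attempts=5, base_delay=0.1):
    # Retry an async callable with exponential backoff plus jitter.
    # Celery gives you this (and persistence, scheduling, draining,
    # uniqueness guarantees...) out of the box.
    for attempt in range(attempts):
        try:
            return await fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # exhausted all attempts; re-raise the last error
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            await asyncio.sleep(delay)
```

And that is before cancellation, scheduling, and periodic jobs even enter the picture.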

~~~
sbelskie
Just based on my C# experience, this sounds like a terrible idea (abandoning a
real task queue for async/await). 90% of the “gotchas” with async/await in C#
come from people thinking that the task scheduler is a bona fide background
queue.

------
Epskampie
I'm very familiar with symfony, but not so much with django or other
"persistent in memory" web frameworks. Can someone ELI5 what the advantage is
to async django? Does it mean the server doesn't block on one request before
handling the next?

~~~
fiedzia
Yes. Typically, to serve several requests at the same time, you'd run multiple
processes, each handling one request at a time, and if you spend a lot of time
waiting for IO (database, some service call, or a slow client), you'll need to
use a lot of workers (each being an independent python+django instance, so
it's a lot of memory -> a significant investment in hardware). Async is
nothing new in the Python world, but previously you'd have to resort to pure
Python to use it; now you have the whole Django ecosystem that will use it. So
the main benefit is that you can handle more requests on the same Django
instance, and pay less for hardware.
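A framework-free sketch of the idea: under ASGI (the interface async Django speaks), each request is a coroutine, so one process can have many requests parked on `await` at once. The app below is illustrative, not Django's actual code:

```python
import asyncio

async def app(scope, receive, send):
    # Bare-bones ASGI application. While this coroutine is parked on
    # the sleep (standing in for a DB or API call), the same event
    # loop keeps serving other requests in the same process.
    assert scope["type"] == "http"
    await asyncio.sleep(0.1)
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})
```

Any ASGI server (e.g. uvicorn or daphne) can run it.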

~~~
jordic
In our use case, we went from 12 cores to 2 :) (mostly for HA)

------
kabacha
I really don't get this anti-async mentality on HN. Maybe I'm doing something
wrong, but async is a complete game changer for me, as there's just so much IO
wait in every project I touch. The fact that my program has to stop and wait
for something is honestly unacceptable, and it's just a logical conclusion to
eliminate this blockage.

I worked the past 2 years with asyncio and I have to say it still kinda sucks.
The syntax is overly verbose and asyncio itself can't decide what it wants to
be. However, it still works _well_ for the most part and there's an enormous
community and effort behind this!

~~~
BozeWolf
“So much io wait”. What io is your app waiting for? Is it a webserver, or
another type of app?

~~~
naters
Not the GP, but I'm currently working on a small web app that synchronizes
data from 2 other web services, which means that almost every endpoint
involves a call to an external API. In some cases, it involves repeated calls
to those APIs, but the order in which the responses are received is
unimportant, which is an ideal use case for asyncio's gather function. Using a
framework (I haven't tried Django's new async views yet, but am using one of
the handful of Flask-ish async frameworks) that supports async views has made
this a snap, whereas before I would probably have used some sort of out-of-
process work queue type setup.

------
agumonkey
somehow the idea reminds me of continuations based responses (HN used to be
wired that way, also Queinnec wrote lisp papers on the topic)

------
esseti
is there a list of real use cases where async is better than sync? it
seems that it's ok in very edge cases where there's high load and a lot of
I/O (for example, calling external APIs or calling a db).

however, is it stable and robust enough to replace sync methods in production
for the benefit of these cases?

------
jordic
With async Django views we can use Django with asyncpg... o_o but middlewares
are still sync...

~~~
zaro
Yeah, it will be some more years before you can simply write async stuff in
Django and not worry about whether every single library you use supports
async or not.

That's why some years ago I gave up on Django altogether and moved to Nodejs.
Much better experience writing async code.

~~~
jordic
Our front service it's a nextjs/nodejs project. But I still prefer python :)
and our API's are python based, typed async python.. but python..

