
Write Fast Apps Using Async Python 3.6 and Redis - midas
https://eng.paxos.com/write-fast-apps-using-async-python-3.6-and-redis
======
zzzeek
> we make heavy use of asyncio because it’s more performant

more performant than... what, exactly? If I need to load 1000 rows from a
database and splash them on a webpage, will my response time go from the 300ms
it takes without asyncio to something "more performant", like 50ms? Answer:
no. async only gives you throughput, it has nothing to do with "faster" as far
as the Python interpreter / GIL / anything like that. If you aren't actually
spanning among dozens/hundreds/thousands of network connections, non-blocking
IO isn't buying you much at all over using blocking IO with threads, and of
course async / greenlets / threads are not a prerequisite for non-blocking IO
in any case (only select() is).
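
zzzeek's point can be seen in a few lines: awaiting a single simulated IO call takes just as long as blocking on it. A sketch using `time.sleep`/`asyncio.sleep` as stand-ins for one real database round-trip (the pre-3.7 `run_until_complete` API matches the Python 3.6 context here):

```python
import asyncio
import time

def blocking_query():
    time.sleep(0.1)            # stand-in for one blocking DB round-trip

async def async_query():
    await asyncio.sleep(0.1)   # the same round-trip behind asyncio

start = time.perf_counter()
blocking_query()
blocking_elapsed = time.perf_counter() - start

loop = asyncio.new_event_loop()
start = time.perf_counter()
loop.run_until_complete(async_query())
async_elapsed = time.perf_counter() - start
loop.close()

# Both take ~0.1s: async buys concurrency across many such calls,
# not a faster individual call.
print("blocking: %.2fs  async: %.2fs" % (blocking_elapsed, async_elapsed))
```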

it's nice that uvloop seems to be working on removing the terrible latency
that out-of-the-box asyncio adds, so that's a reason asyncio can really be
viable as a means of gaining throughput without adding lots of latency you
wouldn't get with gevent. But I can do without the enforced async
boilerplate. Thanks, JavaScript!

~~~
bysin
I'm glad you said this. There's an async cargo cult going on, where every
service must be written in "performant" async code, without knowing the actual
resource and load requirements of an application.

From the last benchmark I ran [1] async IO was insignificantly faster than
thread-per-connection blocking IO in terms of latency, and marginally faster
only after we hit a large number of clients.

Async IO doesn't necessarily make your code faster, it just makes it difficult
to read.

[1] [http://byteworm.com/evidence-based-research/2017/03/04/compa...](http://byteworm.com/evidence-based-research/2017/03/04/comparing-the-performance-of-synchronous-versus-asynchronous-network-io/)

~~~
tuxracer
    const users = await getUsers();
    const tweets = await getTweets(users);
    console.log(tweets);

Is async code really harder to read?

~~~
untog
In your example the async code doesn't really help anything though - the next
statement has to wait for the response from the previous one before
continuing.

You'd probably want to be using Promise.all to run two IO operations
simultaneously.
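
For the Python side of the thread, the asyncio counterpart of Promise.all is `asyncio.gather`. A sketch with made-up `get_users`/`get_trending` coroutines standing in for two *independent* IO calls:

```python
import asyncio
import time

async def get_users():
    await asyncio.sleep(0.1)    # simulated network call
    return ["alice", "bob"]

async def get_trending():
    await asyncio.sleep(0.1)    # a second, independent call
    return ["#python"]

async def main():
    # Both requests are in flight at once, so the total wait is ~0.1s, not ~0.2s.
    return await asyncio.gather(get_users(), get_trending())

loop = asyncio.new_event_loop()
start = time.perf_counter()
users, trending = loop.run_until_complete(main())
elapsed = time.perf_counter() - start
loop.close()
print(users, trending, "%.2fs" % elapsed)
```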

~~~
cronin101
The next statement has to wait, but the runtime can yield to another waiting
async task so you aren't blocking the total throughput of your program
(assuming it's async-all-the-way-down).

The benefits are generally larger-scale than a single method.

------
erikcw
We've just recently started using Sanic[0] paired with Redis to great effect
for a very high throughput web service. It also uses Python 3 asyncio/uvloop
at its core. So far very happy with it.

[0] [https://github.com/channelcat/sanic](https://github.com/channelcat/sanic)

~~~
Waterluvian
I have never found a good example of a Python web server that provides some
mechanism for statefulness. Is it just fundamentally not possible to have
shared state among requests handled by the threads of a process? Sanic's
examples seem to be the same as Flask's: self-contained function calls
attached to endpoints.

I keep hitting a wall with Python when I want to do something like:

1\. subscribe to a websocket connection and keep the last received message in
state

2\. expose an http endpoint to let a client GET that last message.
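
For what it's worth, within a single asyncio process this is doable: all coroutines share one thread, so a plain module-level variable works as the shared state. A sketch (the `subscriber` loop is a made-up stand-in for a real websocket client; `ensure_future` is the 3.6-era spelling of `create_task`):

```python
import asyncio

last_message = None  # shared state: safe here because all coroutines run in one thread

async def subscriber():
    # stand-in for a websocket read loop that records the latest frame
    global last_message
    for i in range(3):
        await asyncio.sleep(0.02)
        last_message = "message %d" % i

async def http_get_last():
    # stand-in for an HTTP handler returning the latest frame
    return last_message

async def main():
    task = asyncio.ensure_future(subscriber())
    await asyncio.sleep(0.2)        # let the frames arrive
    result = await http_get_last()
    await task
    return result

loop = asyncio.new_event_loop()
result = loop.run_until_complete(main())
loop.close()
print(result)  # the most recent frame
```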

~~~
bobbyi_settv
You normally use something like redis to store the state.

If you were going to share state in memory between threads, how would you
handle the case where the second request goes to a different server, or where
the process has restarted? You'd need redis anyway, so you might as well just
use it in all cases.

~~~
Waterluvian
I get that everyone's responses assume some big public thing. I'm thinking of
a small toy implementation for my home network.

The toy experiment is how to do in Python what's trivial in Node. Mainly
because I like working with Python. I think the answer might be: Python is the
wrong tool for the job.

~~~
zeptomu
The problem is that accessing shared state concurrently in a multi-process
context is a non-trivial problem, so specific software emerged that handles
these problems for you.

The simplest solution is to use a small DB system like sqlite. It is built
into Python (import sqlite3), performs reasonably well, and you do not have to
run an additional service.

Now if a small DB like sqlite already feels overblown to you (and it really
_is_ simple and small) you might not need concurrent access either, so the
simplest solution is to just use a file where you store your state.
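
A sketch of that key/value approach with the built-in sqlite3 module (`:memory:` here to keep the example self-contained; in practice you'd pass a file path so the state survives restarts):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for state that survives restarts
conn.execute("CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT)")

def set_state(key, value):
    # INSERT OR REPLACE keeps exactly one row per key
    conn.execute("INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)",
                 (key, value))
    conn.commit()

def get_state(key):
    row = conn.execute("SELECT value FROM state WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

set_state("last_message", "hello")
print(get_state("last_message"))  # -> hello
```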

~~~
icebraining
I'd say the simplest solution is a global (or just shared) variable using a
thread-safe container like queue.Queue. There's also multiprocessing.Queue,
which supports sharing the queue across multiple workers.
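
A sketch of that "latest value" pattern with a one-slot queue.Queue shared between a listener thread and the request handlers (`producer` is a made-up stand-in for a websocket listener):

```python
import queue
import threading

latest = queue.Queue(maxsize=1)  # one slot: holds only the newest message

def producer():
    # stand-in for a websocket listener thread
    for i in range(3):
        try:
            latest.get_nowait()  # drop the stale message, if any
        except queue.Empty:
            pass
        latest.put("message %d" % i)

t = threading.Thread(target=producer)
t.start()
t.join()

msg = latest.get()  # what an HTTP handler would return
print(msg)
```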

------
mixmastamyk
Can anyone recommend a good book to get started on concurrency, with
discussions of models, and a few implementations such as golang and python
3.5+?

While I can write this kind of code, I don't feel like I completely understand
some of the concepts.

~~~
mrks_
I'm not far into it, but Seven Concurrency Models in Seven Weeks is pretty
good and might fit what you're looking for.

[https://pragprog.com/book/pb7con/seven-concurrency-models-in...](https://pragprog.com/book/pb7con/seven-concurrency-models-in-seven-weeks)

~~~
Matthias247
I recommend the book very much. However, it doesn't have a chapter on single-
threaded concurrency handling (with event loops, futures/promises, and
sometimes even plain callbacks), which is currently en vogue in lots of
languages (JS, Python asyncio, Boost.Asio, etc). So this is something one
should look up elsewhere.

~~~
victorvation
Do you have any reading material that you would recommend for single-threaded
concurrency?

~~~
cristobal23
[http://krondo.com/in-which-we-begin-at-the-beginning/](http://krondo.com/in-which-we-begin-at-the-beginning/)

------
michaelmcmillan
Ouch: [https://github.com/paxos-bankchain/pastey/blob/master/app.py...](https://github.com/paxos-bankchain/pastey/blob/master/app.py#L99)

~~~
secstate
That's actually a call to a rickroll.

~~~
cdelsolar
It still flushes the db first?

------
cies
> Write Fast Apps Using Async Python

When working with Python and Ruby I find 80ms responses acceptable. In very
optimized situations (no framework) this can go down to 20ms.

Now I've used some Haskell, OCaml and Go and I have learned that they can
typically respond in <5ms. And that having a framework in place barely
increases the response times.

In both cases this includes querying the db several times (db queries usually
take less than a millisecond; Redis should be similar enough that it does not
change the outcome).

<5ms makes it possible to not worry about caching (and thus cache
invalidation) for a much longer time.

I've come to the conclusion that --considering other languages-- speed is not
to be found in Python and Ruby.

Apart from the speed story there's also resource consumption, and in that game
it is only compiled languages that truly compete.

Last point: given the point I make above and that nowadays "the web is the
UI", I believe that languages for hi-perf application development should
compile both to native and to JS. Candidates: OCaml/Reason (BuckleScript),
Haskell (GHCJS), PureScript (ps-native), [please add if I forgot any]

~~~
dom0
You can get 2-3 ms response time (sans network) with any of Django, Flask and
Pyramid. Database queries tend to eat a lot, esp. if the queries are bad (long
wait in the DBMS or post-filtering in Python/whichever); sometimes ORMs can
eat a fair bit as well. But it's fairly rare to get that low; most pages for
me (that I cared about) will take 10-30 ms. Using the correct tools and the
right approach is fruitful as always.

~~~
cies
> You can get 2-3 ms response time (sans network) with any of Django, Flask
> and Pyramid.

Wow, never managed to do that. Maybe I have to try it again (the last time I
checked on Django was some years ago).

~~~
dismantlethesun
Truth. The best I can get in Django is 30 ms.

------
jitl
> Paxos.com

I'm confused by the relationship between Paxos, the company, and Paxos, the
algorithm. Do the authors of Paxos work for Paxos?

Edit:

[https://en.m.wikipedia.org/wiki/Paxos_(computer_science)](https://en.m.wikipedia.org/wiki/Paxos_\(computer_science\))

Ah; both are named for a fictional financial system.

~~~
lou1306
By the way, the author of the original Paxos paper is Leslie Lamport, who
currently works at Microsoft Research.

------
ipsum2
The title is misleading. The blog post doesn't cover how fast async Python
is; it's a tutorial on how to use their Redis ORM library.

~~~
mixmastamyk
There's a link on the page which digs in a bit more:

[https://magic.io/blog/uvloop-blazing-fast-python-networking/](https://magic.io/blog/uvloop-blazing-fast-python-networking/)

~~~
jackbravo
And that topic was discussed in HN previously with 130 comments:

[https://news.ycombinator.com/item?id=11625585](https://news.ycombinator.com/item?id=11625585)

------
StreamBright
>>> The performance of uvloop-based asyncio is close to that of Go programs.

I would prefer standard benchmarks for this. I hope they submit their
framework to the TechEmpower benchmarks.

[https://www.techempower.com/benchmarks/](https://www.techempower.com/benchmarks/)

~~~
pekk
Those benchmarks aren't any more standard than anything else.

~~~
StreamBright
Yes, but you can see the largest number of frameworks there, running on the
same hardware with the same settings and doing the same job. You can also see
the configuration used to achieve that.

~~~
merb
except that various frameworks depend highly on their
configuration/version/coding style/Linux configuration/memory used/CPUs
used/use case. it's also important that some frameworks behave better when
they are warm. also, some code behaves differently when you connect with a
single client making requests via wrk vs. an aggregate of multiple clients.
they still use wrk and not wrk2, their error rate is pretty high, and their
framework is, well, not always well behaved.

besides all that, it's just simple cases that they are testing. I would never
ever trust this site or any result they got.

------
njharman
> You get the benefits of a database, with the performance of RAM!

One of the benefits of a modern RDBMS is that it makes extremely sophisticated
use of RAM, and of all the levels of fast-to-slow storage below that: SSDs /
RAIDs / a slow single spindle.

------
siscia
Quite related, but if you want to use Redis as a SQL database I wrote an
extension to do just that:
[https://github.com/RedBeardLab/rediSQL](https://github.com/RedBeardLab/rediSQL)

It is a relatively thin layer of Rust code between the Redis module interface
and SQLite.

At the moment you can simply execute statements, but any suggestion or feature
request is very welcome.

Yes, it is possible to do joins, to use the LIKE operator, and pretty much
everything else that SQLite gives you.

It is a multi-threaded module, which means that it does NOT block the main
Redis thread and performs quite well. On my machine I achieved 50,000 inserts
per second for the in-memory database.

If you have any question feel free to ask here or to open issues and pull
request in the main repo.

:)

------
rcarmo
This is pretty neat. I've been using a plain Redis wrapper (aioredis) with
uvloop and Sanic ([https://github.com/rcarmo/newsfeed-corpus](https://github.com/rcarmo/newsfeed-corpus)),
but I'm going to have a peek at subconscious.

------
VT_Drew
>One of the common complaints people have about python and other popular
interpreted languages (Ruby, JavaScript, PHP, Perl, etc) is that they’re slow.

Proceeds to show an animation of posting a blog post that performs no faster
than if it were built using Django.

------
NightlyDev
> 10k pageviews took ~41s

Might be that the server is insanely slow, but I would have no problem
reaching 10k page views per second with some basic PHP and even MariaDB on a
low-end E3-1230 server. Pretty sure more would be quite easy too...

------
fritzy
It seems strange that they would claim that Python's libuv based event loop is
twice as fast as Node.js's libuv based event loop. There's some context
missing to that statement or it's flat out false.

~~~
GroSacASacs
What does that even mean? The event loop is only used when there is nothing
going on. Is it faster at doing nothing?

~~~
1st1
> The event loop is only used when there is nothing going on.

In async applications, the event loop is what actually executes your code and
performs IO. In essence, event loops are under load all the time.
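
A tiny illustration of the loop doing exactly that: two coroutines hand control back to the loop at each await, and the loop alternates between executing their code:

```python
import asyncio

order = []

async def worker(name):
    for _ in range(2):
        order.append(name)       # "your code", executed by the event loop
        await asyncio.sleep(0)   # yield control back to the loop

async def main():
    await asyncio.gather(worker("a"), worker("b"))

loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
print(order)  # the loop interleaves the two workers
```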

------
hasenj
If you want performance don't use Python.

~~~
twistedpair
Sadly true. Python is great for scripting, but "high performance Python" is
frequently a challenge better suited to other tools.

~~~
floatboth
"High performance Python" is usually done by "offloading literally everything
to native extensions" :D

------
theprop
This is about getting a high-performance app out. You could probably get an
app out faster in PHP or Meteor or some other prototyping framework.

