Write Fast Apps Using Async Python 3.6 and Redis (paxos.com)
337 points by midas on April 27, 2017 | 128 comments

> we make heavy use of asyncio because it’s more performant

more performant than... what exactly? If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms? Answer: no. Async only gives you throughput; it has nothing to do with "faster" as far as the Python interpreter / GIL / anything like that. If you aren't actually spanning dozens/hundreds/thousands of network connections, non-blocking IO isn't buying you much at all over blocking IO with threads, and of course async / greenlets / threads are not a prerequisite for non-blocking IO in any case (only select() is).

it's nice that uvloop seems to be working on removing the terrible latency that out-of-the-box asyncio adds, so that's a reason asyncio can really be viable as a means of gaining throughput without adding lots of latency you wouldn't get with gevent. But I can do without the enforced async boilerplate. Thanks, JavaScript!

I'm glad you said this. There's an async cargo cult going on, where every service must be written in "performant" async code, without knowing the actual resource and load requirements of an application.

From the last benchmark I ran [1] async IO was insignificantly faster than thread-per-connection blocking IO in terms of latency, and marginally faster only after we hit a large number of clients.

Async IO doesn't necessarily make your code faster, it just makes it difficult to read.

[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...

A ~20% improvement in throughput and latency while using 50% less memory (which could allow more workers per-box) is not a "marginal" improvement in my book.

const users = await getUsers();

const tweets = await getTweets(users);


Is async code really harder to read?

Javascript's async feels a bit more natural than Python's.

In Python, you've also got to run the event loop and pass the async function to it. This makes playing with async code in the interpreter more difficult. Also don't forget that async is also turtles all the way up (same as in JS). It'll infect any synchronous code that touches it.
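A minimal sketch of that boilerplate (`fetch_greeting` is a made-up stand-in for real async I/O; `asyncio.run()` only exists from Python 3.7, so the 3.6-era spelling is shown too):

```python
import asyncio

async def fetch_greeting():
    await asyncio.sleep(0)  # stand-in for real async I/O
    return "hello"

# Python 3.7+: asyncio.run() creates and tears down the loop for you
result = asyncio.run(fetch_greeting())

# Python 3.6 needs the event loop spelled out explicitly
loop = asyncio.new_event_loop()
try:
    result_36 = loop.run_until_complete(fetch_greeting())
finally:
    loop.close()
```

Either way, a bare `fetch_greeting()` call in the interpreter just hands you a coroutine object, which is exactly what makes interactive experimentation awkward.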

I've written a Tornado app which makes heavy use of asyncio, and while it's pretty efficient, I would reconsider writing it the same way if I had to go back in time.

It's not bad anymore with async/await and promises/futures, but that featureset is still bleeding-edge in most languages. Older-style async code was much more annoying.

In your example the async code doesn't really help anything though - the next statement has to wait for the response from the previous one before continuing.

In your example you'd probably want to be using Promise.all to run two IO operations simultaneously.

The next statement has to wait, but the runtime can yield to another waiting async task so you aren't blocking the total throughput of your program (assuming it's async-all-the-way-down).

The benefits are generally larger-scale than a single method.

That's hardly applicable async code. You're awaiting the actual async operations, which originally have to be dispatched asynchronously from the main thread for these async operations to execute, and at that point it's the same speed as just doing sync operations inside of an async operation.

Actual asynchronicity, usually with event-based systems, gets very ugly, very fast, because you end up having to make callback chains and queue up your async work. There can be a good benefit to doing it, but it's going to be a lot less readable than most sync code, and sometimes not any faster, as in the case of Node.js and its community forcing the usage of async functions in places where they don't need to be used.

That code probably represents one function in an event-loop webserver processing more than one request at a time. Non-blocking behavior is important for work involving UIs.

Throw an exception and look at the stacktrace.

This looks pretty readable to me


Sorcery. Why don't my JS stacktraces look nice? :(

Depends on your dev environment. Almost all of the browser dev tools should catch up eventually. The fun of an ecosystem with multiple competing implementations.

Heh. Somebody will assemble a few of these pieces, add a package manager for async oriented libs, call it node.py, and then market it a bit.

Then you'll really be irritated.

That's... actually not a bad idea. ᕕ( ᐛ )ᕗ

I know it will have an ORM called NodeAlchemy

/calls lawyers

well, it can also make things faster (though in your example it won't). consider you need to load 4 requests and do operations on each of them: if you schedule them in an async fashion you can begin operating on the first one that's ready, not the first one you defined. and this is often the case. a website does not just do one request to the database; mostly it runs multiple, and often they don't interfere. like getting 20 rows and the count as a whole: there is just no need to start the first query, wait till you have the 20 rows, and then start the second. you should always start both and wait till you have both.

yes, it does not magically make fetching 100 rows faster, or your pbkdf2()/bcrypt() function; you still need to wait for those.
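A sketch of the rows-plus-count case with `asyncio.gather` (the fetch functions are made-up stand-ins for real database calls):

```python
import asyncio

async def fetch_rows():
    await asyncio.sleep(0.05)  # stand-in for the 20-row query
    return ["row"] * 20

async def fetch_count():
    await asyncio.sleep(0.05)  # stand-in for SELECT COUNT(*)
    return 1234

async def handler():
    # start both queries at once, wait until both are done
    rows, count = await asyncio.gather(fetch_rows(), fetch_count())
    return rows, count

rows, count = asyncio.run(handler())  # Python 3.7+ spelling
```

Both waits overlap, so the handler finishes in roughly the time of the slower query rather than the sum of the two.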

> if you schedule them in an async fashion you can begin operating on the first one that's ready and not the first one you defined.

This type of operation is a given in any production quality webserver, whether it runs with multiple threads and blocking IO or using a non-blocking approach with greenlets. For a web application, this is an implementation detail that should not be explicit within the request handling code (a request handled in the context of a web container after all is a package of data in, a package of data out. no network reading/writing is usually exposed to the web application unless it's trying to expose IO handles to the app, which is unusual). Easy enough with something like Gunicorn.

I think you're talking about different things; the idea is not that you can multiplex the requests coming in, but also the requests going out to the database and etc for each web request handling function.

So on that topic, a request typically has a single transaction going out to the database so within the scope of the request, has to perform its steps in serial in any case. If it needs to make several requests to web services that aren't dependent on each other, that's an area where you can get into stacking them with some kind of concurrency construct (I'd pass it into a greenlet oriented worker pool). but this is already going to be a heavy web request with multiple web service calls.

> a request typically has a single transaction going out to the database

It's typical because people are still in a "single thread, single transaction, ORM CRUD" model of thinking. "It's linear because that's how it is"?

if you're using an ACID kind of database then yes, that's how it is :)

> this is already going to be a heavy web request with multiple web service calls

Sure, but it can now be a less heavy web request! ¯\_(ツ)_/¯

> a request typically has a single transaction going out to the database

The fact of the matter is, as applications develop, become richer, and grow larger, it becomes less and less uncommon to have more than one query per page. Especially in the context of larger organizations, it's very common to have everything wrapped behind a service call with an entire armada of infrastructure hidden behind it, and having to make many service calls to put together one web API result or page.


sigh Slight tangent. Look at where we are now and how we came here.

Back in the non-ajax days we used to do them all on the server side, then render the whole page all in one go. This would have come in handy back then! Imagine doing 5x 50ms queries asynchronously, dropping a 250ms response delay down to 50ms! But this stuff was hard back then, and we mostly left it alone.
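The 250ms-to-50ms arithmetic is easy to demonstrate with `asyncio.sleep` standing in for the queries (a toy sketch, not real database calls):

```python
import asyncio
import time

async def query():
    await asyncio.sleep(0.05)  # stand-in for a 50ms database query

async def page():
    # five overlapping 50ms waits complete in roughly 50ms, not 250ms
    await asyncio.gather(*(query() for _ in range(5)))

start = time.monotonic()
asyncio.run(page())
elapsed = time.monotonic() - start
```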

This was also around the time when we figured out that, since we can have pages that take a long time to load and block the interpreter, perhaps it's not such a great idea to serve many requests with a single interpreter. So people started using stuff like nginx to run multiple Python interpreters in parallel (not even getting into threads here), which was easier to reason about since each Python process is a separate universe that can block entirely, but overall we can still serve a new request with a new interpreter, so for the most part things are good.

Then the twisted people thought that this was silly, and why should we block in the first place, and they decided that the way to fix this was to change the way we program entirely, and re-create or wrap an entire ecosystem of software. It sort of worked, except there wasn't a good twisted package for your thing. But all in all it worked.

Then the greenlets (or one of its other 20 names) people came and wanted to instead use fine-grained implicit concurrency, and said "no no, we can get something with nicer abstraction packaging while mostly not changing the code we have", and that was even nicer, except when something didn't get monkey patched correctly for some reason. We got stuff like gunicorn, which was impressive.

Then we moved more stuff to the client to create more responsive (in the original meaning of the word) applications, pushing the burden of requesting and fetching data to the browser side, which means that as a page loads, it might call REST APIs one by one (hopefully asynchronously!), each of which might make a single (finer-grained) database or service call behind the scenes.

So how different is this now from the gunicorn model? In the latter, you get fine threads of control, each working asynchronously to fetch their own thing, which gets put together in the server side, and then sent back to the client. In the former, you get similarly fine threads of control, but the fine threads perhaps live in their own universes, and it doesn't get all put back together until it travels over the internet to the browser.

So it's a little bit different, but overall what's happening is similar. It feels like we just keep moving concerns and procedures up and down the stack.

Surely there's reasons for all this. Times and technologies change, and we find ways to adapt. I like the "async" stuff because it makes things explicit. It's the middle-ground result of the culmination of our learnings that hiding async behavior makes libraries hard to design and can result in frustrating and unpredictable behavior, whilst changing the entire programming model isn't great either. So we get asyncio. I'm mostly happy with this result. Admittedly this article isn't doing any of this justice.

> it becomes less and less uncommon to have more than one query per page.

I said transaction, not query. A database transaction is on a single connection at a time and queries are performed via the transaction serially.

>I said transaction, not query. A database transaction is on a single connection at a time and queries are performed via the transaction serially.

Unless you're doing e-commerce or banking sites, that's far less common than non-transaction requests.

unless you're using MyISAM or something like that, all your queries are in transactions.

edit: also, I'd challenge you to prove that for a web request that needs to make ten read queries to a relational database, from Python, that you can get better performance by opening up ten separate database connections (or from a pool) and running one query in each, bundled into the async construct of your choice and then merging them all back into your response, vs. just running ten queries on a single connection in serial. Assume these are not slow reporting-style transactions, just the usual "load the users full name, load the current status, load the user's current items", etc., small queries common in a web request that is looking for a very fast response with ten SQL queries.

Note that at the very least, it means your web application needs to use ten times as many database connections for a given set of load. In database-land that's more or less crazy.

Sorry - I wasn't clear enough. Who says it's one single relational database? And besides, like I mentioned, it's often not relational database queries but service calls (think microservice architectures, for example). Or both!

Anyway, I respect your position that yes, for the average user, throwing a bunch of "async" in there isn't going to make their code faster, and it's just cargo cult programming. And yes, there is some tradeoff curve where sometimes, for a small benefit, it's not worth the effort to worry about it, as with all things. But it's just a tough sell to argue that no one should need this :-)

More and more often today, the backend serves as glue between frontend clients and a horde of services / data systems. This is often an I/O heavy workload (wait while I make a request, wait for a response, wait while I download x10). This kind of workload is ripe for speeding up with async. That's all I'm saying!

It's not uncommon to have ~20 pooled connections lying around. Maybe it's not that frequently used in Python or PHP, but in various other platforms, that's just the normal case.

At least in Java, C#, and Golang. And even psycopg2 offers a pooling abstraction (I guess it's not used in Django, but SQLAlchemy offers one as well). But of course, running a blocking driver atop a non-blocking framework does not give the best performance.

However just challenging it without proof is not really that useful.

Also, some workloads are better for threaded servers while others are better handled async; it's also highly unlikely that just wrapping your database connection in an async function will make it faster or better suited for an async workload. If you are not non-blocking from the ground up, you will still carry a lot of overhead around.

> It's not uncommon to have ~20 pooled connections lying around. Maybe it's not that frequently used in Python or PHP, but in various other platforms, that's just the normal case.

OK, but say you're doing 500 req/s; if base latency is 50ms, you're going to have at least 25 requests in play at once, so that's 500 database connections. That's one worker process. If your site is using two app servers, or your web service has multiple worker processes, etc., now you have 1000, 1500, etc. database connections in play at capacity. This is a lot. Not to mention you'd better be using a middleware connection pool if you have that many connections per process, to at least reduce the DB connection use for processes that aren't at capacity.
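The arithmetic here is Little's law (requests in flight = arrival rate x latency); a back-of-the-envelope sketch using the numbers from this thread:

```python
rate = 500             # requests per second
latency_s = 0.050      # 50ms base latency
in_flight = round(rate * latency_s)  # ~25 requests in play at once

# if each request fans its queries out across ~20 connections,
# per the parallel-queries approach discussed upthread:
fan_out = 20
connections = in_flight * fan_out    # connections held by one worker at capacity
```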

On MySQL, each DB connection is a thread (MariaDB has a new thread pooling option also), so after all the trouble we've gone to not use threads, we are stuck with them anyway. On Postgresql, each DB connection is a fork(), and they also use a lot of memory. In both of these cases, we have to be mindful of having too many connections in play for the DB servers to perform well. We're purposely using many, many more DB connections than we need on the client side to try to grab at some fleeting performance gain by stacking small queries among several transactions/connections per request which is not how these databases were designed to be used (a DB like Redis, sure, but an RDBMS, not so much), and on the client side, I still argue that the overhead of all the async primitives is going to be in a very tight race to not be ultimately slower than running the queries in serial (plus the code is much more complicated), and throughput across many requests is reduced using this approach. Marginal / fleeting gains on the client vs. huge price to pay on the server + code complexity + ACID is gone makes this a pretty tough value proposition.

Postgresql wiki at https://wiki.postgresql.org/wiki/Number_Of_Database_Connecti...: "You can generally improve both latency and throughput by limiting the number of database connections with active transactions to match the available number of resources, and queuing any requests to start a new database transaction which come in while at the limit. ". Which means stuffing a load of connections per request means you're limiting the throughput of your applications....and throughput is the reason we'd want to use non-blocking IO in the first place.

> However just challenging it without proof is not really that useful.

this is all about a commonly made assertion (async == speed) that is never shown to be true and I only ask for proof of that assertion. Or maybe if blog posts like this one could be a little more specific in their language, which would go a long way towards bringing people back into reality.

well, all your assertions are wrong. you think that there is only one database and no read-only slaves. you also think that we always need strong serializability and ACID. guess what? a user does not care if he needs to reload the page until his picture is online.

yes, there are workloads where everything you say is true. but most other workloads, like 80% of all web pages, don't need what you describe.

also, some pages don't have a conventional database at all. some people have a cache or some other services in place, some people use microservices, some people connect to other internet providers, or other services like lpd/ipp etc. the world is just not black and white. everything you describe is utterly beside the point, since you just talk around the issue, cause your application is not as complex as others. and yes, in probably 60-70% of the cases async will not yield more "speed"/"performance", however you call it.

> cause your application is not as complex as others

I work with OpenStack. I don't think you're going to find something more complicated :). (It does use eventlet for most services, though it's starting to move away from that model back to mod_wsgi / threads.)

not everything is transaction-centric. and also, before I make a transaction, I mostly fetch various stuff beforehand, and sometimes after.

and also, my count example: it just makes no sense to have the count and the list data called inside a transaction (OK, there are cases, but these are way more rare, because mostly it's not too bad to give users a wrong count; you don't need strict serializability).

see my edit at https://news.ycombinator.com/item?id=14218862 where I propose a challenge to show that it's more efficient to use ten relational database connections for a request that needs to run ten small queries, vs running ten queries on a single DB connection.

In many non trivial cases I tend to find that I have to query several different databases to render a single page.

depending on the latency of those database connections I've argued in the past that the overhead of adding asyncio context switching and boilerplate is more expensive than just hitting the two or three databases in serial (and if your web request is having to hit dozens of DB sources to serve one request, I think you've already lost the performance game :) ). When your one web request is contending with many other concurrent web requests in any case, doing the DB calls in serial just lets the CPU attend to other requests.

Do you think it makes more sense to do an async backend when we are moving to real-time (meaning websocket-based connections) web apps?

I've maintained that async is better suited to web services and lightweight databases like Redis, and is not useful for relational databases. However, it's very hard to get async to make your code actually "faster", as opposed to just handling very high throughput with fewer resources. If people stop saying "faster!", I'll go away.

> Will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms

If you have to do 1000 queries it could, since async will make it feasible to do them in parallel. If it's a single query, maybe async would make it feasible to shard the database.

you usually see this pattern in ORMs with n+1 queries. If a single request requires 1000 db queries, it's better to optimize the query.

It buys you the stack size of each thread, which only matters if you have a stupid number of connections. In this article[1] the author compares the two models, and 7000 concurrent users will chew up 450MB of stack space. Of course this is adjustable.

[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...

On most Linux systems the stack is allocated with mmap with overcommitting. Until the first write, all those pages will share the same zeroed page, AFAIK. Then only the pages actually written to will be allocated.

Am I wrong?

How do you save on stack space with asyncio? Don't you have to keep the coroutine object in memory somewhere?

I think the idea is that these "coroutine objects" (or the equivalent structure in whatever language) are smaller than the typical stack size for a thread. For example, the default stack size on Windows is 1 MB, so if you have a thread per connection, this is obviously going to take up a decent amount of memory. I'm guessing the answer to this is a thread pool, so your memory usage doesn't blow up.
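A quick way to see the difference from the interpreter (shallow sizes only; `sys.getsizeof` doesn't count objects referenced by the coroutine's frame):

```python
import sys
import threading

async def handle_connection():
    pass

coro = handle_connection()
coro_size = sys.getsizeof(coro)  # a coroutine object: a few hundred bytes
coro.close()                     # avoid the "never awaited" warning

# 0 means "use the platform default", which is often 1-8 MB per thread
default_stack = threading.stack_size()
```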


> more performant than....what exactly? If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms?

Potentially, it depends on if you can do other tasks for the same request that don't depend on the data. You might be able to render most of the page for instance. It's not purely about throughput.

Please tell me that 300ms was made up too and that it's not really taking that long.

https://magic.io/blog/uvloop-blazing-fast-python-networking/... from the makers of uvloop (for a toy example)

it seems the main bottleneck when using aiohttp is aiohttp itself, which practically makes the use of uvloop irrelevant

If you have to make several requests to db backend to fulfil one response then potentially asyncio allows you to make them in parallel rather than in series. Reducing latency of your response.

> If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms? Answer: no

Well, actually, yes. Without async rendering, your webpage is not ready until your 1000-row list is placed in Python memory, then rendered to HTML as a whole, then returned to your browser, after something like 300ms of server cost.

With async rendering, your webpage's headers and such can be returned immediately, so your time to first byte can be under 50ms, and your page loads by enumerating the rest of the 1000 rows and rendering incrementally.

Well you can do all of that sync, can't you?

    def on_connection(conn):
        conn.send(start_of_page)      # headers etc. go out immediately
        for row in db:
            conn.send(render(row))    # stream each row as it arrives

will have the exact same effect as what you said (not that it applies regardless; I don't think Jinja outputs partial renders, since it's made for Flask)

The performance comparison is between Python-managed green threads and OS-managed actual threads. You don't get any new features.

Another point is your server can switch context to handle other requests with async.

In the real world, your web page involves more than one backend query (like MySQL + Redis + some RPC calls to microservices); with async APIs, you can issue all the queries concurrently and join them when rendering.

The async benefits can add up to a much faster, more responsive server.

Yes, those are threads when handled by the OS / greenthreads when handled by the program.

a program with threads can support multiple requests simultaneously. a program with green threads can support multiple requests simultaneously.

You aren't giving any reasons why green threads in Python perform better than threads in the OS.

Well, threads also switch context.

That's a client streaming optimization, not related to the subject at hand which is non-blocking network IO. Assume the service returns a JSON structure. It won't get to the end any faster.

There must exist a module like `ijson` that could incrementally generate JSON.
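For generating (as opposed to parsing, which is what ijson does), no special module is even needed; a plain generator can emit a JSON array piecewise. A sketch:

```python
import json

def stream_json_array(rows):
    """Yield a JSON array one element at a time instead of building it all in memory."""
    yield "["
    for i, row in enumerate(rows):
        yield ("," if i else "") + json.dumps(row)
    yield "]"

chunks = list(stream_json_array([{"id": 1}, {"id": 2}]))
body = "".join(chunks)  # a valid JSON document, produced incrementally
```

Each chunk could be flushed to the socket as soon as the corresponding row arrives from the database.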

I went down this rabbit hole once, and it turns out you /can/ do something like this, having everything streaming all the way from the database to Python to the web server to the client. The problem was that even after all that effort, the JavaScript consuming it usually processed it in a non-streaming way.

Then I found this http://oboejs.com/ and it was even more work, and I gave up. In the end it required rethinking everything and battling against a whole set of tools and libraries that just didn't think that way.

You are the hero we need, Mike

We've just recently started using Sanic[0] paired with Redis to great effect for a very high throughput web service. It also uses Python 3 asyncio/uvloop at its core. So far very happy with it.

[0] https://github.com/channelcat/sanic

There is also ApiStar - https://github.com/tomchristie/apistar

It's built by Tom Christie, the original author of Django REST Framework.


Interesting, hadn't seen that yet; thanks for sharing. It does look like it'll have some nice design concepts -- I've definitely come to view Django/DRF's strong coupling to the ORM as a hindrance to architectural flexibility/sanity as my application has grown.

Interestingly it eschews Swagger/OpenAPI in favour of JSON Schema, wonder how that'll pan out; I like the promise of codegen that swagger offers, but haven't found the generated clients to be particularly usable.

One thing to clarify here. Swagger/OpenAPI use JSON Schema in order to describe parameters and response structures. The rest of the schema work will start to fall into place pretty quickly now that we've got the groundwork done. Swagger generation based on the annotations will be one of the features, but there'll be plenty more to get excited about too.

Isn't Swagger a subset of JSON Schema though? [1]

If APIStar happens to target the same subset, that's not a problem of course.

[1]: http://stackoverflow.com/a/32386131/37481

Any particular reason you are using the BSD license? With all due respect, it does not include a patent grant like the Apache license does, and could be a poison pill for companies to adopt.

I just had a quick scan over the licenses of other projects used for server side projects. Projects using BSD/MIT include Node, Go, Rails, Django, and Flask.

I'm happy with the choice.

Sorry to interject - but that's not completely true. Go comes with a separate patent disclaimer.


And FWIW, Node.js is not under a standard BSD license and comes with a patent grant. That discussion went on for a year in the TSC. https://github.com/nodejs/node/blob/master/LICENSE

In general, this stuff is not always evident. But the BSD license by itself is not as good as Apache.

While in doubt, use Apache !

P.S. fyi, doing this later is super heavy-duty hard.

I have never found a good example of a Python web server that provides some mechanism for statefulness. Is it just fundamentally not possible to have shared state among requests handled by the threads of a process? Sanic's examples seem to be the same as Flask's: self-contained function calls attached to endpoints.

I keep hitting a wall with Python when I want to do something like:

1. Subscribe to a websocket connection and keep the last received message in state.

2. Expose an HTTP endpoint to let a client GET that last message.
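For a single process, the pattern is doable with plain asyncio; a hedged sketch where an async iterator stands in for the websocket subscription (all names made up):

```python
import asyncio

last_message = None  # shared state; fine within one process

async def listen(feed):
    """Background task: remember the most recent message from `feed`,
    a stand-in for a websocket subscription."""
    global last_message
    async for msg in feed:
        last_message = msg

async def get_last_message():
    """Stand-in for the HTTP GET handler."""
    return last_message

async def demo():
    async def fake_feed():
        for msg in ("a", "b", "c"):
            yield msg
    await listen(fake_feed())       # in a real app this runs as a background task
    return await get_last_message()

result = asyncio.run(demo())  # Python 3.7+ spelling
```

In a real server you'd start `listen()` with `loop.create_task()` alongside the HTTP handlers, rather than awaiting it to completion.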

You normally use something like redis to store the state.

If you were going to share state in memory between threads, how would you handle the case where the second request goes to a different server or that the process has restarted? You'd need redis anyway, so you might as well just use it in all cases.

I get that everyone's responses assume some big public thing. I'm thinking of a small toy implementation for my home network.

The toy experiment is how to do what's trivial in Node with Python. Mainly because I like working with python. I think the answer might be: Python is the wrong tool for the job.

Erm, no. You can do shared thread storage in Python; it's just that it doesn't really scale. I've done it for small daemons without significant hassle, and even wrote my own Go-like CSP helper: https://github.com/rcarmo/python-utils/blob/master/taskkit.p...

The problem is that accessing shared state concurrently in a multi-process context is a non-trivial problem, so specific software emerged that handles these problems for you.

The simplest solution is to use a small DB system like sqlite. It is built into Python (import sqlite3) performs reasonably well and you do not have to run an additional service.

Now if a small DB like sqlite already feels like overkill to you (and it really is simple and small), you might not need concurrent access either, so the simplest solution is to just use a file where you store your state.

I'd say the simplest solution is a global (or just shared) variable using a thread-safe container like queue.Queue. There's also multiprocessing.Queue, which supports sharing the queue across multiple workers.
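A toy sketch of the `queue.Queue` approach, with a sentinel value to shut the worker down:

```python
import queue
import threading

q = queue.Queue()  # thread-safe FIFO from the standard library
results = []

def worker():
    while True:
        item = q.get()
        if item is None:  # sentinel: time to shut down
            break
        results.append(item * 2)  # stand-in for real work

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    q.put(n)
q.put(None)
t.join()
```

All the locking lives inside `Queue`, so the producer and the worker never touch shared state directly.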

Having multiple nodejs processes is the same thing as having multiple python processes w/ regards to sharing state.

What you're referring to works equally well in the single-process case for both.

This maybe seems complicated because Node has 1 obvious way to run (single threaded with asynchronous functions) but Python has a few ways (single threaded, multithreaded, ioloops kind of like Node, greenlets).

Python is excellent for toy implementations, and real ones too in many cases.

Is multithreading really necessary for a toy?

It probably is for long-polling or websockets?

https://github.com/mkj/wort-templog/blob/master/web/templog.... is my not-quite-toy example - a single process runs from uwsgi with Bottle (like Flask) and gevent. The long polling waits on a global Event variable that's updated by another request, nice and simple.

Is that really something you want to do in-memory? Once you have to start multiple worker processes or application servers behind a load balancer, you'll have to re-implement it with some sort of shared persistent store like Redis.

Not really, you can use a multiprocessing.Manager to share a plain old dict or list across multiple worker processes: https://docs.python.org/3/library/multiprocessing.html#shari...

I think the OP was talking about different VMs behind a load balancer where there is no shared memory at all.

Wasn't my impression (using worker processes in the same machine is common), but fair enough. On the other hand, message passing across machines is overrated. We run a SaaS service on 25 VMs with no communication between them for regular operation.

Flask lets you share state between requests; just have the route methods reference some global variable. You could also apply the route decorator to an instance method (although probably not using the decorator syntax).

You can use caching to mimic this behaviour in Flask.


I'm not sure how this works with multiple threads though, I imagine you would have to synchronize it yourself.
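One stdlib-only way to handle that synchronization yourself is to wrap the dict in a lock (a sketch, not Flask-specific):

```python
import threading

class SimpleCache:
    """A tiny thread-safe cache: a lock serializes access to the dict."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

cache = SimpleCache()
cache.set("last_message", "hello")
```

Like any in-process cache, it only works within a single worker process; across processes you're back to Redis or similar.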

Redis's lpush and rpop plus some naming scheme might suffice. Everything is atomic, so the threading bit is covered.

Your request has stayed on my mind over the past few days, so I put this together for you: https://github.com/pdmccormick/sample-socketio-chat-app


I am a little confused here. What's keeping you from storing your state in a global variable?

Any idea how Sanic compares with Falcon? I read somewhere recently that Falcon was quite fast. I tried out Hug, which is built on Falcon, but only for a small demo app, not done any benchmarking.

The problem with Sanic is that it doesn't implement streaming properly, so it is very easy to kill any Sanic process in 10-20 seconds. Again: any of them.

Would you mind elaborating?

Check the Sanic code: it loads the whole incoming payload into memory before processing it, even for a 404, so I can write a very simple script that would consume all its memory. And you can't really protect a Sanic service with a proxy (nginx).

Been doing the same thing: Sanic is great!

Can anyone recommend a good book to get started on concurrency, with discussions of models, and a few implementations such as golang and python 3.5+?

While I can write this kind of code, I don't feel like I completely understand some of the concepts.

I'm not far, but Seven Concurrency Models in Seven Weeks is pretty good and might fit what you're looking for.


I recommend the book very much. However, it doesn't have a chapter on single-threaded concurrency handling (with event loops, futures/promises, and sometimes even plain callbacks), which is currently in vogue in lots of languages (JS, Python asyncio, Boost.Asio, etc.). So this is something one should look up elsewhere.

Do you have any reading material that you would recommend for single-threaded concurrency?

Concurrency and parallelism is such a huge landscape of difficult problems and complexity that I doubt any such introduction exists. I never found one, anyway.

Oh, that's hilarious.

In their defense, it seems to be an app that demonstrates usage of the library. It also seems to be used for benchmarking. That would explain why the Redis database can be easily flushed through a simple URL.

That's actually a call to a rickroll.

It still flushes the db first?

Hence the ouch.

> Write Fast Apps Using Async Python

When working with Python and Ruby I find 80ms responses acceptable. In very optimized situations (no framework) this can go down to 20ms.

Now I've used some Haskell, OCaml and Go and I have learned that they can typically respond in <5ms. And that having a framework in place barely increases the response times.

In both cases this includes querying the db several times (db queries usually take less than a millisecond; Redis should be quite similar, to the extent that it doesn't change the outcome).

<5ms makes it possible to not worry about caching (and thus cache invalidation) for a much longer time.

I've come to the conclusion that, considering other languages, speed is not to be found in Python and Ruby.

Apart from the speed story there's also resource consumption, and in that game it is only compiled languages that truly compete.

Last point: given the point I make above, and that nowadays "the web is the UI", I believe that languages for high-performance application development should compile both to native and to JS. Candidates: OCaml/Reason (BuckleScript), Haskell (GHCJS), PureScript (ps-native), [please add if I forgot any]

You can get 2-3 ms response time (sans network) with any of Django, Flask and Pyramid. Database queries tend to eat a lot, esp. if the queries are bad (long wait in the DBMS or post-filtering in Python/whichever); sometimes ORMs can eat a fair bit as well. But it's fairly rare to get that low, most pages for me (that I cared about) will take 10-30 ms. Using the correct tools and the right approach is fruitful as always.

> You can get 2-3 ms response time (sans network) with any of Django, Flask and Pyramid.

Wow, never managed to do that. Maybe I have to try it again (last time checked on Django was some years ago).

Truth. The best I can get in Django is 30 ms.

> Paxos.com

I'm confused by the relationship between Paxos, the company, and Paxos, the algorithm. Do the authors of Paxos work for Paxos?



Ah; both are named for a fictional financial system.

By the way, the author of the original Paxos paper is Leslie Lamport, who currently works at Microsoft Research.

The title is misleading. The blog post doesn't cover how fast async Python is; it's a tutorial on how to use their Redis ORM library.

There's a link on the page which digs in a bit more:


And that topic was discussed in HN previously with 130 comments:


>>> The performance of uvloop-based asyncio is close to that of Go programs.

I would prefer standard benchmarks for this. I hope they submit their framework to the TechEmpower benchmarks.


Those benchmarks aren't any more standard than anything else.

Yes, but you can see the largest number of frameworks there, running on the same hardware with the same settings and doing the same job. You can also see the configuration used to achieve that.

Except that various frameworks depend highly on their configuration/version/coding style/Linux configuration/memory used/CPUs used/use case. It's also important that some frameworks behave better when they are warm, and some code behaves differently when you connect with a single client making requests via wrk vs. an aggregate of multiple clients. They still use wrk and not wrk2, their error rate is pretty high, and their framework is not always well behaved.

Besides all that, it's just simple cases that they are testing. I would never ever trust this site or any result they got.

> You get the benefits of a database, with the performance of RAM!

One of the benefits of a modern RDBMS is that it makes extremely sophisticated use of RAM, and of all the levels of fast-to-slow storage below that: SSDs/RAIDs/slow single spindles.

Quite related, but if you want to use Redis as a SQL database I wrote an extension to do just that: https://github.com/RedBeardLab/rediSQL

It is a relatively thin layer of Rust code between the Redis module interface and SQLite.

At the moment you can simply execute statements, but any suggestions and feature requests are very welcome.

Yes, it is possible to do joins, to use the LIKE operator, and pretty much everything else that SQLite gives you.

It is a multi-threaded module, which means that it does NOT block the main Redis thread and performs quite well. On my machine I achieved 50,000 inserts per second for the in-memory database.

If you have any questions feel free to ask here, or to open issues and pull requests in the main repo.


This is pretty neat. I've been using a plain Redis wrapper (aioredis) with uvloop and Sanic (https://github.com/rcarmo/newsfeed-corpus), but I'm going to have a peek at subconscious.

>One of the common complaints people have about python and other popular interpreted languages (Ruby, JavaScript, PHP, Perl, etc) is that they’re slow.

Then proceeds to show an animation of posting a blog post that performs no faster than if it were built using Django.

> 10k pageviews took ~41s

Might be that the server is insanely slow, but I would have no problem reaching 10k page views per second with some basic PHP and even MariaDB on a low-end E3-1230 server. Pretty sure more would be quite easy, too...

It seems strange that they would claim that Python's libuv based event loop is twice as fast as Node.js's libuv based event loop. There's some context missing to that statement or it's flat out false.

What does that even mean? The event loop is only used when there is nothing going on. Is it faster at doing nothing?

> The event loop is only used when there is nothing going on.

In async applications, the event loop is what actually executes your code and performs IO. In essence, event loops are under load all the time.
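For example, every await hands control back to the loop, which interleaves the pending tasks; a minimal sketch:

```python
import asyncio
import time

async def worker(name, delay):
    # await suspends this coroutine; the event loop runs other tasks meanwhile.
    await asyncio.sleep(delay)
    return name

async def main():
    # The loop interleaves both workers: total wall time is ~0.2s, not 0.3s.
    return await asyncio.gather(worker("a", 0.2), worker("b", 0.1))

start = time.monotonic()
results = asyncio.run(main())  # 3.7+; on 3.6 use loop.run_until_complete(main())
elapsed = time.monotonic() - start
print(results)  # ['a', 'b']
```

The loop isn't idle while the coroutines "sleep": it is polling for ready IO and resuming whichever task is runnable next.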

If you want performance don't use Python.

I hope the downvotes are not due to people thinking you can actually write high performance applications in Python.

Sadly true. Python is great for scripting, but "high performance Python" is frequently a challenge better suited to other tools.

"High performance Python" is usually done by "offloading literally everything to native extensions" :D

Until JIT authors start taking advantage of the frame evaluation API (PEP 523) that was added in 3.6.

This is to get a high performance ready app out. You could probably get an app out faster in PHP or Meteor or other prototyping framework.
