Can you use Django with those optimisations or are they good mainly for scientific computing?


You can certainly use it, but whether you see any benefit is going to strongly depend on your workload. If you're doing significant calculations in the API then it might be considerably more performant, but if your API is primarily retrieving things from the database and transforming them to JSON then you're going to be limited mostly by database latency, so I wouldn't expect major improvements.


If you are fetching lots of data (not even 'big data', just a few thousand rows) using the Django ORM, you will see a performance difference when using pypy, or at least I did a few years ago. The database can happily return a few thousand rows very quickly, especially if you take care to optimize your queries and have good indexes.

Converting a few thousand rows to python/django objects takes _time_. I can't quantify anything, because it's been too long, but I remember it being fairly significant. When I profiled it, the majority of the time was spent calling __setattr__ a few million times.

Like you said, it depends on your use case. If your queries are slow, then optimize your database queries. But if your queries are fast and your responses are still slow, then investigating pypy is definitely worth it. You can also play around with .values_list or something in Django, so that you get 'raw values' instead of objects (but there's still a cost to building them up).
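
For anyone curious, here's a rough sketch of the difference, using a hypothetical Reading model (the names are made up, and the exact cost depends on your schema and interpreter):

    # Hypothetical model: Reading(sensor_id, value, taken_at)
    from myapp.models import Reading  # adjust the import to your project

    # Full ORM objects: every row pays for model instantiation,
    # i.e. __init__/__setattr__ per field (the hot spot in the profile above)
    objs = list(Reading.objects.filter(sensor_id=42))

    # values_list() skips the model layer and hands back plain tuples,
    # which is usually noticeably cheaper for a few thousand rows
    rows = list(
        Reading.objects.filter(sensor_id=42)
        .values_list("sensor_id", "value", "taken_at")
    )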


And yet the same supposedly "io bound" workloads (like "parse request, fetch something from DB, return it as JSON") still have widely different performance characteristics across languages, with 10x to 100x differences in requests handled per second...


2x, yes. If you’re seeing 100x you’re comparing different things, like creating and serializing complex objects versus simple types, or using a JSON parser which loads an entire document into objects versus one which only retrieves specific values.
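
To make the first part concrete, here's a toy illustration of the "complex objects versus simple types" distinction (standard library only, all names invented for the example; both paths produce the same JSON, but they do different amounts of work):

    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class User:
        id: int
        name: str

    users = [User(i, f"user{i}") for i in range(1000)]

    # "complex objects": build instances, convert them, then serialize
    payload_slow = json.dumps([asdict(u) for u in users])

    # "simple types": skip the object layer and serialize plain dicts
    payload_fast = json.dumps([{"id": i, "name": f"user{i}"} for i in range(1000)])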


Well, perhaps not 100x in rps, but 10x for sure:

The overall top-performing frameworks (JS, Java, and Rust) come in around 650K rps. That's over 7x the top Python-based framework at 86K rps.

And another very popular Python framework (Flask) only gets to 2K rps. That's 325 times worse than the best.

And that's the "single DB query" benchmark: https://www.techempower.com/benchmarks/#section=data-r21&tes...

https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Proj...


At my old gig we ran Django and FastAPI with pypy. I don't remember there being too many issues. One thing is that pypy versions lag behind official python, so if you're using bleeding edge stuff, it won't be supported in pypy yet.

That was a couple of years ago at this point, and I've not been in the python ecosystem since then, but I can only imagine things are getting better in that regard rather than worse.


Have a look at Cinder - https://github.com/facebookincubator/cinder - it's Meta's performance oriented fork of CPython that they use to run Instagram (which is a big Django app).


I always wondered with Cinder why they didn't turbocharge PyPy development instead.


With a codebase of any significant size the priority is always to maintain compatibility while improving performance.

If you start with an incompatible, highly performant interpreter, the compatibility "distance" is difficult to measure and could carry unknown performance costs. For example, PyPy's support for CPython C extension modules is limited and slow because of its different object memory layout.


Two main reasons:

* C extensions

* Instagram's forking server model

I gave a talk that touched on some of this last year: https://2022.ecoop.org/details/ICOOOLPS-2022-papers/5/Cinder...

I wonder why the recording is not up...


I had little trouble switching a Django app years ago but the results were mixed. Some complicated views and reports saw a hefty win but most of the app was database-limited and optimized to the point that there was no meaningful difference, except that PyPy used more RAM.


Pypy is great but I didn't find it very useful with Django.

Quick, transactional HTTP exchanges (GET, POST, etc.) aren't really its thing-- there's no time for the compiler to get warmed up; the request is complete before pypy has gotten out of bed.

But if you have to do really complex view rendering (graphs or something) where it would take cpython ~10s or more to process, then pypy will leave cpython in the dust.
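
If you want to see the warm-up effect yourself, a toy benchmark along these lines, run under both interpreters, shows it (a pure-Python loop as a stand-in for heavy view rendering; on pypy the later iterations are typically much faster than the first, while on cpython the timings stay roughly flat):

    import time

    def render(n):
        # stand-in for a complex view: pure-Python number crunching
        total = 0
        for i in range(n):
            total += (i * i) % 97
        return total

    for attempt in range(5):
        start = time.perf_counter()
        render(2_000_000)
        print(f"attempt {attempt}: {time.perf_counter() - start:.3f}s")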


I've never used PyPy, but shouldn't it warm up after the Nth request even if you don't have loops?



