Note that Sam's no-GIL changes are not only about the removal of the GIL (although that was the main goal). There are a number of other unrelated improvements to make it faster. And as far as I know, most of those unrelated improvements have gotten into CPython now.
I'm worried about it. My understanding is that the GIL removal was still slower than with the GIL, but that he added useful unrelated speedups to compensate. If the speedups get into 3.11 but nogil doesn't, will they accept a significantly slower 3.12?
It seems like the core team doesn't share my sense that nogil is the single most important thing that Python needs to do in a world of many-core processors. If nogil is now blocked because it's deemed an unacceptable performance hit vs 3.11, I'll be very, very disappointed.
Most python code in the world is currently single threaded. It makes sense to me to start with the low hanging fruit that affects the majority of existing code and then move on to harder problems.
Sure, but Guido put a pre-condition on GIL removal years ago that it must not cause single-threaded code to suffer a performance penalty.
So Sam came up with a way to ensure that (on balance, if not in every example), we could have same-or-better performance and no GIL. Awesome!
But now we're getting the performance boosts without the GIL removal, and so in the next release, the GIL removal will cause a performance regression unless we can somehow find more performance boosts. It feels like this could just happen forever.
There are many more improvements to be made in single-threaded performance, including things like a JIT. Future improvements are complicated; they won't be a free lunch like this first round has generally been. My feeling is that things like a JIT and nogil will start out as optional and work their way into the ecosystem over time, where the performance implications are less severe for the general use case.
Right. But as you implicitly note, it's a work-around ("escape hatch"), not a feature.
e.g. Unix had processes from day 1, but threads were added (with significant effort) because threads are a better abstraction for addressing a lot of problems. Especially so when your CPU has lots of cores.
The work to make multiprocessing better is certainly useful, and valuable, but it's still a work-around.
It might make sense to have runtime configuration that enables no-GIL mode, or just enable it automatically when importing threading, so that non-threaded workloads don't suffer.
The faster-cpython project isn't done after 3.11. They're working on many other improvements for 3.12, and we'll see other things, possibly a JIT or at least an API to facilitate accelerators, in 3.13+.
I'm not worried about a lack of single thread performance increases for some time.
Yeah, while it's in some regards nice having multiple small deployments of our Python app that can be scaled up and down independently, it's sometimes such a hurdle to do something "extra" in a smaller app. Have a simple webserver and want to do some things out of band? In the JVM I would use a concurrent queue, add tasks to it, and spin up some threads consuming it. In Python you either spin up a new process and deal with the communication problems, become a master of greenlet and Python internals, or most likely default to Celery, which needs its own deployment.
It sometimes boggles my mind how Python is considered the quick and easy way for startups, while at the same time doing trivial things becomes such a hurdle.
> Have a simple webserver and want to do some things out of band? In the JVM I would use a concurrent queue, add tasks to it, and spin up some threads consuming it. In Python you either spin up a new process and deal with the communication problems, become a master of greenlet and Python internals, or most likely default to Celery, which needs its own deployment.
This is probably risky in production anyway, because the load balancer (typically) has no insight into these background tasks and will happily kill a process/pod/etc. that is running one. You should probably dispatch the workload to an external task runner (e.g., Lambda or a Kubernetes Job or similar) unless you really don't care if the background task gets killed mid-flight. (I've had a few dev teams ignore these warnings and then blame infrastructure when their background jobs occasionally got killed mid-flight.)
Yes, that is true. My case here was about pushing some metrics somewhere, where losing some wasn't an issue, but I didn't want to block the main thread doing it.
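For that kind of lossy, I/O-bound side work, the stdlib queue-plus-worker-thread pattern from the JVM example does translate to Python. A minimal sketch, assuming a hypothetical push_metric helper and metrics endpoint (neither is a real API):

    # Fire-and-forget metrics worker; push_metric and METRICS_URL are
    # placeholders for illustration only.
    import queue
    import threading
    import urllib.request

    METRICS_URL = "http://metrics.internal/ingest"  # hypothetical endpoint
    metrics_q: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)

    def push_metric(metric: dict) -> None:
        """Enqueue without blocking the caller; drop if the queue is full."""
        try:
            metrics_q.put_nowait(metric)
        except queue.Full:
            pass  # losing some metrics is acceptable here

    def _worker() -> None:
        while True:
            metric = metrics_q.get()
            try:
                req = urllib.request.Request(METRICS_URL, data=repr(metric).encode())
                urllib.request.urlopen(req, timeout=2)
            except OSError:
                pass  # best effort: swallow network errors
            finally:
                metrics_q.task_done()

    threading.Thread(target=_worker, daemon=True).start()

The catch is that this only stays cheap while the background work is I/O-bound; for CPU-bound work the worker thread competes with the main thread for the GIL, which is what the rest of this discussion is about.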
But a single Java app on some EC2 instance could do what you'd need 10 "apps" and possibly a complicated k8s deployment to handle with Python. Some of that is just because the raw performance of Python is far worse, but most of it is because of cases like this, where simple things can't be shared. So there's a lot of unnecessary complexity compared to "old and verbose" Java.
Another example is Prometheus metrics. The current app I'm working on doesn't have a webserver, which makes it really awkward in Python to add Prometheus. Since adding an endpoint to my app creates a new process and needs to be deployed almost as a sidecar, there is no smooth way to actually get the metrics from my main app to the endpoint that can be scraped, since they don't share the same process.
It's a huge hurdle in any scenario where you have a large, static dataset from which you want to derive computation-heavy results. Multiprocessing generally requires making copies of that data, which can be impossible. The workarounds (e.g. mmap) take an order of magnitude more engineering effort.
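For a concrete sense of what the workaround looks like, here is a sketch using the stdlib multiprocessing.shared_memory module (Python 3.8+) and NumPy to expose one flat array to workers without copying it; anything more structured than this takes far more effort, which is the complaint above:

    # Share one large read-only array across worker processes without copying.
    import numpy as np
    from multiprocessing import Pool, shared_memory

    SHAPE, DTYPE = (10_000_000,), np.float64

    def worker(args):
        shm_name, start, stop = args
        shm = shared_memory.SharedMemory(name=shm_name)   # attach, no copy
        try:
            data = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
            return float(data[start:stop].sum())          # compute on a slice
        finally:
            shm.close()

    if __name__ == "__main__":
        shm = shared_memory.SharedMemory(create=True, size=10_000_000 * 8)
        data = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
        data[:] = np.random.random(SHAPE)                 # build the dataset once

        chunks = [(shm.name, i, i + 2_500_000) for i in range(0, 10_000_000, 2_500_000)]
        with Pool(4) as pool:
            print(sum(pool.map(worker, chunks)))

        shm.close()
        shm.unlink()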
I agree 100% (I come from a "use all the cores with threads and shared memory to get linear speedups on many workloads" world). But I think at this point the Python leadership has committed to a pattern of "every time GIL removal looks plausible, find some more single-core speedups to stave off the change". Put another way: I've given up on the hope that Python will ever multicore the way I think it should multicore.
I'd be tickled if Python would singlecore the way I think it should singlecore. Unfortunately, it remains considerably slower even than JavaScript (never mind Go, Java, etc). There are a lot of bad choices that can't easily be undone without breaking compatibility for a handful of packages, and the Python leadership seems unable or unwilling to guide the community through those changes.
I had a similar problem last month where I needed to compute some results on multiple datasets (with a little bit of concurrency per dataset). In the end I just ported the code to Ada, which makes it very easy to create task hierarchies. You can define variables of task types on the 'stack', and the program only leaves the enclosing block, letting the variables go out of scope, after every task has finished. I can now easily put all cores to maximum use. The list comprehensions from the Python version became a little more verbose, though, because I had to use an older revision of Ada (2012) instead of the new 2022 revision.
I'm particularly happy to see the django_template test showing so much improvement. The implementation of Django templates has been a hot topic for speed improvements for a long time (and why many people jump to Jinja). It's a good example of how "complex" Python will be improved by these optimisations.
Two interesting ones to compare are pickle_pure_python and json_loads (both doing (de)serialisation activities): the former shows a large improvement, the latter much less so. Again, it shows how the improvements are coming to complex pure-Python code: json.loads is mostly C, whereas the pickle_pure_python test runs against the pure-Python pickle implementation.
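A rough, machine-dependent way to see why those two benchmarks behave so differently is to time C-backed json.loads against the pure-Python unpickler that CPython keeps alongside the C-accelerated one (pickle._Unpickler is a private but long-standing name; this is only an illustration, not the pyperformance harness):

    # The absolute numbers don't matter; the pure-Python path is the one that
    # benefits most from the 3.11 interpreter speedups.
    import io
    import json
    import pickle
    import timeit

    data = {"users": [{"id": i, "name": f"user{i}", "tags": ["a", "b"]} for i in range(1000)]}
    json_blob = json.dumps(data)
    pickle_blob = pickle.dumps(data)

    def pure_python_unpickle():
        return pickle._Unpickler(io.BytesIO(pickle_blob)).load()

    print("json.loads (C)       :", timeit.timeit(lambda: json.loads(json_blob), number=200))
    print("pure-Python unpickle :", timeit.timeit(pure_python_unpickle, number=200))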
It would be interesting to see a benchmark based on the Django test suite; that would probably be quite indicative of the improvements we can expect in real-world Python web servers.
How this happened with PHP was interesting to watch from the side. The competition that came from Facebook's Hack/HHVM seemed to be the catalyst, and helped set expectations for some benchmarks.
I remember Perl 6 (now Raku) used to advertise itself as "the first 100 year programming language". I wonder if actually that could be Python. It seems like the update process has a lot of "taste", in that it adds features that are really useful and yet the syntax still looks clean and intuitive.
"the first 100 year programming language" is definitely JavaScript. (If thinking in terms of Perl/Python like scripting and versatility, otherwise clearly C)
Depends on what you mean by "100-year language". If it's in the pg sense (i.e. a language that's on the main branch of the evolutionary path of programming languages) I think the answer is clearly the ML family.
If you mean "what language is most likely to be around in 2100 AD", realistically we'll have C for as long as we'll have computers.
I was thinking Cobol would probably be first, since it's at least 63 years old already, but apparently Lisp is technically a year older, and Fortran a year older than that. I'm not sure Fortran will last as long as Cobol/Lisp, though: it's losing out to Python on simplicity, and is likely to bleed a lot to C/C++/Rust for science libraries and things that require faster glue code.
C 1972, C++ 1985, Erlang 1986, Perl 1987, Python 1991, JavaScript 1995.
C is half-way to 100: an amazing fact in the rapidly-changing world of technology. Of all these, Perl seems to have suffered the biggest loss of popularity so far, while Erlang has the least popularity to lose.
I'd be somewhat surprised if any of these are still in common use in 2072, but OTOH, I'd also be surprised if any of them were completely unused, with the possible exception of Perl.
Thinking about which languages might still be prevalent at the end of this century, Lisp and Prolog (in their respective niches) would be my candidates. Maybe also some form of C/C++, and Bash. Everything else I wouldn’t be so sure about. Languages whose typical software projects have a high turnover (like JavaScript) or are based on a virtual machine (JVM, CLR) are less likely to persist.
Well, those are at most op-codes. The language is significantly more complex and everything is extremely context sensitive. Plus, it requires one hell of an interpreter to get it to work.
Then again, if we're comparing it to Perl 6/Raku...
Unfortunately, there are no signs that CPython is moving towards integrating a JIT after all these years and despite its massive popularity. There's therefore no clear path towards substantially improved CPython performance, and it remains among the slowest mainstream languages, really the slowest among its top group of 4-5 most popular languages.
There's a lot of performance to be gained in CPython by improving the runtime. Also, one could argue that the quickening pass[1] they've added in 3.11 is a proto-JIT. Baby steps, I guess.
In the past, I've seen quite a few attempts at bolting simple template JITs onto the CPython interpreter loop[2], with lacklustre results. They'll eventually need a JIT, once all the easy wins in runtime perf have been exhausted.
OTOH, I'm glad that runtime performance is finally getting the attention it deserves from the CPython core devs. This wasn't the case just a few years ago.
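If you want to see the quickening/specializing pass mentioned above in action, CPython 3.11's dis module can display the adaptive instructions once a function has warmed up (a rough sketch; the add function is just an example):

    # Run a function enough times for the specializing adaptive interpreter to
    # kick in, then disassemble with adaptive=True: generic opcodes like
    # BINARY_OP should be replaced by specialized forms such as BINARY_OP_ADD_INT.
    import dis

    def add(a, b):
        return a + b

    for _ in range(10_000):  # warm-up so the instructions get specialized
        add(1, 2)

    dis.dis(add, adaptive=True)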
JIT has been mentioned in the faster-cpython work, though. But you're right to be skeptical: they are putting it off too ("it's not 3.11 and maybe not for 3.12"), and it's not something that can be added easily between two releases.
I don't think it's that much of a problem. Python fills a niche where performance is not a huge issue. When you compare languages, you look for a balance between how much development costs, what's your time to market, and how much money you'll spend in hosting. Time to market and cost of development usually are more important than hosting costs unless you know you'll be deploying to hundreds or thousands of machines. Most apps never grow that large.
I think that's actually "actual parallelism", not concurrency. Concurrency is multiple processes making forward progress using a single processor and some form of blocking/switching, while parallelism takes advantage of multiple cores. Personally, I consider concurrency a degenerate form of parallelism limited to multiplexing processes on a single core.
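In Python terms, a hedged sketch of the distinction (busy is just an illustrative CPU-bound function; timings are machine-dependent): threads give concurrency but, under the GIL, no parallelism for pure-Python CPU work, while processes give actual parallelism.

    import time
    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    def busy(n: int) -> int:
        # Pure-Python CPU-bound work: holds the GIL the whole time.
        total = 0
        for i in range(n):
            total += i * i
        return total

    def timed(executor_cls) -> float:
        start = time.perf_counter()
        with executor_cls(max_workers=4) as pool:
            list(pool.map(busy, [2_000_000] * 4))
        return time.perf_counter() - start

    if __name__ == "__main__":
        print(f"threads  : {timed(ThreadPoolExecutor):.2f}s")   # roughly serial under the GIL
        print(f"processes: {timed(ProcessPoolExecutor):.2f}s")  # spreads across cores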
A lot of the credit for these improvements goes to that GIL-removal effort. I very, very much hope we get it, or a well-supported branch with those changes.
I believe some of these improvements were part of the same work to eventually remove the GIL (global interpreter lock), albeit not tied to it. However, the final decision on removing the GIL hasn't been made.
Ah yes, this is what I thought was the current state. I was curious if a GIL removal decision had been made that I had missed, but it seems like it's still a soft "maybe":
Edit: And to be fair, you can do "true" concurrency in Python; you just have to eject to multiprocessing (with the tradeoffs that implies), so I didn't want to assume the comment was about GIL removal.
There was an experimental branch where someone reimplemented Python 3.9 sans the GIL and managed to maintain performance parity with the GIL'ed version. It's likely this performance uplift comes from the various optimisations that PR implemented.
Radical, no; that would require adding a JIT, which is planned for 3.12 AFAIK. But there are two performance improvements after the release of the first beta already: https://speed.python.org/
I've been following which PRs the faster-cpython people merge, and it's been nothing on the beta branch, I think. For example, this PR is not getting merged, it seems: https://github.com/python/cpython/pull/93379
Install dependencies with apt (there are a few popular pages on Google for this). Download the tarball. Configure. The final step is ‘make altinstall’, and then you can use it standalone. If you already use virtual environments, you would then recreate your venv and the rest of your experience would remain the same. I'd share actual syntax but I'm on mobile right now.
Alternatively a tool like pyenv makes this much easier.
It means to run the “./configure” script first, which is normally present in the unpacked tar file. Also, on Debian, to compile programs, you usually first have to install the “build-essential” package.
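Putting those steps together, a rough sketch for Debian/Ubuntu (the dependency list and the 3.11.0 tarball URL are examples; adjust them to your distro and to the release you actually want):

    # Build and install an alternate CPython alongside the system one.
    sudo apt install build-essential zlib1g-dev libssl-dev libffi-dev \
        libbz2-dev libreadline-dev libsqlite3-dev liblzma-dev
    wget https://www.python.org/ftp/python/3.11.0/Python-3.11.0.tgz   # example URL
    tar xzf Python-3.11.0.tgz
    cd Python-3.11.0
    ./configure --enable-optimizations
    make -j"$(nproc)"
    sudo make altinstall                 # installs python3.11 without touching python3
    python3.11 -m venv ~/venvs/myapp     # then recreate your venv against it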
https://mail.python.org/archives/list/python-dev@python.org/...