The blog post mentions it brings a 1-5% perf improvement, which is still significant for CPython. It does not complicate the source, because we use a DSL to generate CPython's interpreters. So the only complexity is in autogenerated code, which is usually meant for machine consumption anyway.
The other benefit (for us maintainers, I guess) is that it compiles way faster and is more debuggable (perf and other tools work better) when each bytecode is a smaller function. So I'm inclined to keep it for both perf and productivity reasons.
Being more robust to fragile compiler optimizations is also a nontrivial benefit. An interpreter loop is an extremely specialized piece of code whose control flow is too important to be left to compiler heuristics.
If the desired call structure can be achieved in a portable way, that's a win IMO.
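The structure being described — one small function per bytecode, with the main loop reduced to dispatch — can be illustrated with a toy Python sketch. This is not CPython's actual generated C code, and the opcode names and VM layout here are invented for illustration; it only mirrors the shape that makes each handler small, separately compilable, and easy for tools like perf to attribute.

```python
# Toy sketch of a "one small function per bytecode" interpreter.
# Opcodes and the VM dict are made up; CPython's real handlers are
# C functions generated from a DSL, not Python like this.

def op_push(vm, arg):
    vm["stack"].append(arg)

def op_add(vm, arg):
    b = vm["stack"].pop()
    a = vm["stack"].pop()
    vm["stack"].append(a + b)

def op_print(vm, arg):
    vm["out"].append(vm["stack"].pop())

# Dispatch table: the loop body stays tiny; all logic lives in handlers.
HANDLERS = {"PUSH": op_push, "ADD": op_add, "PRINT": op_print}

def run(code):
    vm = {"stack": [], "out": []}
    for opcode, arg in code:
        HANDLERS[opcode](vm, arg)  # one small function per bytecode
    return vm["out"]

# (2 + 3) -> "prints" 5 into vm["out"]
print(run([("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PRINT", None)]))
```

In the real interpreter the analogous C handlers chain via guaranteed tail calls rather than returning to a loop, which is exactly the call structure the comment above hopes can be achieved portably.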
Back in 2022 there was a plan for a 5x overall speedup that was looking for funding. Then a team with Guido and others involved (and MS backing?) got on the same bandwagon and made some announcements about speeding up CPython a lot.
Several releases in, have we seen even a 2x speedup? Or more like 20% at best?
Not trying to dismiss the interpreter changes - I mainly want to know whether those speedup plans were even remotely realistic, and whether anything close to even 1/5 of what was promised will really come out of them...
The Faster CPython project aimed for 5x over the 3.10 baseline. CPython 3.13 currently runs at something like 1.6x the speed of 3.10. With the JIT enabled it goes up by a few more single-digit percentage points. With the changes in 3.14 it'll be something like a 1.8x speed-up.
So it's slowly getting there. I think the Faster CPython project was mostly built around the idea that the JIT can get a lot faster as it starts to optimise more and more, and that only just shipped in 3.13, so there's a lot of headroom. We know that PyPy (an existing JIT implementation) is already close to 5x faster than CPython a lot of the time.
There's also now the experimental free-threading build, which speeds up multithreaded Python applications (not by a lot right now, though, unfortunately).
There have been a lot of speedups coming out of that project. It's not being done in one go, though, but spread out over the last several releases. So while individual speedups may not look that significant, remember that they compound on the previous release's speedup.
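The compounding point is easy to check with a few lines of arithmetic. The per-release factors below are made-up round numbers, not measured CPython results; they're chosen only so the product lands near the ~1.6x cumulative figure mentioned elsewhere in the thread.

```python
# Modest per-release gains multiply rather than add.
# These factors are hypothetical, for illustration only.
per_release = [1.25, 1.12, 1.15]  # three releases' individual speedups

total = 1.0
for factor in per_release:
    total *= factor

# Roughly 1.6x overall, even though no single release exceeded 25%.
print(f"cumulative speedup: {total:.2f}x")
```

Additively those gains would suggest ~52%; compounded they come to roughly 61%, which is why several unimpressive-looking releases can still add up.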
> His academic and commercial work is focused on compilers, virtual machines and static analysis for Python. His PhD was on building virtual machines for dynamic languages.
This dude looks God-level.
Half-joking: Maybe MSFT can also poach Lars Bak of Google V8-fame.