> “We need to lay out a plan of how to proceed,” remarked Pablo Galindo Salgado. “Just creating a PR with 20,000 lines of code changed is infeasible.”
One of the saddest things about popular open-source projects like Python is how inevitably the maintainers become worn down and jaded over time, due to constantly filtering and protecting Python from dumb or nonsensical ideas.
Then when something like this comes along, a true game changer, maintainers have no energy or goodwill left to collaborate on these infinitely important initiatives, like removing the dogdamn GIL. This is way more significant than everything I'm aware of that's happened since the birth of the language. All other things have been trivial minutiae in comparison.
It bears repeating: "everything else has been trivial minutiae in comparison to a GIL-ectomy for Python."
It cannot be overstated, removing the GIL will be a HUGE deal for Python! It's an ugly wart which has existed for about 30 years, and nobody else has produced and delivered a working solution to the community.
I wish Team Python would welcome GIL Eradication with open arms and a supportive attitude. This would look like focusing on helping identify and implement solutions to the impediments rather than simply pointing out problems and then helicoptering away.
> There was also a large amount of concern from the attendees about the impact the introduction of nogil could have on CPython development. Some worried that introducing nogil mode could mean that the number of tests run in CI would have to double. Others worried that the maintenance burden would significantly increase if two separate versions of CPython were supported simultaneously
The 2021 workstation I am typing this on has 16 physical and 32 virtual cores, and I expect core counts to continue upward this decade. While the CPython core devs may be excellent programmers and do a good job at maintaining Python, they clearly have a bit of trouble with cost/benefit analysis if the complaint is that the test count will double in CI for a ~16X increase in the amount of code that can be run in parallel on even consumer machines. Yes, yes, I know that this does not mean my programs will run ~16X as fast. Yes, I know there are other objections. But this is an order of magnitude away from being an actual blocker and the fact it was brought up at all as an objection shows a fundamental disconnect between the small potential costs imposed on core devs, and the huge potential upsides for Python users everywhere.
> they clearly have a bit of trouble with cost/benefit analysis if the complaint is that the test count will double in CI for a ~16X increase in the amount of code that can be run in parallel on even consumer machines
You can already use multiple cores by writing C extensions that release the GIL and with multiprocessing. That doubled CI cost and additional work isn't just for core Python but for the entire ecosystem.
Let’s not forget that writing C extensions is the easy part.
The even more sucky part is to distribute them and make sure they work on every OS, without every user having to apt install build-utils before they can pip install your package and then spend the rest of the day debugging some rare compilation error because of a header file mismatch with what’s installed on the system. The Python packaging space is already complicated enough even without considering native modules.
The amount of additional complexity from a C extension is dramatic in any sort of Python application, and the performance hit from having to pickle all objects that you would like to share between processes when using multiprocessing is significantly non-trivial.
True shared memory threads within a single process would be a major boon.
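To make that cost concrete, here is a rough, hedged sketch (my own illustration, not from the thread) of the serialization tax: sharing a large object with a worker process means a pickle/unpickle round-trip, whereas threads in one process simply share the reference. The payload and numbers are purely illustrative.

```python
# Rough illustration (illustrative payload, numbers vary by machine): the pickle
# round-trip a large object pays every time it crosses a process boundary.
# Threads in a single process would simply share the reference instead.
import pickle
import time

payload = {i: list(range(50)) for i in range(200_000)}  # a largish nested object

start = time.perf_counter()
blob = pickle.dumps(payload)          # what multiprocessing does when sending
restored = pickle.loads(blob)         # what the worker does when receiving
elapsed = time.perf_counter() - start

print(f"pickle round-trip: {elapsed:.2f}s for {len(blob) / 1e6:.1f} MB")
```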
a) This only works for some applications, and is a very bad fit that does nothing for others. Namely, this only works if 1) you have few hot code locations, 2) that code uses data structures which can be feasibly mogrified into native data structures with Python accessors, and 3) the granularity is coarse, meaning few invocations doing lots of work each. If any of these conditions aren't met, "native extensions invoked by Python threads" tends to be ineffective but maintenance-intensive.
b) Introducing native extensions means deployment and distribution becomes more difficult, and introduces a whole new and large class of issues and caveats into a project.
c) Native extensions are not written in Python. (Yes, Cython exists, no, it's usually not a good idea to write more than glue in it).
1. Learning a new language is non-trivial for many people (and don't sneer - it's about time not competency)
2. The ecosystem matters. If the code you want to interface with is in Python then "don't use Python" is just glib.
I'm mainly proficient in JavaScript, Python and C#. My choice of language is rarely based on "which is best for this task?" but mostly "what do I need to run on and interface with?"
Regarding point 1: if we are already at the point where you have to write a C extension to work around language lossage, we can already assume you know C. At that point it is easier to just rewrite in another language than to deal with FFI.
Not everybody agrees with you that it's a worthy goal. Personally I have little desire to see it removed, doubt its removal will be achieved and tend to think the various attempts to do it are misguided.
I could imagine getting pretty jaded as a python maintainer having to keep pointing out why the latest attempt won't work. I think the onus is on those who want this to demonstrate that it can be done successfully and lay out a plan as to how it can be put into the real world without causing chaos in a python ecosystem that has only relatively recently got over the trauma of python 3 (which is the bit I doubt is possible).
Edit: To be clear, I don't think anything that consists of a set of proposed changes to the cpython interpreter is even beginning to attempt to think about the implications for the ecosystem, which is where the actual challenges are, and I'm assuming that's what the "Just creating a PR" comment is saying.
FTA: Overall, there was still a large amount of excitement and curiosity about nogil mode from the attendees.
⇒ I don’t see this as a team of worn-down maintainers, but rather as a team of seasoned developers, who know you don’t turn a project with millions of users around on a dime.
One question I don't see addressed is whether the set of locks this change introduces is the optimal one.
I would think that, if it were to be improved upon in the future, C extensions would have to be changed again, so you’d rather not do that often.
> who know you don’t turn a project with millions of users around on a dime
Hasn’t this gil/nogil saga been going on for ~decades? What about fixing packaging or limiting the C-extension interface? I have enormous sympathy for what the Python maintainers are up against, but “on a dime” seems hard to justify.
Python nerfing C extensions would be like McDonald's deciding to de-emphasize burgers on their menu. Sure, other foods are healthier and get more positive buzz on Twitter. But it turns out that the burgers are still the main thing bringing people to the restaurant.
I wasn’t talking about nerfing C-extensions. You can get the same power out of a smaller, well-defined interface that would give the interpreter a lot more leeway for optimization. These optimizations would in turn make native Python packages more feasible, making the ecosystem less dependent on C extensions. Anyway, I don’t think people use Python because of the C-extension packages (although there are certainly some exceptions, e.g. pandas); I think people use C extensions because the native performance is so poor.
This analogy is, I think, also favourable for the sea change: because we need to cut down on beef use for climate reasons, and unless we plan to crash the climate, the end of bargain-priced meat means that McDonald's needs to switch from meat to other kinds of burgers.
"It cannot be understated, removing the GIL will be a HUGE deal for Python! It's an ugly wart which has existed for about 30 years, and nobody else has produced and delivered a working solution to the community."
If it's such a huge game changer why don't some of the large enterprises which rely on Python fund this work?
Remember when it was discovered that arc welders could produce Carbon 60, and there was a huge run on arc welders? All of a sudden there was a whole new class of customer beyond the usual customer base of people who do welding: people who do materials science research.
I would hazard a guess to say that 99% of production Python is people doing the equivalent of welding, but there is also this 1% who want the GIL to be gone so they can repurpose Python as a tool for a whole new class of problem solving.
It will be an exciting future for them.
For us welders, we’ll carry on gluing stuff together blissfully unaware of the GIL. Async and non-blocking IO are great. I don’t think I’ve ever needed compute-concurrency, not in Python anyway.
You may think you don’t need it, but I suspect if you program larger-than-trivial Python applications there comes a point where you do, even if you don’t think about how the GIL is restricting you. Think about the load time, for example, of something that does a bunch of work to initialise. Without the GIL it would be relatively trivial to speed up the parts that are independent, while with the GIL you are usually out of luck because you can have either shared memory (e.g. previously loaded state) or concurrency, but not both at the same time. From experience, trying to serialise and use multiprocessing often eats up all the potential gains.
I would love to parallelize a plugin script in Cura, the 3D print slicer. It does a bunch of embarrassingly parallel calculations, and could be made at least 16x faster for me. Because it's a plugin, though, it isn't pickle-able and multiprocessing doesn't work. I managed to make it work in a branch, but only on OSes that can fork processes. On windows, the plugin spawns multiple GUIs because importing the cura package apparently has the side effect of launching it...
If there wasn't the GIL, I could just create a thread pool and be done, and Cura could continue to be a delightful mess. :-)
20-odd years ago, my team solved some performance issues in our desktop app by splitting a couple of tasks into their own threads on a computer with a single CPU. Being able to write threaded code is really useful.
Some programming tasks are so trivial that processing speed is completely uninteresting.
Some are so compute heavy that massive amounts of time is put into tuning them (sort algorithms, fast-fourier transforms, etc).
Most fall somewhere on the spectrum between those extremes. If you can speed up your program by a factor of 20 by adding threads, Python can cover a bit more of that in-between spectrum.
I'm not sure I fully buy this argument. I think programming languages are chosen for projects based on a number of different factors, speed being one of them, but developer productivity being another (and there are certainly many more). Given that there isn't usually going to be one language that is the best choice for every single factor, it seems pretty natural that some teams might pick Python due to productivity being more important but still have some need for better performance, and other teams might not be able to compromise on performance and have to resort to something like Java or Go or C++ but still would be able to iterate faster with something higher level. It's definitely not a given that there are enough potential projects that would get enough benefit from removing the GIL to make it worth it, but it seems silly to claim that anyone who would get any possible benefit from performance would never choose Python for other reasons.
There are enough choices that offer Python's productivity alongside JIT/AOT options, better supported from the community than PyPy.
As for the C, C++, Fortran libraries with Python bindings, any language with FFI can call into them as well.
I would say Python's adoption despite its lacking performance is what is now building pressure to keep specific Python communities from leaving Python and migrating into one of those ecosystems in search of a better mix of productivity/performance, without being forced to use two languages.
Ooh, a Clojure .NET implementation? This is news to me. From a quick look at the website, it looks really good... but how is it in practice? One of the major advantages of F Sharp's .NET integration is that it's developed by Microsoft, as of course they pretty much created the ecosystem from scratch.
But it is being continuously kept up to date by Cognitect, along with active development on a clojure-clr-next version. Maybe it has a bigger population of non public users.
Also: here's another implementation of Clojure on CLR, "Morgan And Grand Iron Clojure Compiler": https://github.com/nasser/magic (see also http://nas.sr/magic/), of which there's a great compiler implementation talk (at Clojure/North) somewhere on YouTube.
Performance is not binary. If you can be productive with Python and get closer to your performance goal, you get the best of both worlds. A lot of people who aren't primarily programmers don't have time to write everything in C++ (or keep up with development in that area). A more performant Python is a huge win for everybody.
Just like language choice isn't binary, there are plenty of options with Python-like productivity and much better performance; it isn't Python vs C++.
This is a tire fire of a thread, it's clear there's lots of confusion about the tradeoffs.
This isn't a case of "x" or "y". There is literally nothing valuable about the GIL, it's an ugly hack of a vestigial appendage. Perhaps the reason I'm familiar with it is because the lack of elegant MP threading in Python perturbed me for years, until I was introduced to Golang.
Python devs generally don't want to use Java, JavaScript, etc. And the Go ecosystem is good but not as rich as Python's.
Anyhow, take care pjmlp. Until our paths cross again I wish you all the best!
a few years back I was curious so I took a self-balancing robot that I had written in C (it did PID in real time to set motors in response to the current angle).
I ported it to Python, with a totally naive simple single threaded loop. It worked perfectly. 25 updates a second, forever. No C code, except the interpreter doing its thing, and some GPIO code.
More than 5x can be achieved with the GIL included (and has been promised by the new Microsoft Python speed-up initiative and that Python dev who made a similar proposal).
> I don’t think I’ve ever needed compute-concurrency, not in Python anyway.
I have. Some very inexperienced “senior engineers” at my last gig thought it would be fine to build an analytics platform in Python because “Pandas will make it fast”. Unfortunately even modest datasets would cause timeouts and even seize the event loop, while a naive Go prototype would finish in a few hundred milliseconds.
Here's the text of a patent describing the process. [0] Googling "fullerene" will get you more useful hits than C-60, so, e.g. "generating fullerene from arc welding."
I believe the original paper on Buckminsterfullerene aka Bucky balls aka C60 described the approach. Basically do violent stuff with carbon and stick it in a mass spectrometer. The violent carbon bit is arc welding, because the arc welding tips are made of graphite.
Wikipedia informs me that the first papers used a laser (https://www.nature.com/articles/318162a0 - "graphite has been vaporized by laser irradiation, producing a remarkably stable cluster consisting of 60 carbon atoms") (1985) and the arc welding approach was a few years later. The paper "Solid C60: a new form of carbon" (1990) paper describes using benzene to dissolve the soot to extract the C60 (https://heyokatc.com/pdfs/MISC/Solid_C60-_a_new_form_of_carb...).
Sam Gross works for Meta, and to my knowledge, this is his #1 project. So in effect, one of the largest companies depending on Python is funding this work.
I don't know if this is still true, but a few years ago there was only a single Python core developer that worked on it full time (Victor Stinner from Red Hat). The rest were only able to devote somewhere between few hours to one day a week.
> If it's such a huge game changer why don't some of the large enterprises which rely on Python fund this work?
Because lots of code in python relies on the GIL to act as a synchronization primitive for threaded code (either explicitly or implicitly) and removing the GIL will break that code in subtle ways and there’s no appetite in those companies to fix that.
It doesn't sound like the reservations of the people in charge of Python are about funding; I don't think it's as simple as "pay them $1m to merge the PR and then pay them salaries as they maintain it". Unless the assumption is that the people in charge of Python would take a bribe, I don't see how large companies could somehow magically make this happen with money short of just forking Python and trying to maintain it themselves, which would burn a ton of goodwill and likely not even win over the community.
Because it's not a huge game changer.
What would you do differently if the GIL was gone?
when people need speed in Python they write C/C++ extensions or use multiple processes/computers
1 - This helps parallelize regular python code too, like the kind that goes into writing web services.
2 - While you'd still write native extensions to get optimum performance for single threaded code, having the thread dispatch layer in python makes parallelizing execution convenient and robust. For example, see the comment from people who maintain scikit-learn below. I'd love to see python get composable task parallelism like Julia where you can freely spawn threads within threads without worrying about their lifetimes.
Well for web dev it would be simpler because you can get rid of workers which in turn reduces the overall memory footprint of applications since the code is shared, the local caches are shared and the connections are shared.
Edit: For backend services it would probably put Python around one order of magnitude away from Go in terms of memory usage to serve the same load, enough for teams not to consider switching.
Writing C/C++ extensions is super hard, and not something > 90% of Python users are willing to pick up. I would love for Python to support a Parallel.ForEach statement as is supported in C#, or for Pandas to support parallel operations.
Here's an example: embedding Python in other applications. Out of the box it's impossible to just start running scripts at will from multiple threads. Which happens to be what our application does, because as you say we want some speed/concurrency, so we have a backend running multiple threads in C#/C++. But on top of that some things are customised in Python, and right now we use IronPython, which has no GIL, so that all just works. Is this a huge game changer? As a whole, no, but for us: yes, it would be awesome if we could just use Python.Net (which is just a layer over CPython) out of the box without having to deal with its GIL.
What if, when people need speed in Python, they have the option to just replace a for-loop with threadpool.map?
Currently, that doesn't do much. Trying to do this with a process pool either works, or becomes a horrible exercise in fighting with pickle or whatever other serializer you pick.
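A minimal sketch of that pattern (my own illustration, not from the thread): swapping a for-loop for a thread pool is trivial today, but under the GIL the CPU-bound work is still effectively serialized; swapping in a process pool instead means the function and its inputs must survive a pickle round-trip, which is where the fighting usually starts.

```python
# Illustrative only: replacing a for-loop with a thread pool. Today this gains
# little for CPU-bound work because of the GIL; with a process pool you'd get
# parallelism, but the function and arguments must be picklable (no lambdas,
# open handles, plugin objects, etc.).
from multiprocessing.pool import ThreadPool

def work(n):
    # stand-in for a CPU-bound calculation
    return sum(i * i for i in range(n))

inputs = [200_000, 300_000, 400_000, 500_000]

# Plain for-loop version:
results = [work(n) for n in inputs]

# Thread-pool version: same results, shared memory, no serialization step.
with ThreadPool(processes=4) as pool:
    results_parallel = pool.map(work, inputs)

assert results == results_parallel
```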
Run an ordinary database connection pool in shared memory mode instead of offloading it to a separate process via PGBouncer. Do that with all of the other resource pools as well. Heck, for one off webservers with low load you can do in memory session handling without needing to pull in memcache or redis until you truly need multiple processes or the sessions to survive a restart.
What would I do differently? Actually make engineering choices based on the current project status and priorities without having those choices made for me.
Well, I might prefer a gofundme for some devs to do it. That way us average Joes can pitch in to support it.
Otherwise, there’s making a fork and attempting periodic updates from python. That’d be a huge undertaking though.
FTA:
> A lot of the value of Python is the ecosystem, not just the language… CPython really leads the way in terms of the community moving as a block.
> Removing the GIL is a really transformative step. Most Python programs just don’t use threads at the moment if they want to run on multiple cores. If nogil is to be a success, the community as a whole has to buy into it.
My thoughts exactly. Google has already done quite a lot of work on LLVM, as an example. Python is among their most used programming languages, this something they could totally assemble engineers to do.
Worker threads[0]. If you don't follow Javascript it's not surprising you hadn't heard about it, it's pretty new. Node also, thanks to v8, pretty much outperforms Python.
worker threads don't share the parent's heap though, right? They can only share state through a binary channel. It's something, but pretty different from traditional pthreads.
They can use SharedArrayBuffers[0] which wrap shared memory or they can pass ArrayBuffers. Node may not have the shared arrays yet, but the browsers do.
Python has multiprocessing.shared_memory[1] since 3.8 to share arrays between processes, which accomplishes the same thing and circumvents the GIL issue.
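For illustration, here is a minimal sketch of that API (assuming NumPy is available; everything else is standard library). Two processes view the same block of memory, so the array data itself never goes through pickle:

```python
# Minimal sketch of multiprocessing.shared_memory (Python 3.8+), assuming NumPy
# is installed. The child process mutates the shared block in place; only the
# block's name, shape and dtype cross the process boundary.
import numpy as np
from multiprocessing import Process, shared_memory

def double_in_place(name, shape, dtype):
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    arr *= 2          # modifies the shared buffer directly, no pickling of data
    del arr           # drop the view before closing the handle
    shm.close()

if __name__ == "__main__":
    data = np.arange(10, dtype=np.int64)
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data

    p = Process(target=double_in_place, args=(shm.name, data.shape, data.dtype))
    p.start()
    p.join()

    print(shared)     # [ 0  2  4  6  8 10 12 14 16 18]
    del shared
    shm.close()
    shm.unlink()
```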
> I wish Team Python would welcome GIL Eradication with open arms and a supportive attitude. This would look like focusing on helping identify and implement solutions to the impediments rather than simply pointing out problems and then helicoptering away.
It looks like they are supportive and keen on the changes? From the article:
> Gross's proposal was greeted with a mix of excitement and robust questioning from the assembled core developers.
It's definitely a huge opportunity, no doubt. But it affects the C ABI[1] for Python itself, something that everyone in the conversation is aware of, and that would have implications for all distributions of Python.
Add in the importance of effective code review for a 20k LoC changeset, and it seems there are good reasons to be cautious despite the optimism and excitement.
Yes, they need to lay out a plan on how to proceed. If they don't proceed with this, it will eventually result in the slow decline of Python and its eventual demise. Multi-threading in Python is an absolute pain, and is something that needs to be addressed. We are at the end of Moore's law and CPU speeds have reached their limit. The only way forward performance-wise is to kill the GIL.
Kind of feels like Python is in a bit of a mess in no small part because the maintainers didn’t do a good job of blocking bad ideas. For example, exposing virtually the entire interpreter as the public C extension interface (can’t optimize much without breaking things, thus driving more dependence on C for performance which makes it even harder to change anything and leads to insanity in the Python packaging ecosystem).
This is asking for Guido to have predicted the multicore era and the success of Python in 1991, at a time when SMP architectures had only been commercially available on extremely expensive systems for 5 years. Much of what contributed to that success was in fact the rich C API, and the ease with which it could be embedded into larger applications, and I'm not sure what an alternative would look like that could produce efficient implementations of things like Numpy, etc.
This is what I find most interesting about the Erlang/Elixir world. The programming language design predated SMP by about 20 years - the actor model was chosen because it was a good way to reason about programs not for parallelism. Then multicore processors arrived, and it was 'just' a VM enhancement to make Erlang programs fully parallelised.
Just a nitpick -- the actor model was not chosen, the BEAM just happens to look like an actor model. The BEAM architecture was chosen for fault tolerance, and the features you want for fault tolerance naturally guide you to something actor-ish. Also, the actor model says very little about fault tolerance (its concern is addressing computational efficiency bottlenecks).
> This is asking for Guido to have predicted the multicore era and the success of Python in 1991
No, it’s not. My comment applies to single threaded performance as well (I wasn’t even thinking about the GIL when I made my comment)—why isn’t Python as fast as modern JS implementations, for example? The answer isn’t “JS has multi thread support”.
Moreover, a slim C extension interface is about flexibility so you don’t paint yourself into a corner when you don’t know what the world might look like tomorrow. Further, you can have a very rich C extension API without exposing the entire interpreter (e.g., h.py). Further still, Python has broken compatibility several times since its inception, so the idea that this was cemented in 91 is nonsense.
> why isn’t Python as fast as modern JS implementations, for example?
Because the project has explicitly targeted implementation simplicity, largely successfully, for almost 30 years. The internals are a joy to work on, and unsurprisingly the CPython Git repository has 4x as many contributors as v8, despite CPython contribution being largely voluntary and v8 contribution being largely commercial.
Even if performance were an explicit goal, it's important to remember v8 required absolutely massive funding and top tier engineering support to make it happen at all. The most comparable equivalent in the Python world, PyPy, was the product of an extremely dedicated group of mostly doctoral researchers working against incredible odds. V8 only has 2x as many contributors as PyPy. I hope by now you are recognizing a theme: the reason the language is so successful is also the reason we are here complaining about it.
There have been teams in at least Google and Dropbox who proposed major upheavals of the interpreter in the past. Both failed in large part due to the complexity of their proposals compared to the performance gains on offer.
You're setting up a dichotomy between a small C-extension interface and code simplicity, but this doesn't make sense. A smaller interface is inherently simpler--it's less for developers to work around, and they can deliver more user value (whether performance or otherwise) per unit complexity.
> The most comparable equivalent in the Python world, PyPy, was the product of an extremely dedicated group of mostly doctoral researchers working against incredible odds
"incredible odds" refers to compatibility with the CPython C-extension interface, which is exactly what I'm talking about.
> There have been teams in at least Google and Dropbox who proposed major upheavals of the interpreter in the past. Both failed in large part due to the complexity of their proposals compared to the performance gains on offer
No, they failed because they had to work within the considerable constraints imposed by historically bad decisions (such as the C-extension interface). The proposals need to be complex because they can't break compatibility.
> I hope by now you are recognizing a theme: the reason the language is so successful is also the reason we are here complaining about it.
Not at all! A narrower C-extension interface doesn't imply that C-extensions would be more difficult to write. There are no downsides to a narrower interface (apart from breaking compatibility, but we're positing a world in which this decision was made in 2008 or earlier).
The real theme here is that historical bad decisions + compatibility guarantees add significant complexity to every single improvement if they don't preclude them altogether.
I think a more reasonable timeframe for this discussion is the release of Python 3 in 2008. That was much more recently, clearly in the multicore era, and a major opportunity to do a breaking change.
There were multiple discussions. The main issue was that removing the GIL made single-threaded Python slower. Experiments in 1999 found single-threaded performance was 50% slower - you would need 2 cores just to break even, assuming you had parallel code in the first place.
> I'd welcome it if someone did another experiment along the lines of Greg's patch (which I haven't found online), and I'd welcome a set of patches into Py3k only if the performance for a single-threaded program (and for a multi-threaded but I/O-bound program) does not decrease.
Well, all credit to Sam the author, but he "bought" that non-degraded performance by adding performance optimizations! Unfortunately, Python 3.11+faster-cpython work has already superseded some of it, i.e. negating the trade-in he's trying to accomplish! However, I still hope everyone helps out and that the nogil work can land as default in CPython.
> Unfortunately, Python 3.11+faster-cpython work has already superseded some of it,
Why is that unfortunate? If his patch isn't acceptably fast because it is slower than Python 3.11, despite being faster than every Python version before that, then Python was never acceptably fast to begin with.
Linux got rid of the BKL ages ago, that the Python community still holds onto the GIL as if it was some multi threading holy grail isn't even remotely funny anymore.
It's unfortunate because I think it makes it slightly less likely that his work is accepted, now that part of the sweet single thread performance gains are already supplied from another source.
>Unfortunately, Python 3.11+faster-cpython work has already superseded some of it, i.e negating the trade-in he's trying to accomplish!
If "Python 3.11+faster-cpython" didn't also remove the GIL, then it didn't negate anything he's trying to acomplish. He wasn't going for a faster Cpython alone, but for a Cpython without GIL.
The first GIL-removal patch I'm aware of was against Python 1.4, so it's not like people haven't been trying. The problem has never been removing the GIL per se, but removing the GIL without slowing down the language when running single-threaded.
You could compile Python using the --without-threads configure option, before Python 3.7 [1] or thereabouts.
[1] 3.7.0 is the first major release following the removal of the config option in https://github.com/python/cpython/issues/75551 .. there's also a 2021 comment about "This has unfortunately turned out to be a blocker on getting WASI support as there's not direct threading support in WebAssembly."
Also: there were lots of independent projects to remove the GIL/speed up Python: from Google, Dropbox, and others.
But due to the leadership/community model/aversion to big changes (except for alienating people with the 2-to-3 changes), none of this was in the context of CPython; they were independent versions that never caught on, and didn't leave anything (or much) behind for regular CPython when they folded.
> It cannot be overstated, removing the GIL will be a HUGE deal for Python! It's an ugly wart which has existed for about 30 years
I would argue the opposite: it's the secret to Python's success. It might even be my top example of how "worse is better" plays out in real life.
I agree, the GIL feels like an ugly hack. The software engineer in me wants to hate it so much. And, now that I'm on the far side of a successful transition to a data science role, one might think that I hate it even more, yeah? Because the work I'm doing depends so very heavily on compute performance and parallelism.
But it turns out, nah, I'm coming to like it. It's an ugly hack, but it's the best kind of ugly hack: one that gets the job done.
Because I'm pretty sure that the GIL is the secret sauce that makes Python C extensions work so well. Without it, it would be much more difficult to write safe, correct C extensions. Doing it without introducing race conditions that destroy memory safety would be a black art. So people would probably do it less. And that probably means no robust Python scientific computing or data science ecosystem, because that stuff is all C extensions.
We could instead use a C FFI, like it's done in other languages. Java, for example. But Java having to use an FFI and Python being able to use its C extension mechanism is exactly why Python has eaten all of Java's Wheaties in the data space. The marshaling costs of talking back and forth across that boundary are just too high. Copying goes up, locality of reference goes down, cache misses go up. You saturate the memory bus more quickly. Once you've done that, it doesn't matter how many threads you have running on how many cores. The bottleneck is happening outside the CPU. Top will happily tell you those cores are working hard, but it turns out that what they're working so hard at is sitting around and waiting.
This isn't just theoretical. Last year I replaced a heavily parallelized compute-heavy batch process written in Java with a Python implementation that got the work done in less time despite being single-threaded. Sure, the Python implementation was basically a little scripting on top of a library written in C++, and the Java one was pure Java. But that's kind of the whole point. I also know that, back when I wrote the original Java implementation, I tried the same trick of farming the calculation out to a C++ library, and it actually made the throughput even worse. JNI is a harsh master.
And besides, as others have said, numpy & friends give me most of the parallelism I actually need, for free.
Maybe it hurts other people more? Maybe Web developers? But there's a part of me that thinks, if you're trying to do that kind of work at scale, making Python go faster is perhaps barking up the wrong tree. There are plenty of other languages that are statically typed (reducing pointer chasing and branching can increase your throughput without giving Amazon more money in the process) and don't even need a global interpreter lock in the first place because they're not interpreted, either.
I am not persuaded. I think that C extensions could handle multi-threading more easily or could simply document that they only support single-threaded use. BUT, I had never heard this argument made and I find it a very interesting and insightful one -- thank you for explaining in detail!
> It cannot be overstated, removing the GIL will be a HUGE deal for Python!
I am not sure about it. Python explicitly trades performance for simplicity and the GIL simplifies a whole lot of things and because of it, there are multi threading issues that will never be encountered by Python code.
In addition, because of Python’s popularity, there is a massive amount of code written in Python (and other languages as extensions), and any change must not introduce bugs into it.
Finally, there are other alternatives to Python that are more performance focused.
If you are having huge issues with Python performance, maybe you are using it in a manner it was not designed for.
Can anyone ELI-noob (explain like I am a noob): why should the GIL be removed? Why is concurrency needed in normal Python scripts, let's say for web scraping or machine learning? I don't really get it.
True but they would also be faster (probably more so) if Python had a jit compiler. Just seems that people want Python to be something other than it is. I guess the assumption is that it’s easier to remove the GIL or jit Python than the alternative, which would be to port whatever libraries to another ecosystem. Maybe a safe bet although there’s no deadline for GIL removal and that’s been a goal for decades.
>Just seems that people want Python to be something other than it is.
Well, if they didn't, we'd still be stuck with Python 1 or 0.1.
Why is the GIL suddenly where Python should keep "being what it is", and not any of those tons of changes, from 0.1 to 3.10?
Especially since the removal of the GIL doesn't change any spirit/essence of Python - just makes it faster.
Python wasn't conceived with "having a GIL" as some essential part of it; it was just a bad tradeoff for implementation convenience made back in the day when common multi-core machines were 20 years in the future...
As some other commenters have mentioned, the GIL removal branch also makes some unrelated optimizations and the performance improvement comes from those rather than from the GIL removal itself, as I understand it.
From your link:
"Stripping out some of the GIL-removal related changes would result in even faster performance on the single-threaded pyperformance benchmark suite. [...] The resulting interpreter is about 9% faster than the no-GIL proof-of-concept (or ~19% faster than CPython 3.9.0a3). That 9% difference between the “nogil” interpreter and the stripped-down “nogil” interpreter can be thought of as the “cost” of the major GIL-removal changes."
So it seems removing the GIL has a negative impact on single-threaded code, with the version that has both the GIL and the unrelated optimizations being 9% faster.
I agree and have been complaining about it for ages. Mostly got around it via using processes and microservices, but this would make my life so much easier. With general processors having tons of cores and memory now it just doesn't make sense, especially if the new changes work well with little or no performance hit. Seems like a no brainer.
It is all political at this point. Python is run by representatives of big corporations, some of whom haven't really done anything in the past 10 years but who are great at promoting their personal brands.
They are trying to strengthen corporate influence and discourage any power of traditional independent open source developers (who have contributed most of the useful features as opposed to corporate churn).
We do not know what is going on in the background, it is proxy wars all over the place. It is possible that they want to reduce Facebook influence, that their code bases do not need this feature, etc.
It may work for your codebase, but the GIL has deep implications for any multithreaded code. For a long time now we've relied on the GIL for some level of thread safety. I'm afraid removing it would cause quite a few libraries to fail in unexpected ways.
The statement that "this work is very carefully designed not to break multithreaded code" is difficult to reconcile with some of the ways it potentially breaks multithreaded code that are explicitly called out down in the "compatibility" section.
In particular, the design document's observation that "some C API extensions need modifications to protect global data structures that were previously protected by the GIL" serves as a direct confirmation that this change was not able to avoid breaking at least some multithreaded code that relies on the GIL to provide thread safety.
Which is why the proposal was always to have it behind either a runtime or compile-time flag. It has never been their plan to unilaterally change the way this works (py2>3 style).
> To mitigate compatibility issues and improve debugging, the proof of concept can run with the GIL enabled or disabled controlled at runtime by an environment variable (or command-line option). If CPython adopts some form of the GIL changes, I’d expect this runtime control to be useful for at least a few releases to address flag day issues.
This doesn't say "GILectomy will forever remain an opt-in".
One example off the top of my head is socketserver in the standard library, which is used by http.server, xmlrpc, IDLE and all kinds of things.
The socketserver.BaseServer class sets self.__shutdown_request in one thread and expects it to be picked up by another. In the Java memory model this variable would have to be marked as volatile (or the methods involved as synchronized) to make sure that the other thread will actually see the change, so unless Python implicitly makes every variable volatile (does it?) this wouldn't be guaranteed to work (although it would probably mostly still work most of the time, except for when it mysteriously doesn't).
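A stripped-down sketch of the pattern being described (a hypothetical class, not the actual socketserver code): one thread flips a plain attribute, another polls it in its loop, with no explicit synchronization. The open question raised here is what visibility guarantees a nogil interpreter would give such code.

```python
# Illustrative only: the shared-flag shutdown pattern, with no locks or events.
# CPython with the GIL effectively makes this work today; the question is what
# a nogil interpreter guarantees about when the write becomes visible.
import threading
import time

class TinyServer:
    def __init__(self):
        self._shutdown_request = False

    def serve_forever(self):
        while not self._shutdown_request:   # read in the serving thread
            time.sleep(0.05)                # stand-in for "handle one request"

    def shutdown(self):
        self._shutdown_request = True       # written from another thread

server = TinyServer()
t = threading.Thread(target=server.serve_forever)
t.start()
time.sleep(0.2)
server.shutdown()
t.join()
print("stopped cleanly")
```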
I'm sure there's something I'm missing here, since no-one seems to be talking about it, but how does the nogil version ensure that changes are made visible to other threads, i.e. what Java calls "safely published"?
As I understand it, a different thread running on a different core could have a cached version of a value which wouldn't necessarily be updated unless some instruction is issued to synchronize with main memory, which in Java is done with volatile or synchronized.
Also, if some optimization is implemented that reorders instructions or eliminates the variable update entirely, that could also prevent the other thread from ever seeing the updated value. This is also solved by using volatile or synchronized in Java.
Is every variable implicitly volatile in nogil Python? Or only object attributes? Or have I completely misunderstood some important aspect?
Edit: I suppose modifying the reference count might cause implicit synchronization similar to the piggybacking technique [1] in Java, making this a non-issue?
Yes, I believe every variable is implicitly volatile in nogil Python.
That said, I am not a good source for truth on this. But it feels like so much code would break if this weren't true that I don't think it would have gotten to this stage.
Maybe that's the only one? Maybe it isn't? But I think the point still stands that people saying this has the potential to break existing Python packages in subtle ways are not just being hyperbolic.
I would argue from a theoretic point of view no, but because the GIL causes execution of Python threads to be "batched", there surely will be racy code that seems to work now, and will start experiencing failures or more of them without the GIL batching.
Folks, the whole point of this effort and what stands it apart from previous GIL-removal efforts is that it's carefully designed to:
1. Give the performance-benefits of GIL removal for multi-threaded code.
2. Not hurt the performance of single-threaded code, hopefully improving it too.
3. Not break any pure Python code that relies on the semantics of the GIL.
4. Minimize impact to extensions.
If you're commenting here that this is going to break things w/o having read the design doc, you're contributing FUD. It's disappointing to see the discussion go this way.
The project aims for a concurrency model that matches the threads + shared memory model implemented in common operating systems and in programming languages like Java, C++, and Swift. We also aim for similar safety guarantees to Java and Go -- the language doesn’t prevent data races, but data races in user code do not corrupt the VM state. (Notably, Swift and C++ do not have this property: data races in those languages may corrupt VM state leading to segfaults). This strikes a balance between safety and performance. Weaker guarantees, like in Swift and C++, can make debugging difficult, while much stronger guarantees can limit flexibility or hamper performance.
Compatibility:
The vast majority of Python programs should not require any changes to Python code to continue working. Most of the issues I have encountered so far are due to being based on 3.9.0a3 and libraries using features from the 3.9 final release. A few of the issues have been due to changes to code object internals or the bytecode format, but so far I have been able to address those by making changes to the interpreter to improve backwards compatibility.
All extensions that use the Python C API will require re-compilation, even the few that are limited to the stable ABI. (The reference counting changes do not preserve the stable ABI).
To compile successfully, some C API extensions will need minor modifications, typically replacement of direct access of PyObject reference count fields with the Py_REFCNT macro.
To run successfully, some C API extensions need modifications to protect global data structures in C code that were previously protected by the GIL. For an example, see these two patches to NumPy.
I don't mean to troll, but I have to say the GIL is not and never has been a problem with Python. It's weird to me that this one tiny issue has grown into such a big waste of time, energy, and attention. I've written Python for something like twenty years now and never once had a problem due to the GIL.
It's a classic case of "you're doing it wrong": Python supports concurrency in several useful ways, and specifically does NOT support it in one particular very useful way. If you have somehow arranged to bang your head on that one specific thing that Python doesn't do, and you cannot figure out how to not bang your head on the GIL, then for goodness' sake use Go or Erlang or something that DOES do the one thing you can't live without.
- - - -
Don't get me wrong. If this succeeds it will be a great thing. This effort seems well-thought out, and I wish Sam Gross luck and success.
There is so much hate here for the GIL, which is undeserved. You tend to not notice all the nice safety it gives you. If you want to speed up your Python code you could of course try to make it run in parallel: congratulations, you just wasted N times more CPU on slow interpreted code. You could have used an accelerator language like Numba, or write parts of your code in C++ or Rust and get a 10-100 times speedup on a single core!
This is the third time you've replied to someone with this exact comment. Please don't do that. You are not being as effective at promoting your POV as you might be if you were a little more tactful.
I read the document. Sam Gross seems to have a clear grasp on the risks associated with this kind of project (I would quote the relevant part, but google docs is a clown shoe.)
On the one hand I wish him luck. If he pulls it off he'll be a hero. On the other hand this seems like a huge "turd polishing" for a problem that one only encounters if one is doing something ill-advised. The GIL just isn't that big of a problem in practice. I've been writing Python for going on twenty years without ever encountering a single problem due to the GIL.
I replied to comments that seemed ill-informed or hadn't bother to read the design doc. I have been writing Python since January 2000 or so, so I guess I maybe have 2 years on you there.
I haven't encountered any difficulties with the GIL either, but that doesn't mean it's not a problem.
FWIW, I think it's a good thing to encourage people to read the design doc. Ignorant knee-jerk responses help no one and waste a lot of time, and we can have a better discussion if participants are better informed. I don't think you were doing the wrong thing, I felt you were doing the right thing poorly. ("FUD" is pretty strong language IMO in this context.)
I really do hope Gross is successful. It does seem like he's "got an angle", so to speak, on the problem. At the very least, he's coming into it well-informed with his eyes open. He wants to try excising the GIL (and FB is apparently footing the bill to boot) so it's stupid to object, eh? Who cares if he wastes his time? He's not hurting anything. And maybe the horse will learn to sing. :)
The FUD comments came when I was on my phone and in https://xkcd.com/386/ mode. I was just disappointed seeing so many comments that were, well, FUD. FUD may be terse, but I don't think it's strong language.
There's a lot of misunderstanding around the GIL in general.
You can’t just call everything that doesn’t agree with your opinion FUD…
Because, you know, they make a valid (if flamebait-ish) point: the ‘easiest’ way to get performance out of Python is to wrap the critical code in another language as an extension. Removing the GIL isn’t going to change this no matter how carefully it is designed.
FUD isn't just a meaningless phrase, it means fear, uncertainty and doubt. Nearly every comment against GIL removal makes vague claims about this causing stability issues in a large amount of code that possibly, might, perhaps depend on it without even being aware of it. Yet there is not a single example given that holds up to even basic scrutiny. This is quite literally a textbook case of FUD.
What's the point of "safety" if it's not needed anymore, as it seems not to be under nogil? If that's right, then modulo regressions/bugs possibly introduced by nogil, the choice is really just between worse or better performance, and it's not important whether there are (more effortful, by the way) ways of getting much much better performance.
Moreover, to the point about writing performance-critical sections in $NOT_PYTHON, as Sam Gross explains in the nogil design doc, things aren't always so straightforward for scientific computing:
> Calling into Python from C requires acquiring the GIL -- even short snippets of Python code can inhibit scaling. Addressing this issue sometimes requires rewriting large portions of Python code in C/C++ to actually achieve the desired parallelism. For example, PyTorch rewrote all of the core automatic differentiation from Python to C++ to avoid acquiring the GIL in the backward pass.
What about all the scripts and programs that depend on the GIL to give correct results - often without even their authors knowing about it? I guess making this opt-in with a compiler flag is ok, but I would be very, very careful about making it the default. If that ever happens, we will be one Python interpreter upgrade from Y2K-like disaster...
That's the problem: the kind of issues the GIL prevents is notoriously hard to reproduce and debug. It would probably pass 90% of the test suites out there, unless you wrote yours with multi-threading in mind. Same for empirical tests by QA teams - they rarely cause enough load on the system for the bugs to surface. I might be too pessimistic here, but this really looks to me like a disaster waiting to happen...
> There was also a large amount of concern from the attendees about the impact the introduction of nogil could have on CPython development. Some worried that introducing nogil mode could mean that the number of tests run in CI would have to double. Others worried that the maintenance burden would significantly increase if two separate versions of CPython were supported simultaneously: one with the GIL, and one without.
This seems like a key consideration for the long-term success of nogil and CPython. The Python ecosystem has survived and recovered from past forks, but it would be good to avoid forks altogether if possible, especially if nogil proves to be stable and performant enough for prime time. At the least we should try to keep any forks short-term, e.g. starting with a compiler flag with a timeline for shifting to a runtime flag. (These are just my thoughts, take everything with a pinch of salt.)
This would totally make sense: a major change in architecture with some libraries breaking. While many people are still sore from the 2->3 transition, I would not expect this change to be that severe. Most code should work for Python 3 and 4 without changes.
Every conversation on nogil needs to include some dialogue about how it works and what types of changes it requires, both to code and the mental model of the code.
I can’t really further consider this or get excited about it unless I understand how it’s going to work and how it addresses previous issues with removing the GIL.
in my opinion, getting rid of the GIL is the one thing that Python can do to stave off Julia's eventual dominance of numeric computing. Not indefinitely, but by at least 5 years.
There are still many cases when data scientists don't have a C extension and need to run pure Python functions in parallel, and the GIL makes this an order of magnitude harder (and sometimes slower).
I think the core Python devs have gotten rather too conservative in the language's old age.
In terms of numerical computing I don't think this is going to be a battle of core language speed, Python has already lost here. The question is going to be what are the advantages of a language designed with a focus on number crunching vs a multi-purpose language with some number crunching extensions.
I mostly do scientific computing but I really enjoy the toolchain of an extremely popular multi-purpose programming language (Python) + C extensions where necessary for speed. The reason is that many tasks these days are not just pure number crunching. If I need to start up a quick web server, do some web site scraping, basically anything can be done easily from Python. It's unlikely Julia will ever be able to catch-up in these areas.
For numerical computation compiled code is 10x-1000x faster; even without the GIL Python cannot compete without dropping to C/Fortran (which typically releases the GIL anyway)
Wouldn't removing the GIL cause all sorts of sudden race conditions and regressions in existing Python code? I think there is a valid argument for keeping the GIL just for simplicity's sake.
From my understanding, race conditions already exist today, as the GIL is made to protect the internal Python structures. It only makes them appear more infrequently.
Fewer than people expect. For example, the += operator looks atomic, but under the hood, it's made up of multiple Python bytecode instructions, so the GIL won't protect you here.
> For example, the += operator looks atomic, but under the hood, it's made up of multiple Python bytecode instructions, so the GIL won't protect you here.
Sure, but any call to a native function which doesn’t release the GIL (which is most of them, at least for the builtins) is currently atomic, so something like dict.setdefault effectively is.
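To make the += point concrete, here is a small sketch (my own, not from the thread) of the classic lost-update race. Whether it actually reproduces depends on the interpreter version and switch interval, but nothing in the language guarantees the updates are preserved:

```python
# Illustrative only: a shared counter bumped with += from several threads.
# += expands to separate load / add / store bytecode steps, so the interpreter
# may switch threads in between and updates can be lost, GIL or not. How often
# this reproduces varies across CPython versions and switch intervals.
import threading

counter = 0

def bump(n):
    global counter
    for _ in range(n):
        counter += 1          # read-modify-write, not atomic

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)                # may print less than 800000 if updates were lost
```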
It isn't inevitable, but some people promoting Julia like to pretend that there is some sort of competition for mindshare happening. There are three basic markets for numerical computing: hobbyist, academic, and commercial. The hobbyists will always remain in Python because it is good enough and there is little benefit to learning a new language for casual numerical computing (this also applies to undergrad-level academics, which is closer to hobbyists than deep academics.) In academics it is possible that Julia will replace R for a lot of use cases, but it is unlikely to make much progress in displacing Python outside of math-heavy fields: a biologist or chemist will stay with Python because of the ecosystem and its applications outside of pure numerical processing code. In the commercial world the race is already over and Python won, it will continue to grow in this role due to simple inertia and because for cases where numerical computation speed actually matters a company can hire people to write the code in something even faster and more efficient than Julia.
I am not sure I agree with your hobbyist point. I do programming as a hobby, and I'd prefer doing it in Julia over Python anytime. I can understand that commercially you wouldn't risk adopting a new programming language like Julia when Python has a bigger mainstream ecosystem with solutions for most problems people have run into, but for hobby work I do not have these constraints; I have much more fun programming in a cleaner and more modern language like Julia, notwithstanding its other benefits.
> It is possible that Julia will replace R for a lot of use cases
R is an interactive statistical programming language that acts as a frontend for more performant languages. AFAIK interactivity is not the strongest point of Julia at the moment.
> interactivity is not the strongest point of Julia at the moment
Not sure what you mean. Julia has the exact same notebook environment (Jupyter) as R and Python. Fun fact: the “Ju” in Jupyter stands for “Julia” (the “pyt” and “r” stand for what you think).
Few people in the R world use the Jupyter environment; it is generally inferior to the RStudio environment. This might change in the future though: VS Code and DataSpell are becoming really good, and JupyterLab is slowly improving too.
I think Julia may be the Rust of the number-crunching world these days, and you'll get similar advocacy while the old crowd keeps trucking along with C++/Python.
Though the fundamental issues with the GIL haven't changed, that page hasn't been updated in quite some time, and the recent discussions section at the bottom doesn't mention the current attempt under discussion.
Removing the GIL is surely a breaking change, although not exactly a syntactic one. Shipping it as Python 4.0 should clear the doubts and allow for a widespread release without breaking existing code.
What ever happened to PEP 554? I remember reading about that and thinking it was a pretty elegant solution to the GIL. I thought I saw something a while ago about it being expected in an upcoming version, but I see it's still in draft.
As someone who only knows Python, this gives me existential angst. Like some other people have pointed out, I think the GIL is the secret sauce to Python's success. I would be wary of the religious fervor behind taking the GIL to the gallows.
It's hard to put to words what the GIL does for Python. I think the best I can do is to point to Rust and ask you to consider what Rust's number one claim to fame is? A more than significant part of Rust's bible ("the book") is about memory ownership. Python's GIL lets me be completely ignorant of these issues and all my brain cells just focus on the problem at hand.
My long term livelihood and sanity is at stake here and I really appreciate that some of core maintainers of Python are taking their time and thinking deeply about the implications of removing the GIL.
Python with the big bad GIL has become the most popular language in recent years. And it's gotten there slowly and steadily over 20 years. The most popular language just so happens to be the only (?) language that is single threaded.
I tend to pick up bad memes like crazy. One I picked up some decades ago is that FP has a useful property in this context: it preserves parallelism to remarkably late in the code, such that you can safely understand the impact of async applied to your code.
The lack of first-class FP, beyond the type checking stuff, may actually at root be part of the problem: if Python had somehow become an FP-rich language that said "nope, different model", then parallelism and the GIL would be a totally different argument.
Maybe I was mis-informed. FP interests me but I am not rich in experience. Maybe it's an oversold idea?
I think list comprehensions (and generator expressions) already do (or did, in the past?) classify as FP.
In this context it seems a bit misleading to care about function call overhead (except for TCO) because Python is so slow overall. I'm usually seeing 30x when translating to C. (For pure Python of course, not PyPy or code making good use of numpy.)
Python has FP elements, such as the list comprehensions you mention, but overall I wouldn’t say it’s an FP language.
To continue that example, Python list comprehensions are not very powerful compared to that feature in FP languages, since Python is not expression-oriented.
I would classify Python as imperative with FP and OOP elements.
Python is of course a slow language. My point is that writing in a functional style will give you slow code even by Python standards! Good FP languages assume you will write in that style and optimise for that use-case.
I genuinely think the Julia community should have guidelines on how to act on a public forum when advocating for Julia.
I've had so many instances of being rubbed the wrong way from Julia folk. In some ways, it is as annoying as the RIIR folk, but it just happens to be a smaller number of people.
I for one am excited to see the GIL-ectomy work come to the stage that it is today. I followed Larry Hasting's work back in 2016/2017, and I'm glad to see Sam has taken the mantle with Larry's blessing.
case in point here; guy just edits his comments down to a colon and disappears without a word.
The thing which makes this kind of astroturfing / shilling / etc. forgivable to me is that the people doing it really do believe they are doing good, and they don't believe for a moment that other people will see what they are doing for what it is.
Many successful projects and businesses can directly attribute their success to choosing and using Python, in contrast to your unsubstantiated opinions.
You are shouting into the void, and looking foolish while doing so.