
Since python3 is not backwards compatible with python2, why didn't the Python devs take the opportunity to create a more performant non-GIL runtime for python3?



> more performant non-GIL runtime for python3

Because making one of those is a huge amount of work, it may introduce even more serious backwards incompatibilities (breaking C extensions, for example), and not everyone has the knowledge of how to do it, so they'd need to either learn from scratch themselves or find people interested in doing it for them.


Removing the GIL is non-trivial. But I'm surprised that they didn't take advantage of Python 3 to move to a more performant register-based interpreter. Several people expressed interest in building one, and working prototypes were even made, but the core developers didn't seem very interested in performance upgrades.
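For the curious, the standard dis module shows the current stack machine at work. The exact opcode names vary between CPython versions, so the commented output below is only approximate:

    # CPython's bytecode is stack-based: each binary operation pushes its
    # operands onto a value stack and pops them again, which is part of the
    # overhead a register-based interpreter tries to avoid.
    import dis

    def add(a, b):
        return a + b

    dis.dis(add)
    # Typically prints something like LOAD_FAST a, LOAD_FAST b, followed by
    # a binary-add opcode that pops both operands and pushes the result.
    # (Opcode names differ between CPython versions.)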

I think their attitude is generally, "if it's performance-critical, write it in C", which was a good approach 15 years ago. However, Python's competition now comes from languages like Go and Rust, which combine good performance with user-friendliness and productivity (expressive syntax, a comprehensive standard library, and so on).


Removing the GIL is a difficult problem: http://python-notes.curiousefficiency.org/en/latest/python3/...


I think the question was: if you aren't going to be backwards compatible, why not unshackle yourself completely, design a new language without the GIL, and make it as Pythonic as possible?

A scripting language that supports multi-threading is possible, right? I think Tcl does it.


I guess they were happy with breaking backwards compatibility a little bit, but not as much as simply removing the GIL and not adding back any implicit synchronisation at all.


I think this question can only be raised in hindsight. I think nobody suspected the unicode transition would be so slow and painful. As there probably won't ever be such a big transition again, I wouldn't hold my breath for it happening.


It's a shame the GIL won't be removed because it's perceived to be too difficult.

It's trivial to remove the GIL - probably a week of mechanical work. Just don't depend on global variables in the interpreter. No global state, no problem. But the Python C API has to be changed to always store interpreter state into a struct, and a pointer to that interpreter state struct has to be passed as the first argument in all C API calls. Not rocket science; this is what Lua does.
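To illustrate the idea (a toy sketch in Python rather than the actual C API, with made-up names): every operation receives the interpreter state explicitly instead of reaching for globals, the way Lua passes a lua_State pointer as the first argument of every call.

    # Toy sketch of the "no global state" idea, written in Python for
    # readability. In the real proposal this would be a C struct passed as
    # the first argument of every C API call.

    class InterpState:
        """Everything the interpreter would otherwise keep in globals."""
        def __init__(self):
            self.globals = {}
            self.modules = {}

    def store_global(state, name, value):
        # Mutate the state we were handed instead of a process-wide dict.
        state.globals[name] = value

    def load_global(state, name):
        return state.globals[name]

    # Two independent interpreters can now coexist without sharing anything,
    # so no single lock is needed to protect "the" interpreter.
    a, b = InterpState(), InterpState()
    store_global(a, "x", 1)
    store_global(b, "x", 2)
    assert load_global(a, "x") == 1 and load_global(b, "x") == 2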

It's a political decision to keep the GIL, not a technical one. As for preserving C API backwards compatibility, it's a straw man argument - the Python API broke from 2.x to 3.x anyway. There's no such thing as "lesser breakage" - only breakage.


Supporting multiple interpreters in a single process is not what most people mean when they talk about removing the GIL. Objects from one interpreter could not be used safely from another interpreter. This would be handy in a few situations but is essentially not all that different to multiprocessing.

What people usually mean when they talk about removing the GIL is having multi-threaded code make use of multiple CPU cores (as it does in Jython). This would involve splitting the GIL into more fine-grained locks. Unfortunately, experiments taking this approach have so far shown significant performance impacts for single-threaded code.


It would be inefficient to share data structures or interpreter-internal state across threads. That is what would lead to the poor performance you described, due to mutex contention.

It is better to have an interpreter instance per thread, each with its own separate variable pool, and pass messages between the different interpreter instances. This model scales well on multicore systems and is faster than the multi-process equivalent.

It's also a simpler model to implement and maintain. Python's current multithreading model is too complicated from an implementation point of view.
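You can approximate that model today with the multiprocessing module: one interpreter per worker, no shared variable pool, and explicit messages over queues. A rough sketch (using processes rather than threads, since that is what CPython offers):

    # Rough approximation of the "isolated interpreter + message passing"
    # model: one interpreter per *process*, communicating through queues.
    # Nothing is shared, so there is no GIL contention between workers.
    from multiprocessing import Process, Queue

    def worker(inbox, outbox):
        for item in iter(inbox.get, None):   # None is the shutdown sentinel
            outbox.put(item * item)          # stand-in for real work

    if __name__ == "__main__":
        inbox, outbox = Queue(), Queue()
        p = Process(target=worker, args=(inbox, outbox))
        p.start()
        for i in range(5):
            inbox.put(i)
        inbox.put(None)
        print([outbox.get() for _ in range(5)])
        p.join()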


Because the runtime with the GIL is more performant, and Python users know that it's practically impossible to write correct software using threads.


Yes, this is the point. The GIL matters only if you are using threads for CPU-bound tasks (most of my work is I/O-bound). When I have CPU-bound problems that need parallelism and are not yet well implemented in numpy/scipy, I would rather use AMQP than threads.
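For what it's worth, threads are already fine for the I/O-bound case, because CPython releases the GIL around blocking I/O calls. A minimal sketch (the URLs are placeholders):

    # Threads work well for I/O-bound tasks even with the GIL, because the
    # GIL is released while a thread is blocked in socket/file I/O.
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    def fetch(url):
        with urlopen(url, timeout=10) as resp:
            return url, len(resp.read())

    urls = ["https://example.com"] * 3   # placeholder URLs
    with ThreadPoolExecutor(max_workers=8) as pool:
        for url, size in pool.map(fetch, urls):
            print(url, size)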


Is there a way to do it? My understanding always was that removing GIL means that the code that runs single-threaded (which is like 95% of Python code out there) will get a performance hit.


Yes it does mean that and I believe it is unavoidable, because you need to replace the coarse-grained GIL with a fine-grained locking instructions. However, I think the trade off is now worth it, since multi-core CPUs are ubiquitous. Besides, Python itself was never notable for its performance to begin with, so I've always found the argument very odd.


It is perfectly avoided by not removing the GIL. :-) Multi-core CPUs may be ubiquitous, but multi-threaded Python code is not, by far. So the performance of, say, my Python code would decrease; thank you, I don't need that. It's not that Python has superb performance to begin with; I just don't see the reason to pay for something that's not actually needed.

It's also often forgotten that one of the easiest paths to parallelism is to just run multiple processes (and for many problems you only need to split the input data, the processes don't need to even communicate), and this solution naturally uses multiple CPUs without any GIL worries.
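A minimal sketch of that approach, assuming the work can be split up front:

    # The "just run multiple processes" approach: split the input and let a
    # process pool chew through it. Each worker is its own interpreter with
    # its own GIL, so all cores get used.
    from multiprocessing import Pool

    def crunch(n):
        return sum(i * i for i in range(n))   # stand-in for a CPU-bound task

    if __name__ == "__main__":
        with Pool() as pool:                  # defaults to one worker per core
            results = pool.map(crunch, [10**6] * 8)
        print(results[:2])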

I think the actual use cases that require GIL removal are very few (something like a server, perhaps), and if you actually need to do that, I think you're better served by some JVM language or Go.


What I'm saying is that not only does Python not have superb performance, its performance is actually largely irrelevant. The advantage of Python is in ease of use, capabilities as glue language, readability, etc. So that's why I really don't see why you wouldn't remove the GIL at a reasonable slowdown factor (2x was a figure thrown around IIRC).

People always proclaim that you can just use multiple processes for parallelism. It's nice when it suits your (embarrassingly parallel) problem, but when you need to share a large amount of data between processes it's a major hassle.
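A rough sketch of the kind of workaround that's needed (one option among several; the names and sizes are arbitrary):

    # Sharing data between processes via a shared-memory array. It works,
    # but you're limited to flat C-typed buffers and you manage the layout
    # yourself, which is the hassle compared to sharing an object between
    # threads.
    from multiprocessing import Process, Array

    def fill(shared, start, stop):
        for i in range(start, stop):
            shared[i] = i * i

    if __name__ == "__main__":
        data = Array("d", 1000000, lock=False)   # one million doubles
        mid = len(data) // 2
        ps = [Process(target=fill, args=(data, 0, mid)),
              Process(target=fill, args=(data, mid, len(data)))]
        for p in ps:
            p.start()
        for p in ps:
            p.join()
        print(data[10], data[len(data) - 1])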


> but multi-threaded Python code is not, by far.

Eh, isn't that because of the GIL?


You may be right. But we're two levels of comments deep and I'm still waiting for some example use cases. Any use case I can think of would be better served by Java or C++ or C.

In fact, as I understand it, you can call into multi-threaded code from Python; it's only Python code (in the same interpreter) that has to execute single-threaded.
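For example, CPython's hashlib releases the GIL while hashing buffers larger than a couple of kilobytes, so plain threads can keep several cores busy as long as the heavy lifting happens in C. A quick sketch:

    # Threads can use multiple cores when the heavy work happens in C code
    # that releases the GIL, such as hashlib on large buffers.
    import hashlib
    from concurrent.futures import ThreadPoolExecutor

    def digest(blob):
        return hashlib.sha256(blob).hexdigest()

    blobs = [bytes(32 * 1024 * 1024) for _ in range(4)]   # 4 x 32 MB of zeros
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(digest, blobs)))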

Although an embedded scripting language (say, in a game) could in theory benefit. But isn't Python already too big for that kind of application?


A common use case is a GUI, where you offload heavy computation onto a separate thread to avoid blocking the main thread. I once saw an abandoned Python GUI project on GitHub whose README said the author gave it up because the lack of threading made the GUI unresponsive all the time.
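The usual workaround looks roughly like this (a minimal tkinter sketch; it helps most when the worker spends its time in I/O or in C code that releases the GIL):

    # Do the slow thing in a worker thread and hand the result back to the
    # GUI thread via a queue, polled with after(). The window stays
    # responsive while the worker is blocked.
    import queue
    import threading
    import time
    import tkinter as tk

    results = queue.Queue()

    def slow_task():
        time.sleep(3)                      # stand-in for a blocking operation
        results.put("done")

    def start():
        threading.Thread(target=slow_task, daemon=True).start()
        poll()

    def poll():
        try:
            label["text"] = results.get_nowait()
        except queue.Empty:
            root.after(100, poll)          # check again in 100 ms

    root = tk.Tk()
    label = tk.Label(root, text="idle")
    label.pack()
    tk.Button(root, text="Run", command=start).pack()
    root.mainloop()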


Because that's difficult to do. But the PyPy team has already done it for Python 2 code. Check out PyPy-STM.



