

Python GIL removal question - spahl
http://mail.python.org/pipermail/python-dev/2011-August/112813.html

======
runT1ME
>That is, if one processor writes to memory in a cache-line shared by another
processor, they must stop whatever they are doing to synchronize the dirty
cache lines with RAM. Thus, updating reference counts would flood the memory
bus with traffic and be much worse than the GIL.

I don't understand. Isn't this going to happen if you have multiple threads
running even _if_ the GIL is blocking them from running? I'm not a hardware
expert, but I'm not sure how constant locking would prevent cache
synchronization just because the threads weren't truly running in parallel.

I am fairly certain that constant synchronization (locking) because of the GIL
would negatively impact cache performance, especially since well-designed
multithreaded applications avoid locking for as long as possible.

~~~
meastham
I believe his argument is that it would reduce thrashing between the caches.
With the GIL, ownership of a cache line containing the reference count for any
given object will only have to be transferred at most once per timeslice. If
multiple threads were concurrently accessing a Python object, it would be
ping-ponging back and forth between caches much more frequently.

EDIT: Also, "stop whatever they are doing to synchronize the dirty cache lines
with RAM" is not a very good way to describe what is going on. Oftentimes
you don't have to hit RAM at all; the caches just synchronize between each
other. It is still pretty bad for performance though.

~~~
runT1ME
>I believe his argument is that it would reduce thrashing between the caches.
With the GIL, ownership of a cache line containing the reference count for any
given object will only have to be transferred at most once per timeslice.

Ah. Makes sense.

>just synchronize between each other

Yes, but that's bad because that cache line is 'stuck' for all processors
while the synchronization is occurring, if I'm not mistaken...

~~~
meastham
In general, at least the two processors with the conflict will have to either
block for a bit or switch to another hardware thread while write conflicts are
occurring. There are lots of architectural tricks people pull to try to
mitigate the impact, but the reality of the matter is that frequently mutating
shared state (e.g. reference counts) makes it extremely difficult to get good
performance with threads running in parallel.
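
To make the "frequently mutating shared state" point concrete, here is a
minimal sketch, assuming a standard CPython build. It shows that merely taking
new references to an object changes its reference count, which is a write to
the object's memory even though the object is never modified. The `Node` class
is a hypothetical stand-in for any shared object.

```python
import sys


class Node:
    """Hypothetical stand-in for any Python object shared between threads."""


shared = Node()
before = sys.getrefcount(shared)

# Taking 1000 new references does not change the object's own state,
# but each one bumps the reference count stored alongside the object.
aliases = [shared for _ in range(1000)]
after = sys.getrefcount(shared)

delta = after - before
print(f"reference count grew by {delta} without mutating the object")
```

With threads truly running in parallel, every such increment or decrement
from a different core would be a write to the same cache line.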

------
pnathan
I find it terribly annoying when the solution to a language problem is "Write
an extension". It's a cop-out.

Java, .NET, various Lisps, and I'm sure other systems that I don't know off
the top of my head have solved the problem of true threading.

~~~
wycats
Rubinius (in the current 2.0 betas) has removed the GIL as well. In fact, the
only Ruby that still has a GIL is MRI. MacRuby, JRuby and Rubinius (as of 2.0)
all have threading without a GIL.

~~~
pnathan
That's very exciting. Will MRI be fixed "soon"?

~~~
mitchty
Just like CPython, it's doubtful.

------
aklein
Can anyone shed light on where the GIL winds up hurting you?

According to this paper [1]: "Thus, in all cases, the single global lock
semantics seem fundamentally compatible with both lock-based and transactional
memory implementations."

[1]
[http://www.usenix.org/event/hotpar09/tech/full_papers/boehm/...](http://www.usenix.org/event/hotpar09/tech/full_papers/boehm/boehm_html/)

------
ulrich
If you leave aside C-extensions and libraries, CPython is a pretty bad
language for number crunching. This is just not the use-case it tries to
solve. I am glad they favor simplicity over execution speed.

Removing the GIL might be useful for a faster implementation with a JIT
compiler though.

~~~
dman
CPython has numpy which rocks for number crunching.

~~~
rbanffy
And has no GIL problems.

~~~
cdavid
This is not exactly true. We can alleviate the GIL issue by releasing it
inside C extensions, true, but it is still there nonetheless. Incidentally,
one of the big advantages of the GIL is that it makes C extensions easier to
write.
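
As a rough illustration of releasing the GIL inside C code, the sketch below
uses `time.sleep`, which is implemented in C and releases the GIL while it
waits. It stands in for a C extension doing real work with the GIL released;
the constants `SLEEP_SECONDS` and `THREADS` are arbitrary assumptions. Exact
timings will vary by machine, so no expected output is given.

```python
import threading
import time

SLEEP_SECONDS = 0.2  # hypothetical stand-in for work done with the GIL released
THREADS = 4


def blocking_call():
    # time.sleep releases the GIL for the duration of the wait,
    # so other Python threads can run in the meantime.
    time.sleep(SLEEP_SECONDS)


start = time.perf_counter()
workers = [threading.Thread(target=blocking_call) for _ in range(THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()
elapsed = time.perf_counter() - start

# If the GIL were held during the call, the threads would run one after another.
serial_estimate = SLEEP_SECONDS * THREADS
print(f"elapsed {elapsed:.2f}s vs. about {serial_estimate:.2f}s if serialized")
```

The same overlap does not happen for threads that spend their time executing
Python bytecode, since those must hold the GIL.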

~~~
rbanffy
You can then call it The GIL Advantage ;-)

As long as you are not creating or destroying the Python objects you expect to
give back to the Python side of your program, you don't have to care much
about it.

------
br1
Reference counts could be stored separately from objects and migrated to the
thread that modified it. Or several reference counts for the same object could
be used. Has this been tried?
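
One way to picture the "several reference counts for the same object" idea is
the sketch below. It is a pure-Python illustration with hypothetical names
(`ShardedRefCount`, `incref`, `decref`, `total`), not how CPython stores
reference counts. Each thread writes only to its own counter, so increments
and decrements never touch memory owned by another thread; the cost moves to
reading the total, which has to visit every shard.

```python
import threading


class ShardedRefCount:
    """Hypothetical per-thread reference counter (illustration only)."""

    def __init__(self):
        self._local = threading.local()
        self._shards = []  # one single-element list per thread
        self._registry_lock = threading.Lock()  # guards the shard registry only

    def _shard(self):
        shard = getattr(self._local, "shard", None)
        if shard is None:
            shard = [0]
            self._local.shard = shard
            with self._registry_lock:
                self._shards.append(shard)
        return shard

    def incref(self):
        self._shard()[0] += 1  # written by the owning thread only

    def decref(self):
        self._shard()[0] -= 1

    def total(self):
        # The expensive part: deciding whether the object is dead
        # requires summing every thread's shard.
        with self._registry_lock:
            return sum(shard[0] for shard in self._shards)


counter = ShardedRefCount()


def worker():
    for _ in range(1000):
        counter.incref()
    for _ in range(400):
        counter.decref()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counter.total()
print(total)
```

The sketch sidesteps the hard question, which is how to notice cheaply and
safely that the combined count has dropped to zero while other threads are
still running.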

------
yason
_If the GIL bites you, it's most likely a warning that your program is badly
written, independent of the GIL issue._

Ah, that old line again.

Translation: _"We really don't like to even think about changing this crappy
design that we started with in the first place, because we can just explain
ourselves out of it by coming up with suitable language goals that don't
actually require concurrent access to the interpreter. Not accessing the
interpreter concurrently is one of our language goals because you can do_
everything else _. So, if you think you still need to get rid of GIL then
you're just a bad programmer and your programs are badly written because hey,
we just defined the universe you're playing in."_

~~~
MostAwesomeDude
Please explain how you would sidestep the GIL. We've been waiting for
a good solution for over a decade now. :3

------
kev009
Sounds like typical denial, a la Firefox memory usage or MySQL's early
shortcomings, of the kind that has to be eaten down the line.

Not that the case isn't well argued, but to claim that the GIL isn't a
fundamental limitation and a bad thing is silly.

A few years from now it will be like, "Oh, yeah.. that".

~~~
lambda_cube
He's not saying that. He's saying that the GIL isn't a limitation for certain
kinds of applications, the kinds that Python is usually used for. For the
kinds of applications where the GIL would be a limitation, Python also has
another limitation: slow performance, and performance is usually the reason to
run things in parallel.

With PyPy the performance will get better, and they also have a GC, so that
hindrance is removed. I don't really know if PyPy has a GIL; I would guess
that they don't.

~~~
Spyro7
"With PyPy the performance will get better, and they also have a GC, so that
hindrance is removed. I don't really know if PyPy has a GIL, I would guess that
they don't."

PyPy still has a GIL:

[http://codespeak.net/pypy/dist/pypy/doc/faq.html#does-pypy-h...](http://codespeak.net/pypy/dist/pypy/doc/faq.html#does-pypy-have-a-gil-why)

For more information about it, check these sources:

Official PyPy Status Blog -
<http://morepypy.blogspot.com/2008/05/threads-and-gcs.html>

Thinking about the GIL (read the whole thread, interesting stuff) -
[http://mail.python.org/pipermail/pypy-dev/2011-March/006991....](http://mail.python.org/pipermail/pypy-dev/2011-March/006991.html)

~~~
lambda_cube
Ok, the PyPy FAQ says: "Yes, PyPy has a GIL. Removing the GIL is very hard.
The first problem is that our garbage collectors are not re-entrant."

Is it really necessary for the GC to be re-entrant to run the interpreter in
parallel? Couldn't you have the interpreter running in parallel and then, when
there is a need to run the GC, take a global GC lock that prevents all
threads from running: a stop-the-world GC? The application runs for a longer
time than the GC, right? So it would be a win and a step in the right
direction? I believe the early Java mark-and-sweep GC was like that, and then
later Sun developed several different kinds of concurrent and parallel GCs.
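
The stop-the-world scheme described above can be sketched as a
shared/exclusive lock: any number of interpreter ("mutator") threads may run
at once, but a collection waits until they have all paused and keeps new ones
out until it finishes. This is only an illustration under stated assumptions.
`WorldLock`, `enter_mutator`, `exit_mutator`, `collect` and `fake_gc` are
hypothetical names, and this is not PyPy's or the JVM's implementation.

```python
import threading


class WorldLock:
    """Hypothetical lock: many mutators at once, or one collector alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._running = 0  # mutators currently executing
        self._collecting = False

    def enter_mutator(self):
        with self._cond:
            while self._collecting:  # the world is stopped: wait
                self._cond.wait()
            self._running += 1

    def exit_mutator(self):
        with self._cond:
            self._running -= 1
            self._cond.notify_all()

    def collect(self, gc_fn):
        with self._cond:
            while self._collecting:
                self._cond.wait()
            self._collecting = True  # keep new mutators out
            while self._running:  # wait for running mutators to pause
                self._cond.wait()
        try:
            return gc_fn()  # runs alone, so it need not be re-entrant
        finally:
            with self._cond:
                self._collecting = False
                self._cond.notify_all()


world = WorldLock()
state = {"in_gc": False}
violations = []


def mutator(steps):
    for _ in range(steps):
        world.enter_mutator()
        try:
            if state["in_gc"]:
                violations.append("mutator ran during GC")
        finally:
            world.exit_mutator()


def fake_gc():
    state["in_gc"] = True
    if world._running:
        violations.append("GC ran alongside a mutator")
    state["in_gc"] = False
    return 1


threads = [threading.Thread(target=mutator, args=(200,)) for _ in range(4)]
for t in threads:
    t.start()
completed = sum(world.collect(fake_gc) for _ in range(20))
for t in threads:
    t.join()
print(completed, violations)
```

In a real interpreter the calls to `exit_mutator` and `enter_mutator` would
correspond to safepoints, and how often threads reach them determines how long
a collection has to wait before it can start.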

> Official PyPy Status Blog

Oh, I read that every time they write something. :) But I started reading it
in late 2010 and I haven't gone back to the archives. I guess it's time to do
that. Thanks for the links.

------
thadeus_venture
The most frustrating thing about Python is its community's complete denial
about what a joke their concurrency situation is. Truth is, Python is not truly
multi-threaded, and no, claiming that multi-process is the way to do parallel
computation across the board is not a sane argument at all. It's religious
zeal. My company is currently using it for web apps, and that's proving a pain
(i.e. having to use proxy servers for database access to minimize connections
across all the Python process instances). Using Python for anything more
serious, like a message queuing system for example, is even more prohibitive.
People in charge should wake up and start taking serious steps about it. I
guess PyPy is the biggest hope. Meanwhile, in the JVM world...

~~~
dman
The most frustrating thing about the Ferrari community is that they are in
complete denial about what a joke their affordable practical sedan story is.

~~~
thadeus_venture
Have you ever heard Guido speak about the issue? He and a number of others
don't think it's one worth solving. Really.

Yeah, it may be a lot of work to create a new GC implementation and change the
threading model, but if you want the language to progress, that's the way
forward.

~~~
irahul
> Have you ever heard Guido speak about the issue?

Yes.

> He and a number of others don't think it's one worth solving. Really.

Check your claims.

<http://www.artima.com/weblogs/viewpost.jsp?thread=214235>
[http://docs.python.org/faq/library#can-t-we-get-rid-of-the-g...](http://docs.python.org/faq/library#can-t-we-get-rid-of-the-global-interpreter-lock)

