

BrainFuck inspired scheduler successfully replaced the Python GIL - dryicerx
http://bugs.python.org/issue7946#msg101612

======
janzer
To clarify a little: this patch does not eliminate the GIL; it just schedules
the next thread to acquire the GIL using the BFS scheduling algorithm.

Also, and perhaps more importantly, this has not been incorporated into any
version of Python. It is just a patch on the bug tracker, and realistically I
doubt it has much chance of being accepted.
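
For intuition, here's a toy sketch of the idea (not the actual patch; the
quantum, the nice scaling, and the ToyGILQueue name are all made up for
illustration). BFS-style scheduling assigns each waiter a virtual deadline
scaled by its priority and hands the GIL to whichever waiter's deadline comes
first:

    import heapq
    import time

    RR_INTERVAL = 0.006  # ~6 ms round-robin quantum, in the spirit of BFS

    def virtual_deadline(nice):
        # Higher nice (lower priority) pushes the deadline further out,
        # so the thread waits longer but is never starved outright.
        return time.monotonic() + RR_INTERVAL * (1.1 ** nice)

    class ToyGILQueue:
        def __init__(self):
            self._waiters = []  # heap of (deadline, thread_id) pairs

        def enqueue(self, thread_id, nice=0):
            heapq.heappush(self._waiters, (virtual_deadline(nice), thread_id))

        def next_holder(self):
            # The waiter with the earliest virtual deadline runs next.
            return heapq.heappop(self._waiters)[1]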

~~~
kinghajj
If he's fixed the bugs and gotten it to be portable (at least to POSIX), then
why shouldn't it get adopted? Just look at those benchmark results: 259 ms per
loop versus the next best of 1.25 s. If a real app gets improvements from
this, it should be a no-brainer.
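
For reference, numbers like that typically come from timing a CPU-bound loop
across several threads. A hedged sketch of that kind of micro-benchmark (the
actual benchmark on the issue may differ):

    import threading
    import time

    def spin(n=5_000_000):
        while n:        # pure-Python busy loop; contends for the GIL
            n -= 1

    def bench(threads=4):
        ts = [threading.Thread(target=spin) for _ in range(threads)]
        start = time.perf_counter()
        for t in ts:
            t.start()
        for t in ts:
            t.join()
        return time.perf_counter() - start

    print(f"{bench():.3f}s per loop")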

~~~
sid0
Glancing at the patch, it looks like it should work on both POSIX and Win32.

~~~
JoachimSchipper
At least the patch as first proposed suffered from all-the-world-is-Linux (see
the discussion of CLOCK_THREAD_CPUTIME_ID)...
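
The clock in question is a POSIX option, not a guarantee: Linux implements it,
other platforms may not. Where Python (3.3+) exposes it, per-thread CPU time
can be probed like this (a sketch, checking availability first):

    import time

    if hasattr(time, "CLOCK_THREAD_CPUTIME_ID"):
        # Per-thread CPU time, the quantity the patch relies on.
        print(time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID))
    else:
        print("no per-thread CPU clock on this platform")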

~~~
rbanffy
You have to start with one platform. You then make it work on the others.

If I write something that initially only runs on IRIX and then make it work on
other OSs, it would be unfair to accuse it of all-the-world-is-IRIX thinking.
It just happens the guy had a Linux box on which to refine his idea to the
point of working, unfortunately using facilities other OSs don't offer in the
same way.

------
ash
The title is wrong. The Brain Fuck Scheduler is not related to the brainfuck
language.

------
mahmud
_The scheduler is a simplified implementation of the recent kernel Brain F**k
Scheduler by the Linux hacker Con Kolivas_

Not as fun now, is it? Kolivas is a leader in scheduling; he can name his
hacks after whatever joke language is out there, and that won't make them any
less stellar.

~~~
dschobel
I had totally missed Con's return from self-imposed exile. Very exciting. I
loved his previous scheduler work. Glad to see him back at it.

------
viraptor
As far as I understand, it just changes the way thread scheduling works, but
doesn't make Python "properly multithreaded". That means there is still only
one active non-native-extension thread running at any time. Could someone
confirm this?

Edit: I guess janzer confirmed this while posting at the same time.

~~~
jey
No, Python doesn't work like that even with the traditional GIL. The problem
is that even when you have multiple OS threads, they all end up competing for
the same lock which kills throughput (but not as badly as the fully-serialized
scenario you described). By using a scheduler the locking order can at least
be controlled a bit more to improve throughput. [As far as I know; been a few
years since I dug through CPython.]

~~~
viraptor
What do you mean by "not as badly as the fully-serialized scenario"? I thought
that Python threads are fully serialized, apart from extension code, which can
spawn its own threads and release the GIL during operations that don't affect
Python memory (mainly I/O). The interpreter still switches Python threads
using the GIL, but the Python code itself never runs in parallel.

Are we talking about the same thing, or is there some other non-serialized
scenario?

~~~
jey
I'm pretty sure that there's a good chunk of stuff you can do in Python
without acquiring the GIL -- the problem is that in practice you end up doing
a lot of I/O and stuff that requires at least momentarily acquiring the GIL,
leading to contention. So if you stuck to the operations that didn't require
locking any GIL-protected data, you could run at full throughput. It's at
least not the case that the GIL is held _all_ the time while running a Python
thread -- the problem is instead that your threads end up having to acquire it
_often_.

~~~
JoachimSchipper
Actually, the GIL is needed to execute Python code (well, access Python
objects). It is released by I/O- or computation-heavy C code, so e.g. SciPy or
reading files allows some level of parallelism, but pure-Python code will be
serial.
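
A quick way to see the distinction (timings are illustrative and
machine-dependent): CPU-bound pure-Python threads serialize on the GIL, while
I/O-bound threads overlap because blocking calls release it:

    import threading
    import time

    def cpu_work():
        sum(i * i for i in range(2_000_000))   # holds the GIL throughout

    def io_work():
        time.sleep(0.5)                        # releases the GIL while blocked

    def run(target, n=4):
        ts = [threading.Thread(target=target) for _ in range(n)]
        start = time.perf_counter()
        for t in ts:
            t.start()
        for t in ts:
            t.join()
        return time.perf_counter() - start

    print(f"cpu-bound x4: {run(cpu_work):.2f}s")  # roughly 4x one thread's time
    print(f"io-bound  x4: {run(io_work):.2f}s")   # ~0.5s, not ~2s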

~~~
jey
I stand corrected. And frightened. fork(), here I come!

~~~
cma
fork() isn't that great for a lot of situations. If you are thinking of taking
advantage of your operating system's copy-on-write paging by loading a large
chunk of data to be used read-only, forking a bunch of processes, processing
the data each processes, and finally, 'reducing' the results of all of the
forks into some sort of output, don't bother.

What happens is when you read an object in one process, python increments the
reference count, thus touching the memory page, thus copying it, thus screwing
you.

(however, compacting garbage collection turns out to have more or less the
same problem)
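
A minimal POSIX-only sketch of the effect (the list size and the sum() are
placeholders): the child only reads the shared list, yet each read bumps a
reference count in the object's header, dirtying the page it lives on:

    import os

    data = list(range(10_000_000))  # large "read-only" dataset in the parent

    pid = os.fork()
    if pid == 0:
        # Child: a pure read still *writes* each object's refcount field.
        total = sum(data)           # Py_INCREF/Py_DECREF on every element
        os._exit(0)
    else:
        os.waitpid(pid, 0)
        # Watching the child's RSS (e.g. /proc/<pid>/status on Linux) shows
        # it climbing toward the parent's, despite no explicit writes.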

------
Snark7
This is related to Python 3.2 only. In other words, this is not noteworthy.

~~~
wisty
Python is Python. I wouldn't be surprised to see it ported to Python 2.7 if it
does work. At the moment I doubt it's production ready; it would need lots of
testing and validation before it goes live. I think Guido explicitly said that
removing the GIL is the sort of thing he would like to see in Python 2.X.

~~~
apgwoz
Actually, though I'm using 2.X in everything I'm doing, I'd rather see it
_only_ appear in 3.X. There has to be something that drives people to port
stuff to 3.X or it's not going to happen. Dramatic speed improvements such as
what this _potentially_ provides would be extremely helpful in that regard.
The other hope right now is, of course, Unladen Swallow, which hasn't proved
to be very significant yet, as far as I'm concerned.

~~~
sapphirecat
3.X can be pretty awesome, but as long as projects want to maintain
compatibility with 2.5 or earlier, it's going to be difficult to get some
serious porting momentum going. Once 2.6+ becomes a practical development
target, 3.x will be a much easier sell.

At least, that's my perspective after watching PHP 5 slowly catch on amongst
PHPers, even though it had many more improvements (e.g. objects are no longer
value types), and far fewer compatibility breaks.

~~~
jrockway
People are lazy. I still hear people wanting Perl 5.8 compatibility for my
modules, even though 5.10 is 2 years old and has 100% backwards and forwards
compatibility with Perl 5.8. In other words, all your existing code will run
unmodified, and any 5.10-specific features you use will cause 5.8 to die at
compile time.

People confuse me.

~~~
astine
I use a lot of Perl 5.8 at the LoC because it's the only dynamic language that
comes installed by default on Solaris 10. That, and because it is the primary
language of a proprietary product that we have to use. I'd love to use Perl
5.10, but then I'd have to install it on all of the machines on which my code
is expected to run. If I had that kind of control, I'd skip Perl and go
straight to Python or Ruby. (Actually, that was a lie; I'd use Common Lisp if
I could.) As it is, I've standardized on Perl 5.8.

~~~
jrockway
They can install the product, but you can't bundle your own Perl/Python/Ruby
in its directory?

~~~
astine
The product I am talking about is Signiant: <http://www.signiant.com/>. It's a
file-based workflow application that is basically written in Perl in the same
sense that Emacs is written in Emacs Lisp. The idea is for people to write
workflows in the embedded Perl environment, which is the same across all of
the machines on which Signiant is installed. I _could_ use another
interpreter, but that would require extra work and I wouldn't be able to use a
lot of the Signiant-specific code.

