

Easy parallel Python with concurrent.futures - edward
https://gist.github.com/mangecoeur/9540178

======
craigds
If you need this in python 2.7, you can get it from pypi (pip install futures)
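For anyone who hasn't seen the module, its basic shape is just a pool plus futures (a minimal sketch; the same code works on 2.7 with the pypi backport):

```python
# Minimal sketch of the concurrent.futures API (stdlib in Python 3;
# `pip install futures` provides the same module on Python 2.7).
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(square, n) for n in range(5)]
    # as_completed yields futures as they finish, in completion order
    results = sorted(f.result() for f in as_completed(futures))

print(results)  # [0, 1, 4, 9, 16]
```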

------
eloff
Parallel Python is almost an oxymoron. Python has a GIL that prevents more
than one thread from running Python code at the same time. You can get around
that by spawning multiple processes, each running an independent interpreter,
but that's not as cheap or easy as spawning threads. You need to reuse those
processes for multiple units of work, or chunk your units of work into big
enough pieces to offset the startup costs. Sharing large amounts of memory
between the processes requires copying or shared memory (which also isn't that
friendly to use from Python). Granted, copying is harder to get wrong, so it's
the recommended solution anyway. If you want extreme performance, Python is a
lousy choice in the first place.

~~~
wting
CPython the implementation has a GIL; Python the language does not. Plenty of
other implementations (PyPy, Jython, Cython) avoid having a GIL or have
options to disable it.

> If you wanted extreme performance Python is a lousy choice in the first
> place.

Using Python for glue code and C-modules for performance critical work is a
popular option.
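One concrete consequence of that split: many C-implemented stdlib functions release the GIL while they run, so plain threads get real parallelism for those calls even on CPython (a sketch using `hashlib.pbkdf2_hmac`, whose C implementation releases the GIL during the iteration loop):

```python
# Sketch of the "Python as glue, C for the heavy lifting" pattern:
# the CPU-bound work happens inside a C-implemented stdlib function
# that releases the GIL, so ordinary threads can overlap it.
import hashlib
from concurrent.futures import ThreadPoolExecutor

def derive_key(password):
    # pbkdf2_hmac is implemented in C and releases the GIL internally
    return hashlib.pbkdf2_hmac("sha256", password, b"salt", 50000)

passwords = [b"alpha", b"beta", b"gamma"]
with ThreadPoolExecutor(max_workers=3) as pool:
    keys = list(pool.map(derive_key, passwords))

print(len(keys))  # 3
```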

~~~
eloff
I'm aware of that, but CPython is by far the most popular implementation. Add
IronPython to the list of GIL-free implementations as well.

However, Jython, IronPython and PyPy must all pay for the GIL's absence, even
in single-threaded code: the GIL is what made CPython's lists, dicts, and sets
threadsafe, so without it those structures need locking (or complex lock-free
algorithms, or STM in the case of PyPy), and that is not free.

The Python language is just a poor choice for concurrency. It's also a poor
choice for a glue language, as Lua is much better designed for that, but
Python seems to have become popular in that role anyway (likely due to a
richer ecosystem). Python, even with all of PyPy's JIT magic, is still one of
the slower languages. If you're using it for real work (not just as glue) and
you need extreme performance, then it's a poor choice in the first place.

Keep in mind that I have a Python day job, and Python has been my favorite
language for over a decade. I'm not hating on Python, I'm just acknowledging
its failings.

~~~
dr_zoidberg
Through Cython you can write extensions that release the GIL, making them
easily parallelizable. However, given that I mostly work with Python on
Windows (the 64-bit version, at that), I get that Cython might not be a viable
option for your use cases.
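The Cython pattern being described looks roughly like this (a sketch, not runnable as plain Python: it must be compiled with Cython and OpenMP; `prange` comes from `cython.parallel` and turns the loop body into GIL-free, multi-threaded code):

```cython
# sum_squares.pyx -- compile with Cython and OpenMP enabled
from cython.parallel import prange

def sum_squares(long n):
    cdef long i, total = 0
    # prange releases the GIL (nogil=True) and distributes the
    # iterations across threads; += on total becomes a reduction
    for i in prange(n, nogil=True):
        total += i * i
    return total
```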

As for high-performance Python, I always recommend Ian Ozsvald's talks at
various conferences. Python can be used to reach high performance, and even
works well in distributed environments and very parallel architectures. Much
of that is owed to the great libraries that were developed for those
circumstances (NumPy, Numba, SciPy, OpenCV[1], to name a few), but it's still
something possible within the language. Add to that the rapid development
cycle it allows, and the flexibility it can achieve, and you've got a great
language for HPC -- despite its shortcomings.

[1] I will acknowledge here that OpenCV is not Python-specific, if anyone
wishes to point that out.

~~~
eloff
I'll grant you all of those points: if you use Cython to compile your code to
native, or use a native library to do the expensive parts of your code, then
it works out fine. Both of those get around Python's inherent slowness by
doing the costly things outside of Python.

------
vegabook
_Another_ Python concurrent library?? To boot one which makes Python look like
Java? This.that.that.this.whatever()?

How is this better than a few simple lines of multiprocessing module exactly?

Yes, I'm being negative. People need to _stop_ adding yet more parallel
libraries to Python already. Contribute to the existing 15+ options or stay
away. Unless you're GvR (and even that's debatable) you're not helping the
ecosystem - you're confusing it even further.

Python is not a good first choice for multicore anyway, because it is at least
10x slower than compiled languages, so the best you can hope for is that
8-core parallelism gets you back to the performance of _single-core C_ (or
whatever compiled language you want). If you're serious about performance
there is no other way than to abandon Python. Only NumPy (and Pandas) are
keeping it competitive, and in a number of fields it's looking more and more
R-like: a prototyping-only language, with production in anything-but.

~~~
necrobrit
concurrent.futures isn't really _another_ parallel library... it's just the
standard library's higher-level interface to both the multiprocessing and
threading modules.

See the PEP:
[https://www.python.org/dev/peps/pep-3148/](https://www.python.org/dev/peps/pep-3148/)

