
Parallel Python - federicoponzi
http://www.parallelpython.com/
======
mangecoeur
There are several much more active projects that do the same thing as this.

\- IPython parallel
([https://ipyparallel.readthedocs.io/en/latest/](https://ipyparallel.readthedocs.io/en/latest/)
) is pretty much the same idea but under active development; it supports
local processes, small clusters, cloud computing, and HPC environments
(SGE-style supercomputing schedulers).

\- Joblib is a great tool for common embarrassingly parallel problems: run on
many cores with a one-liner
([https://pythonhosted.org/joblib/index.html](https://pythonhosted.org/joblib/index.html))

\- Dask provides a graph-based approach to lazily building up task graphs and
finding an optimal way to run them in parallel
([http://dask.pydata.org/](http://dask.pydata.org/))
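For reference, the joblib one-liner mentioned above looks roughly like this (a sketch assuming joblib is installed; `slow_square` is just a stand-in workload):

```python
from joblib import Parallel, delayed

def slow_square(x):
    # Stand-in for a CPU-heavy function
    return x * x

# n_jobs=-1 spreads the loop across every available core
results = Parallel(n_jobs=-1)(delayed(slow_square)(x) for x in range(10))
```

The `delayed` wrapper captures the call lazily so `Parallel` can farm the iterations out to worker processes.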

~~~
gshulegaard
Slightly dated now, but a good walkthrough of the "steps" of parallel
computing in Python:

[https://www.youtube.com/watch?v=gVBLF0ohcrE](https://www.youtube.com/watch?v=gVBLF0ohcrE)

------
lambdasue
By my estimate, this project seems to be old and stalled, not in active
development. This impression is based on the lack of Python 3 support and the
fact that the majority of the activity in the forums happened between 2006 and
2010.

EDIT: I was wrong, there is some activity on a different board of the forum
[1]. The last release was almost two years ago.

[1]
[http://www.parallelpython.com/component/option,com_smf/Itemi...](http://www.parallelpython.com/component/option,com_smf/Itemid,29/board,2.0)

~~~
lima
Python 3 support is a great indicator for that sort of thing.

When I ported my stuff, I discovered that some of the libraries I used were
unmaintained, and I replaced them with newer ones.

------
mkesper
From the FAQ:

Q) What Python versions are supported?

A) PP was tested with Python 2.3 - 2.7 on a variety of platforms.

~~~
gigatexal
Then that's a hard pass for me.

------
QuadrupleA
Most fellow Python people know this already, but native compiled C/C++ code
generally runs about 100x faster than interpreted CPython code for any kind of
number crunching or computation. So, as sonium also mentioned, definitely
consider the C route if you're at all familiar with it and you're having
performance problems. Threading Python wouldn't generally get you as far
(perhaps a 4-6x speedup on an 8-core machine). Python's C integration is very
well thought out and lets you use the best of both languages, as well as gain
a deep understanding of what's going on under the hood in your Python
programs.
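As a tiny illustration of that integration, `ctypes` in the standard library lets you call into compiled C with no extension module at all (a sketch; loading libm this way assumes a Unix-like system):

```python
import ctypes
import ctypes.util

# Locate and load the C math library
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double)
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))
```

For serious number crunching you'd write a proper extension module (or use Cython), but this shows the shape of the FFI.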

~~~
piker
Forgive my ignorance, but how are you getting a 4-6x speedup with threading in
the face of the GIL? Asking because I generally choose C/C++ over Python when
performance matters, because of the GIL (and, somewhat, the GC).

~~~
brianwawok
Well, you can multiprocess. 8-core machine? Run 8 Pythons with 8 GILs. It's
what web servers have done forever.

~~~
piker
True, but OP said threading. Threading and multiprocessing imply significantly
different engineering challenges.

------
sonium
I'm a daily user of Python, but the parallelization efforts really make me
wonder. When things in Python are slow, it's most likely something in pure
Python code (in contrast to, e.g., Numpy). Wouldn't it then make more sense to
rewrite that in native C code, which gives a speedup of two orders of
magnitude, instead of getting into the mess of parallelization, which scales,
in the best case, with the number of cores?

~~~
Longhanks
C can be a very difficult beast to tame. Using parallel Python interpreters
may actually be easier for some than giving up OOP, garbage collection, and
platform independence, and taking on the maintenance of a whole other language
stack and bugs you just won't encounter with Python.

Not saying it's a bad idea, but it's a totally different approach.

------
Mayzie
_sigh_ Why do all of these Python reimplementation projects target only Python
2.x?

~~~
afarrell
Because in April 2014, Python 2.7 was effectively declared to be an LTS
release with support until 2020. No LTS release has been declared for Python
3.x.

~~~
fnj
The other way of looking at it is that 2.7 is EOL in 2020, and there will be
no further 2.x. 2.x is very much a dead end.

[http://legacy.python.org/dev/peps/pep-0373/](http://legacy.python.org/dev/peps/pep-0373/)

~~~
BuuQu9hu
The PyPy team will support Python 2 indefinitely:
[https://morepypy.blogspot.com/2016/08/pypy-gets-funding-from...](https://morepypy.blogspot.com/2016/08/pypy-gets-funding-from-mozilla-for.html)

We never have to leave Python 2, if we do not want to.

~~~
no_wizard
This comes up over and over again in the Python community.

If you will allow me to paint with broad strokes (I'm going to lump a lot of
things together for the sake of ease of understanding, and I am not speaking
for every Python user out there): there seems to be a huge tug of war between
companies/corporate users on one side and the PSF and more independent users
on the other.

The smaller 'shops' (though these are big names within the Python community),
library developers, and users have all migrated to Python 3 ages ago. Django
is going to be Python 3 only in the foreseeable future (I know they EOL'd
their Python 2-compatible releases). The Software Foundation itself wants
Python 2 to be essentially toast by 2020. Projects like Pyramid, Flask,
Bottle, and even numpy and scipy have started making Python 3 their 'first'
target (in the case of Flask and Bottle, I think their end-of-life support for
Python 2 is coming as well; though they haven't announced anything official,
they seem to be adding and promising features that will only make sense on
Python 3).

Yet, big 'corporate' users like Google (Look at you, grumpy
[https://github.com/google/grumpy](https://github.com/google/grumpy)) and
Apple (still shipping with 2.7...and why!) and large universities all seem to
be stuck on python 2.

Because of this, to me, it really hurts the community and is holding back
full development of Python 3 and Python generally. Why are companies holding
back the state of a language (and then complaining about it) instead of
diverting their, ya know, massive resources to getting their own libraries and
technologies up to the current stack?

Why do so many people want to stay on Python 2? I'll never understand.

For what it's worth, the latest version of Python 3 (3.6, well, 3.6.1, but
the release notes cover 3.6) has a ton of forward-thinking, built-in
concurrency improvements and technologies that these companies could easily
take advantage of!

[https://docs.python.org/3/library/concurrency.html](https://docs.python.org/3/library/concurrency.html)
[https://docs.python.org/3/library/asyncio.html](https://docs.python.org/3/library/asyncio.html)
[https://docs.python.org/3/library/multiprocessing.html](https://docs.python.org/3/library/multiprocessing.html)
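As one example of those improvements, the async/await syntax makes cooperative concurrency a one-screen affair (a sketch; `fetch` just simulates I/O with `asyncio.sleep`):

```python
import asyncio

async def fetch(n):
    # Stand-in for real I/O; awaiting yields control to other tasks
    await asyncio.sleep(0)
    return n * 2

async def main():
    # Run all five coroutines concurrently on one event loop
    return await asyncio.gather(*(fetch(i) for i in range(5)))

results = asyncio.run(main())
```

(`asyncio.run` is the 3.7+ entry point; at the time of this thread you would have used `loop.run_until_complete` instead.)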

~~~
jjawssd
Lack of ROI (return on investment) is why people stay on Python 2.

~~~
no_wizard
Then why hold everyone else back if you aren't willing to push forward? At
least with new projects, it'd be great if people used Python 3.

Google's choice with Grumpy particularly astounds me. Why 2.7? It's really
unclear to me why that was the 'better' choice.

~~~
dragonwriter
> Google's choice with Grumpy particularly astounds me. Why 2.7?

Probably because Google has a lot of legacy Python 2.7, and doesn't do new
greenfield development in Python 3, preferring other languages (particularly
Go), so...

Projects made to scratch the maker's own itch are good in that they get
thoroughly dogfooded, but they also can carry a focus that reflects the
creator's needs more than anyone else's.

------
t0mk
I guess this is supposed to cure the GIL curse of threading in Python, i.e.
the fact that only one thread at a time can be running in an interpreter
process.

There is still the multiprocessing module, which can spawn processes, so you
can effectively run code in parallel. It's a pain to manage, though.

I think in the end I am grateful for the lack of parallel threading in
Python, as it forces me to either structure things so that they can be run in
a naively parallel way, or to use tools that have the concurrency solved
outside of Python.

~~~
gnipgnip
Sadly, you can only share pickleable objects with Python multiprocessing.

~~~
radarsat1
Or shared memory:
[http://stackoverflow.com/questions/5549190/is-shared-readonl...](http://stackoverflow.com/questions/5549190/is-shared-readonly-data-copied-to-different-processes-for-python-multiprocessing/5550156)

(but it's a little hacky...)

~~~
cocoablazing
Still, multiprocessing objects like Array and Manager are limited to ctypes.

~~~
sevensor
Sometimes worth it, but often not. The ability to do shared-memory multi-
threading is one of the things that tempts me away from Python. Message
passing is great and all, but sometimes you want your messages to be passing
around control of a shared 4GB data structure, instead of trying to copy it.

------
rev_d
Wouldn't it make a lot of sense to just use PySpark with RDDs? Latency would
be relatively high, but it'd also bypass the GIL while being more modern.

~~~
mangecoeur
In my experience, PySpark is much more flaky and annoying than doing parallel
computing with more 'Python-native' tools. It only really makes sense when
you've outgrown small clusters and really need huge infrastructure.

~~~
splike
What python tools do you use for small clusters?

~~~
elyase
Dask would be an option.

~~~
mangecoeur
Was going to say that. Or IPython parallel if you want to go lower-level.

------
Animats
It seems to be Python's "multiprocessing" module with some extra features.

Python's parallelism options are "share everything (but with only one thread
computing at a time)", or "share nothing" in a different process on the same
machine. If you're parallelizing to get extra crunch power, neither is
efficient.

~~~
dragonwriter
The base multiprocessing module isn't shared-nothing (though it doesn't share
the runtime, there are shared-memory constructs available).

It's better described as having only _explicit_ sharing (as opposed to classic
threading models, which have implicit sharing).

------
gravypod
I wish someone would port Julia's parallelization notation to Python. That's
the only language that struck me as "wow".

------
ldev
Expected redirect to golang.com

------
m_mueller
Needs IPC - not interested due to the communication bottleneck. Anyone got
experience with Jython? How drop-in is it?

~~~
marktangotango
The same story as all the dynamic JVM languages: great as long as all your
dependencies are pure Python; interop with native code (e.g. numpy) is
nonexistent. JRuby was making some headway here, but I haven't checked in a
while.

