
Faster Parallel Python Without Python Multiprocessing - Cyranix
https://towardsdatascience.com/10x-faster-parallel-python-without-python-multiprocessing-e5017c93cce1
======
heyflyguy
I am a terrible Python scriptwriter. Still, I use my own scripts, which I have
written through trial and error and by reading lots of Stack Overflow. I process
about 500 images per day, and each one takes about 30 seconds. Adrian
Rosebrock has been a real lifesaver. I have a machine dedicated to this. I
tried using multiprocessing once and could not get it to work. Being able to
process in a parallel fashion would be a gamechanger for me.

The beauty of python for someone like me is that I can get my job done without
actually having to do it. I free up my own time to leverage more of my
creativity and have a multiplicative effect on my productivity.

The reason I bring all of this up is that so many of the examples for advanced
libraries I see are geared towards seasoned software engineers. The examples
use fake arrays of data so that you can "just see" how to use the library. I
don't think anyone realizes how confusing this is to the guy who is a
restaurant manager, or the gal who is a researcher and just needs to know how
to make this work for them.

If it's an image, how about putting image = cv2.imread("C:\image.jpg") or
whatever?

Anyway, a bit of a rant but there are people who are very thankful that smart
people in this world like yourself write libraries we can use to make our
daily lives better. Including example code that is stupid simple would make me
so much happier.
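
For what it's worth, here is roughly what a "stupid simple" version of the image job could look like. It is only a sketch under assumptions: it uses the standard library's concurrent.futures rather than Ray, and process_image, process_folder, and the folder layout are made-up names for illustration. The body of process_image just reads the file as a stand-in; the real 30-second work (for example a cv2.imread call followed by the actual processing) would go there.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def process_image(path):
    """Hypothetical stand-in for the real per-image job."""
    # A real script might do: image = cv2.imread(path), then the slow work.
    # Here we only read the file so the sketch runs without extra packages.
    data = Path(path).read_bytes()
    return path, len(data)


def process_folder(folder, pattern="*.jpg", workers=4,
                   executor_cls=ProcessPoolExecutor):
    """Run process_image on every matching file, several at a time."""
    paths = sorted(str(p) for p in Path(folder).glob(pattern))
    with executor_cls(max_workers=workers) as pool:
        # map() hands each path to a worker and returns results in input order.
        return list(pool.map(process_image, paths))
```

Calling process_folder on the image directory (a path of your choosing) returns one (path, result) pair per file. On Windows the call needs to sit under an `if __name__ == "__main__":` guard, because worker processes re-import the script when they start.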

------
mike_mg
While I appreciate the authors' efforts and believe in the long-term mission,
they do not seem to mention anywhere some key shortcomings of Ray, while
marketing it pretty hard (e.g., see the paper).

I used Ray (a year ago) in one of the advertised basic applications:
parallelising the environments for RL. It was unusable back then, as it was
clogging up the memory.

The Plasma store, which is the backend for Arrow, was never cleaned, which made
the computation stop after 3 hours.

Here’s the issue:

https://github.com/ray-project/ray/issues/2128

Or perhaps this has been fixed already?

------
htfy96
Every time I see a Python performance issue or workaround, I wonder why no
company maintains a distribution with a high-performance Python JIT compiler
and patched, GIL-free packages. Given the prevalence of Python and what has
succeeded on the JVM, it seems like a fruitful business.

~~~
glandium
One problem is that because of the GIL, many python libraries don't use
locking on their data, etc., because they don't need to. That makes them
thread-unsafe without the GIL.

~~~
zaphirplane
You mean libraries with C extensions?

~~~
glandium
No, even libraries without. Picking something random in the python 3.7
standard library: collections.OrderedDict.__setitem__ doesn't look thread-safe
when it updates its linked list. EDIT: well, in fact, the data race in that
one is there whether there is a GIL or not...
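
To make the point concrete: the GIL does not turn a multi-step update into a single atomic operation, so pure-Python code that shares state across threads still needs its own locking. A minimal sketch, with SafeCounter and run as made-up names for illustration (this is not how OrderedDict is implemented):

```python
import threading


class SafeCounter:
    """Counter whose read-modify-write update is guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "+= 1" is a load, an add, and a store; the lock keeps another
        # thread from interleaving between those steps.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run(threads=4, per_thread=10_000):
    """Increment one shared counter from several threads and return the total."""
    counter = SafeCounter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value
```

With the lock in place the total is always threads * per_thread. Libraries that skip this kind of locking are the ones that would need patching in a GIL-free distribution.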

~~~
xapata
Just FYI, there are few remaining use cases for OrderedDict, now that built-in
dict keeps insertion order.
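
A quick sketch of what carries over to the built-in dict and what does not. The variable names are only for illustration; the behaviour shown is that of Python 3.7 and later.

```python
from collections import OrderedDict

# A plain dict keeps keys in the order they were inserted.
plain = {"b": 1, "a": 2}
plain["c"] = 3
plain_order = list(plain)

# OrderedDict can still reorder entries in place; plain dict has no move_to_end.
od = OrderedDict(plain)
od.move_to_end("b")
od_order = list(od)

# Equality ignores order for plain dicts but not between two OrderedDicts.
same_items_plain = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_items_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

So the remaining use cases are mostly reordering (move_to_end, popitem(last=False)) and order-sensitive comparison.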

------
tech_tuna
I love Python but the best way to do parallel Python is to use Go.

