To communicate across mp processes you use pickle, which carries a pretty significant performance cost relative to something like threading. There's also the issue of copy-on-write memory and reference counting interacting poorly.
I suspect this is the root cause of the difference in our experiences. My uses of mp have usually been somewhat more "embarrassingly parallel", for instance having a list of data elements which need to be processed with the same algorithm. For this use case, the usage of mp is pretty simple, often only a `pool.map(f, xs)`.
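For reference, a minimal sketch of that pattern. `square`, `run_parallel`, and the input list are hypothetical stand-ins for the real algorithm and data, not anything from the thread.

```python
from multiprocessing import Pool


def square(x):
    # Stand-in for the real per-element work. It is defined at module
    # level so worker processes can look it up by name.
    return x * x


def run_parallel(xs, workers=2):
    # xs is pickled (in chunks) and sent to the workers, and the results
    # are pickled back; map returns them in input order.
    with Pool(processes=workers) as pool:
        return pool.map(square, xs)


if __name__ == "__main__":
    print(run_parallel([1, 2, 3, 4]))
```

The `if __name__ == "__main__":` guard matters on platforms where workers are started by re-importing the main module rather than by forking.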
I can imagine that pickle might have tricky edge cases and/or be slow.
a) Every reference count update after the fork writes to the page holding the object, so any access (even a read) can trigger a copy of that page.
b) Even to send `xs` over, you must serialize it via pickle, and then the callee must deserialize it.
That may be fine for your use case, but in terms of overhead it's strictly worse than just sharing a reference across threads.
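A small CPython-only sketch of both points, using just the standard library. The names `payload`, `worker`, and `seen_by_thread` are made up for illustration, and the pickle round trip here only stands in for what happens when an argument crosses a process boundary.

```python
import pickle
import sys
import threading

payload = [[i, i + 1] for i in range(1000)]

# (a) Merely taking another reference writes to the object: the reference
# count is stored in the object itself, so a "read" dirties its memory page.
before = sys.getrefcount(payload)
alias = payload
after = sys.getrefcount(payload)
refcount_delta = after - before  # 1 on a standard CPython build

# (b) Crossing a process boundary implies a pickle round trip, which
# yields an equal but distinct object.
copied = pickle.loads(pickle.dumps(payload))
is_equal = copied == payload
is_same_object = copied is payload

# A thread, by contrast, receives the very same object with no serialization.
seen_by_thread = []


def worker(data):
    seen_by_thread.append(data)


t = threading.Thread(target=worker, args=(payload,))
t.start()
t.join()
thread_shares_object = seen_by_thread[0] is payload
```

Here `is_equal` and `thread_shares_object` come out true while `is_same_object` is false: the thread gets a reference, the pickle path gets a copy.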