
Thank you for the article.

I use multiprocessing and I am looking forward to the GIL removal.

I would really like library writers and parallelism experts to think about modelling computation in such a way that arbitrary programs written in that notation can be sped up without async, parallelism, or low-level synchronization primitives spreading throughout the codebase and increasing its cognitive load for everybody.

If you're doing business programming and you're using Python Threads or Processes directly, I think we're operating at the wrong level of abstraction, because our tools are not sufficiently abstract. (It's not your error; it's just not ideal where our industry is at.)

I am not an expert, but parallelism, coroutines, and async are a hobby I journal about all the time. I think a good approach to parallelism is to split your program into a tree dataflow and never synchronize. Shard everything.

If I have a single integer value whose update throughput I want to scale by the number of hardware threads in my multicore, SMT CPU, I can split the integer into that many shards and apply updates in parallel. (If you have £1000 in a bank account and 8 hardware threads, you split the account into 8 sub-accounts of £125 each; then you can serve 8 transactions at a time.) Periodically, each thread posts its value to a ringbuffer, and a thread servicing that ringbuffer sums them all for a global view. This gives an eventually consistent view of the integer without slowing down update throughput.
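A minimal sketch of the sharded-counter idea in Python (the shard count, deposit function, and periodic summing reader are my illustration, not a fixed design):

```python
import threading

NUM_SHARDS = 8

# £1000 split across 8 shards of £125 each; one shard per worker thread.
shards = [125] * NUM_SHARDS

def deposit(shard_id, amount, times):
    # Each thread only ever touches its own shard, so the hot path
    # needs no locks or atomics.
    for _ in range(times):
        shards[shard_id] += amount

threads = [
    threading.Thread(target=deposit, args=(i, 1, 1000))
    for i in range(NUM_SHARDS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A reader periodically sums the shards for an eventually consistent
# global view: 1000 initial + 8 threads * 1000 deposits of £1.
total = sum(shards)
print(total)  # 9000
```

A real version would have the reader run concurrently (e.g. draining per-shard deltas from a ringbuffer, as described above), so the "global" value it sees lags the shards slightly; that lag is the eventual consistency being traded for throughput.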

Unfortunately, multithreading becomes a distributed system, and then you need consensus.

I am working on barriers inspired by bulk synchronous parallel, where you alternate parallel phases with synchronization phases, plus an async pipeline syntax (see my previous HN comments for notes on this async syntax).
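This is not the commenter's actual design, but the BSP superstep pattern (compute in parallel, then synchronize everyone at a barrier) can be sketched with the standard library's `threading.Barrier`:

```python
import threading

NUM_WORKERS = 4
STEPS = 3

barrier = threading.Barrier(NUM_WORKERS)
# results[step][worker] records what each worker computed in each superstep.
results = [[0] * NUM_WORKERS for _ in range(STEPS)]

def worker(wid):
    local = wid
    for step in range(STEPS):
        # Parallel phase: each worker computes independently.
        local += 1
        results[step][wid] = local
        # Synchronization phase: nobody proceeds to the next superstep
        # until every worker has finished this one.
        barrier.wait()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Between barriers, no worker can observe another worker's half-finished state, which is what lets the per-phase business logic stay free of fine-grained locking.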

My goal would be for business logic to be parallelisable without you having to worry about synchronization.




At least in what I do, I find 80% of my parallelism needs are covered by pool.map/pool.imap_unordered. Of the remaining 20%, most can be solved by communicating through queues or channels (though admittedly this is smoother in Erlang or Rust than in Python).

Of course that's not true for everything, and depending on the domain, tree dataflows can also be great. I remember them being very popular in GPGPU work, because synchronization is so costly there.



