

A disciplined shared-memory model for parallel state-driven algorithms - s1m0n
http://simonhf.wordpress.com/2011/02/27/libsxe-shared-memory-and-parallel-state-driven-algorithms/

======
liuliu
Maybe you can implement a shared-memory model that is "predictable"
(transactional). But I fail to see how that solves the synchronization
problems the shared-memory model runs into in large-scale systems.

On the other hand, the GPGPU community has an interesting approach to this
problem. The idea is to have a memory hierarchy: at the bottom level, you
have a homogeneous, continuous memory space that is painfully slow but
strongly consistent. As you walk up the hierarchy, you get faster, smaller,
local memory with weaker consistency.
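
A rough sketch of that pattern, as an analogy on CPU threads rather than real
GPU code: each worker accumulates into a private variable (the fast, local
level, with no consistency guarantees needed) and touches the shared total
(the slow, strongly consistent level) only once, under a lock. The name
hierarchical_sum and its parameters are made up for illustration, and the
lock merely stands in for whatever the hardware does to keep the bottom level
consistent.

```python
import threading


def hierarchical_sum(values, num_workers=4):
    """Sum `values` with private per-worker totals merged into one shared total."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    shared_total = 0                # "bottom" level: visible to every worker
    shared_lock = threading.Lock()  # stand-in for the cost of strong consistency

    def worker(chunk):
        nonlocal shared_total
        local_total = 0             # "upper" level: private, no synchronization
        for v in chunk:
            local_total += v
        with shared_lock:           # exactly one synchronized write per worker
            shared_total += local_total

    # Strided split so every worker gets a roughly equal share.
    chunks = [values[i::num_workers] for i in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_total
```

The point is only that synchronization cost scales with the number of
workers, not with the number of elements, because almost all the work happens
in memory nobody else can see.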

Disclaimer: I haven't gotten around to reading the paper thoroughly; this is
just my two cents.

