I'm enthusiastic about SharedArrayBuffer because, unlike threads in traditional languages like C++ or Java, we have two separate sets of tools for two very separate jobs: workers and shared memory for _parallelism_, and async functions and promises for _concurrency_.
Not to put too fine a point on it, shared memory primitives are critical building blocks for unlocking some of the highest performance use cases of the Web platform, particularly for making full use of multicore and hyperthreaded hardware. There's real power the Web has so far left on the table, and it's got the capacity to unleash all sorts of new classes of applications.
The async culture of JS is strong and I don't see it being threatened by a low-level API for shared binary data. But I do see it being a primitive that the JS ecosystem can use to experiment with parallel programming models.
But on closer inspection of the post, this implementation seems to be highly targeted at certain kinds of compute-bound tasks, with just the shared byte array based memory. It's well-partitioned from the trad ui / network event processing system in a way that makes me optimistic about the language.
1. How is the accidental modification of random JS objects from multiple threads prevented - that is, how is the communication restricted to explicitly shared memory? Is it done by using OS process underneath?
2. Exposing atomics greatly diminishes the effectiveness of automated race detection tools. Is there a specific rationale for not exposing an interface along the lines of Cilk instead - say, a parallel for loop and a parallel function call that can be waited for? The mandelbrot example looks like it could be handled just fine (meaning, just as efficiently and with a bit less code) with a parallel for loop with what OpenMP calls a dynamic scheduling policy (so an atomic counter hidden in its guts.)
There do exist tasks which can be handled more efficiently using raw atomics than using a Cilk-like interface, but in my experience they are the exception rather than the rule; on the other hand parallelism bugs are the rule rather than the exception, and so effective automated debugging tools are a godsend.
Cilk comes with great race detection tools and these can be developed for any system with a similar interface; the thing enabling this is that a Cilk program's task dependency graph is a fork-join graph, whereas with atomics it's a generic DAG and the number of task orderings an automated debugging tool has to try with a DAG is potentially very large, whereas with a fork-join graph it's always just two orderings. I wrote about it here http://yosefk.com/blog/checkedthreads-bug-free-shared-memory... - my point though isn't to plug my own Cilk knock-off that I present in that post but to elaborate on the benefits of a Cilk-like interface relatively to raw atomics.
(2) sounds like it might need a larger set of primitives, though I'm not sure.