
Rayon: data parallelism in Rust - steveklabnik
http://smallcultfollowing.com/babysteps/blog/2015/12/18/rayon-data-parallelism-in-rust/
======
kzrdude
There's some Rust magic in there: stack temporaries can be used
safely in APIs like this, all thanks to the borrow checker.

------
arthursilva
Impressive, especially considering the lack of allocations and that the
compiler detects the bug in the parallel quicksort implementation.

------
craftkiller
This looks exciting, and I certainly plan to use it, but this style of
parallelism isn't ideal: the constant join()s add sequential sections, which
take their toll under Amdahl's law. This talk covers that and argues for
composing parallel code using futures instead:
[https://www.youtube.com/watch?v=4OCUEgSNIAY](https://www.youtube.com/watch?v=4OCUEgSNIAY)

~~~
CyberDildonics
You are talking about two very different things: parallelism and concurrency.

Also the joins don't sync data together, they wait for all threads to finish.
That is not directly what Amdahl's law is about.

~~~
craftkiller
I disagree. Go to minute 45 in the talk for a visual of what I'm about to
describe: Hypothetically an ideal multithreaded program would light up as much
of the CPU as possible. If we viewed it as a timeline it would start at a
single point, expand as a bubble to fill the available CPU resources, and then
at the end contract back down.

Using this style of parallelism creates many bubbles that each condense down
to a single point every time you call join(), similar to a beaded necklace.
This means that the 'p' in Amdahl's law, the fraction of time spent in
parallel tasks, is already stunted by the blocks of sequential code (the
necklace string) caused by the joins.

~~~
jerven
There is no need to call join more than once in a work-stealing pipeline;
join is the terminal operation. In other words, it is rather similar to
future.get(): a synchronisation point, which, yes, is what Amdahl's law
targets.

Futures and work-stealing parallel execution are two techniques that work hand
in hand.

Practically, it can't be compared with annotated OpenMP for loops in C++,
which have an implicit join at loop termination. In a parallel dataflow
system like Rayon there are no implicit joins, only explicit ones where
required. join() in this system is more like the return dataflow shown in the
presentation you link to.

~~~
craftkiller
In a literal sense, yes. But, and perhaps I am misunderstanding this, the
article states in the join primitive section: "Once they have both finished,
it will return". If you ignore the cost of spinning up and tearing down the
threads, isn't this conceptually the same as the implicit join at the end of
an OpenMP loop?

~~~
jerven
In this specific case yes, but so is the await/await as shown at
[https://youtu.be/4OCUEgSNIAY?t=3545](https://youtu.be/4OCUEgSNIAY?t=3545) in
your linked talk.

The nice thing is that join is recursive in the quicksort example, which
means it's equivalent to the await/await syntax in practical terms. It, too,
returns once both halves are finished.

    
    
        let mid = partition(v);
        let (lo, hi) = v.split_at_mut(mid);
        Future::of(|| quick_sort::<J,T>(lo)).await();
        Future::of(|| quick_sort::<J,T>(hi)).await();
    

Is exactly the same in parallelism as this

    
    
        let mid = partition(v);
        let (lo, hi) = v.split_at_mut(mid);
        J::join(|| quick_sort::<J,T>(lo),
                || quick_sort::<J,T>(hi));
    

Except that join gives better scheduling thanks to work stealing, which
avoids unbalanced CPU usage.

My Rust is non-existent, but conceptually Rayon is similar to Java 9 parallel
streams, which I know well.

