
What's so great about Reducers? - guilespi
http://blog.guillermowinkler.com/blog/2013/12/01/whats-so-great-about-reducers/
======
tel
This would be a fun time to talk about the power of `Monoid`s.

`Monoid` is a popular Haskell typeclass because the algebraic model for it,
the monoid, is simple, pervasive, and powerful. A monoid is composed of three
things: a type, a combiner, and a null value.

    
    
        Monoid = (T, <>, e)
    

And it must follow the laws that the null value (`e`) is the identity of the
combiner (`<>`) on both sides and that the combiner is associative.

    
    
        e <> x == x
        x <> e == x
        x <> y <> z == x <> (y <> z) == (x <> y) <> z
    

Examples include (Integer, +, 0) or (Integer, *, 1) or (List, append, []).
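
These are exactly the monoids that ship in `Data.Monoid`: addition and
multiplication get the `Sum` and `Product` newtypes (a type can only have one
`Monoid` instance), and lists are a monoid under append. In GHCi:

    
    
        import Data.Monoid (Sum(..), Product(..))
    
        getSum (Sum 3 <> Sum 4)              -- 7,  e is Sum 0
        getProduct (Product 3 <> Product 4)  -- 12, e is Product 1
        [1, 2] <> [3]                        -- [1,2,3], e is []
    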

The trick is that if you are trying to fold up a law-abiding Monoid of any
kind then you know two things (a) you have a default value which you can
insert wherever you like to no effect and (b) you can reassociate your
expression (rotate your evaluation tree) to be any shape you like at all.

    
    
        ((((1 + 1) + 1) + 1 ) + 1)     -- left
        ==
        (1 + (1 + (1 + (1 + 1))))      -- right
        ==
        ((1 + 1) + (1 + 1) + (1 + 0))  -- parallel
    

By encoding these laws into the type system, you provide information to the
compiler which it could theoretically use to rearrange and optimize your
computations on the fly. To my knowledge, this isn't done anywhere yet, but
knowledge of those laws does allow you to confidently build efficient parallel
algorithms. For instance, we know that a dictionary full of monoidal values is
itself a monoid.

    
    
        instance Monoid v => Monoid (Map k v)
    

We take the empty dictionary as `e` and let our combiner be the union on keys
where values are combined using their own monoidal combiner. This forms a
really nice basis for a large class of map-reduce algorithms where the mapping
step results in some kind of dictionary-like summarization of values.
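
A minimal sketch of such an instance, assuming a newtype wrapper (the stock
`Monoid` for `Data.Map` does a left-biased union rather than combining values;
packages like monoidal-containers provide essentially this wrapper):

    
    
        import qualified Data.Map as M
    
        newtype MergeMap k v = MergeMap (M.Map k v)
    
        -- union on keys; colliding values are combined with their own (<>)
        instance (Ord k, Semigroup v) => Semigroup (MergeMap k v) where
          MergeMap a <> MergeMap b = MergeMap (M.unionWith (<>) a b)
    
        -- the empty dictionary is e
        instance (Ord k, Semigroup v) => Monoid (MergeMap k v) where
          mempty = MergeMap M.empty
    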

------
juliangamble
One of the really powerful things about functional programming is Higher Order
Functions, like Map, Reduce, Filter and ZipWith. This style of programming is
inherently scalable. This is seen in Google naming its distributed processing
framework MapReduce.

You can see a presentation about Clojure Reducers by Leonardo Borges here:
[http://www.slideshare.net/borgesleonardo/clojure-reducers-cljsyd-aug-2012](http://www.slideshare.net/borgesleonardo/clojure-reducers-cljsyd-aug-2012)

The application of these ideas is used in Big Data scenarios. In Clojure we
can leverage frameworks that use Hadoop to apply ideas about Higher Order
Functions:

* Cascalog [http://cascalog.org/](http://cascalog.org/) (Expressive queries in datalog over the Cascading library)

* Parkour [https://github.com/damballa/parkour/](https://github.com/damballa/parkour/) (Hadoop MapReduce in idiomatic Clojure)

------
lazyjones
I really don't want to write this low-level code, or decide whether parallel
or sequential implementations work best. Compilers targeted at particular
environments can do this better.

My main takeaway from this is that most programming languages still fail at
letting the programmer specify _intent_ where a particular implementation is
not desired. I cannot write the accumulator example in mathematical sum
notation and let the compiler choose the best implementation; instead I have
to implement a particular array traversal or partitioning. And in situations
where high-level language constructs would allow some flexibility with respect
to implementation and parallelism, the compiler/interpreter usually doesn't
take advantage of it, and programmers even come to rely on that fact (e.g. map
in Perl could process list elements in any order or in parallel, but all hell
would break loose if it actually did at some point, because people are now
used to sequential processing).
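
Something like Haskell's `foldMap` is close to the kind of intent I mean: the
sketch below only says "combine every element with this monoid", and the laws
would permit any reassociation, though in practice GHC still evaluates the
list version sequentially.

    
    
        import Data.Monoid (Sum(..))
    
        -- intent only: fold a million numbers with the (Int, +, 0) monoid;
        -- the monoid laws would license reassociating or parallelising this
        total :: Int
        total = getSum (foldMap Sum [1..1000000])
    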

~~~
moomin
Actually, I think what you're really asking for is the ability to write
sequential code and have the compiler work out if it could be parallel. It's
true that the reducers implementation runs in parallel, but the API is
actually agnostic.

Reducers also provide higher level constructs like filter, map, cat
(concatenate), take-while, and of course reduce.
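
One way to see why the API can stay agnostic: map and filter can be phrased as
transformations of the reducing function rather than of the collection, so
they never commit to a traversal order. A rough Haskell sketch of the idea
(illustrative names, not the actual Clojure reducers API):

    
    
        -- a reducing step: combine an accumulator with one element
        type Reducer a r = r -> a -> r
    
        -- "map f" rewrites the step to transform each element first
        mapping :: (a -> b) -> Reducer b r -> Reducer a r
        mapping f step acc x = step acc (f x)
    
        -- "filter p" rewrites the step to skip rejected elements
        filtering :: (a -> Bool) -> Reducer a r -> Reducer a r
        filtering p step acc x = if p x then step acc x else acc
    
        -- e.g. square each element, keep the even results, sum them;
        -- still just an ordinary fold over the rewritten step
        example :: [Int] -> Int
        example = foldl (mapping (^ 2) (filtering even (+))) 0
    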

~~~
lazyjones
> _what you're really asking for is the ability to write sequential code and have the compiler work out if it could be parallel_

Not really. I don't care how I have to specify the desired computation, as
long as it isn't excessively verbose and doesn't require me to nail down
implementation details I don't care about. For example, a for-loop (from 1 to
1 million, as in the accumulator example) specifies a particular traversal
(index 1 to 1 million in ascending order), but for an implementation of "sum"
I only need "for each element" (no particular order) semantics. I don't even
want to think about whether such an (overly specific) loop makes the code
difficult or impossible to parallelize (in the presence of side effects, for
example in C, it likely will).

~~~
moomin
Actually, no, for sum you need three things: how to start (0), how to
accumulate (plus) and how to combine two reductions (plus again). This is
exactly what fold does.

Equally, for filter, it's the empty list, conj-or-not, and concat. (Just to
demonstrate that the combine step is sometimes different from the accumulate
step.)
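
A toy Haskell sketch of that start/accumulate/combine triple over input that
has already been split into chunks (hypothetical helper, not the actual
r/fold signature):

    
    
        import Data.List (foldl')
    
        chunkedFold :: (r -> r -> r)   -- how to combine two partial reductions
                    -> r               -- how to start
                    -> (r -> a -> r)   -- how to accumulate one element
                    -> [[a]]           -- input already split into chunks
                    -> r
        chunkedFold combine start step =
          -- each inner fold could run on its own thread; combine merges them
          foldl' combine start . map (foldl' step start)
    
        -- sum: start 0, accumulate with plus, combine with plus again
        sumChunks :: [[Int]] -> Int
        sumChunks = chunkedFold (+) 0 (+)
    
        -- filter: start [], conj or not, combine with concat
        filterChunks :: (a -> Bool) -> [[a]] -> [a]
        filterChunks p =
          chunkedFold (++) [] (\acc x -> if p x then acc ++ [x] else acc)
    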

------
kazagistar
What is good about insisting that the two combine functions be binary? And
what is good about forcing a specific method of forking (evenly sized chunks)?

The more fundamental idea seems to be to use three steps: a method which
splits a collection into sublists (fork), a method which takes any number of
parameters and reduces them in a single thread (reduce), and a method which
takes some number of reducer results and combines them (join). For something
like string concatenation, a non-binary join can help avoid the need for an
intermediate linked list. For something like matrix multiplication, or
anything that might require spatial partitioning, explicit forking is really
important, since chunk size and ordering can each have a major performance
impact.

Then again, this is a decent default approximation for many cases.
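
For concreteness, a sketch of that three-step shape with the fork made
explicit and an n-ary join (all names hypothetical, not an existing API):

    
    
        -- explicit fork: split the input into chunks of a chosen size (> 0)
        fork :: Int -> [a] -> [[a]]
        fork _ [] = []
        fork n xs = let (c, rest) = splitAt n xs in c : fork n rest
    
        -- reduce each chunk independently, then join all results at once
        forkReduceJoin :: ([a] -> r) -> ([r] -> r) -> Int -> [a] -> r
        forkReduceJoin reduceChunk join n = join . map reduceChunk . fork n
    
        -- string concatenation with an n-ary join, no pairwise intermediates
        joinStrings :: Int -> [String] -> String
        joinStrings = forkReduceJoin concat concat
    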

------
goldenkey
Relevant: [http://developer.amd.com/resources/documentation-articles/articles-whitepapers/opencl-optimization-case-study-simple-reductions/](http://developer.amd.com/resources/documentation-articles/articles-whitepapers/opencl-optimization-case-study-simple-reductions/)

