
Of Algebirds, Monoids, Monads, and Other Bestiary for Large-Scale Data (2013) - adamnemecek
http://www.michael-noll.com/blog/2013/12/02/twitter-algebird-monoid-monad-for-large-scala-data-analytics/
======
mikekchar
I find that monads are easier to explain in the context of programming than in
category theory.

First, start with a functor. In terms of programming, a functor is a container
with an associated function (often called map or fmap -- I'll call it map
since fmap simply stands for "functor map"). map takes the contents out of the
container, applies a function (which you pass to map) and puts the results
back into the container again.

It's important to understand that the function may change the type, so while
you have the same kind of container, the type of data inside may change. For
example, imagine your functor (container) is an array of integers and your
function takes an integer and returns a char. The result of running map on
your functor is an array of chars. This may seem trivial, but it is important
later.
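A minimal sketch of that integer-to-char example in Python. The name `fmap` and the sample values are only illustrative, and a plain list stands in for the "container":

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def fmap(f: Callable[[A], B], xs: List[A]) -> List[B]:
    """Apply f to every element and put the results back into a list."""
    return [f(x) for x in xs]

codes = [72, 105]          # a list of integers
chars = fmap(chr, codes)   # still a list, but now holding one-character strings
```

The container kind (list) stays the same while the element type goes from `int` to `str`.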

One other thing to realise is that anything that can "hold on" to a value and
for which you can implement map is a functor. So arrays, lists, tuples,
hashes/dictionaries/objects, etc. are all examples of functors. Even a function
that has a closure over a parameter can be a functor -- as long as you have a
way of implementing map (left as an exercise for the reader).
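One possible answer to that exercise, sketched in Python. This assumes "a function that has a closure over a parameter" means a zero-argument function that remembers a value; `hold` and `fmap_held` are hypothetical names:

```python
def hold(value):
    """Return a zero-argument function that closes over value."""
    return lambda: value

def fmap_held(f, held):
    """map for held values: a new closure yielding f applied to the old contents."""
    return lambda: f(held())

held = hold(41)
bumped = fmap_held(lambda n: n + 1, held)
```

Calling `bumped()` runs the original closure, applies the function, and returns the result; `held` itself is left unchanged.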

Monoids are usually not explicitly identified in most programming languages.
You can make functors that can contain anything (i.e. the set that represents
the values it can contain is allowed to be empty). So basically, you can have
a functor that contains nothing at all. A monoid is a functor in which the set
that represents the values it can contain cannot be empty. In other words, it
must be able to contain a value. Also it must have an "identity" value for a
given operation. The identity value is one for which, when you perform the
operation with it, you get the other value back unchanged. For addition and
integers the identity value is 0 (n + 0 = n). For multiplication and integers,
the identity value is 1 (n * 1 = n). For concatenation and strings, the
identity value is "" (s.concat("") = s). Most functors (remember, just a fancy
word for container in the context of programming) are monoids for a given
operation.
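The three identity values above can be checked with a small Python sketch. `mconcat` is a hypothetical helper name, not a standard library function. Note that the usual textbook definition of a monoid also asks for the operation to be associative, which holds for all three operations here:

```python
from functools import reduce
import operator

def mconcat(op, identity, values):
    """Fold values together with op, starting from the identity value."""
    return reduce(op, values, identity)

total = mconcat(operator.add, 0, [1, 2, 3])          # addition, identity 0
product = mconcat(operator.mul, 1, [1, 2, 3, 4])     # multiplication, identity 1
joined = mconcat(operator.add, "", ["ab", "cd"])     # concatenation, identity ""
nothing_to_add = mconcat(operator.add, 0, [])        # empty input yields the identity
```

Having an identity is what makes the empty case well defined: folding no values at all just returns the identity.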

An endofunctor is just a functor (container that you can implement map on) for
which the type you start with is exactly the same as the type you end up with.
For example, if you have an array of integers, apply a function to the
contents with map, and you end up with an array of integers again, then you
have an endofunctor ("endo" just means that it's the same on both ends). If
you ended up with an array of strings, then it's not an endofunctor because an
array of integers is different than an array of strings, even though they are
both arrays.

A monad is just a monoid (container that can hold at least 1 element and for
which there is an identity element for the function you will be applying) in
the category of endofunctors (you'll get exactly the same type after you apply
the function).

Usually instead of map (which automatically puts the transformed values into
the container) you use a function called bind with monads. bind works pretty
much the same as map, except that the function you pass to bind needs to put
the transformed value into the container. You would think that this is a PITA,
but it's very necessary in many situations (easiest way to see this that I
know of is to try to implement an "either monad" with only map -- you'll see
right away that it's not possible).
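A rough Python sketch of why map alone falls short for an either-style container. The `Left`/`Right` classes and the `safe_div` function are made up for illustration and do not come from any particular library:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Right:
    value: Any   # a successful result

@dataclass
class Left:
    error: str   # a description of what went wrong

def fmap(m, f):
    """map: always re-wraps the result of f in Right."""
    return Right(f(m.value)) if isinstance(m, Right) else m

def bind(m, f):
    """bind: f itself returns a Left or a Right; a Left passes through untouched."""
    return f(m.value) if isinstance(m, Right) else m

def safe_div(n):
    """A step that can fail, so it chooses its own container."""
    return Left("division by zero") if n == 0 else Right(100 // n)

nested = fmap(Right(0), safe_div)   # a Left buried inside a Right
flat = bind(Right(0), safe_div)     # the Left comes out directly
```

With map, the failure ends up wrapped inside a success (`Right(Left(...))`), because map has no way to let the function decide which side of the either to return. bind hands that decision to the function.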

And that's it, really. I find it really interesting to understand how category
theory works, but it is not at all necessary for understanding how to use
functors and monads while programming.

Edit: I forgot the most important part! Why do I want a monad? Since it always
returns the same type from bind as what you started with, it means you can
chain functions. I've found that it's also really helpful in non-type checking
languages for reasoning about the type of things. If you are using bind on a
monad, you _know_ that chaining will work every time -- you never need to
check for null, etc.
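As a hedged illustration of that chaining, here is a small maybe-style container in Python. `Maybe`, `just`, `NOTHING`, and `lookup` are hypothetical names invented for this sketch:

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Maybe:
    value: Any = None
    present: bool = False

    def bind(self, f):
        """Run f on the contents if there are any; otherwise stay empty."""
        return f(self.value) if self.present else self

def just(value):
    """Wrap a value that is definitely there."""
    return Maybe(value, True)

NOTHING = Maybe()

def lookup(key):
    """Build a step that looks key up in a dict and returns a Maybe."""
    def step(d):
        return just(d[key]) if isinstance(d, dict) and key in d else NOTHING
    return step

config = {"db": {"host": "localhost"}}
host = just(config).bind(lookup("db")).bind(lookup("host"))
port = just(config).bind(lookup("cache")).bind(lookup("port"))
```

Every `bind` returns a `Maybe`, so the chain never has to stop and check for a missing key. The missing `"cache"` entry simply turns the rest of the chain into `NOTHING`.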

~~~
mlevental
>For example, if you have an array of integers, apply a function to the
contents with map, and you end up with an array of integers again, then you
have an endofunctor

sorry i'm confused. been learning haskell in parallel with cat theory. isn't
the only category in haskell Hask? whose objects are types? in that sense
aren't all of the `Functor`s in haskell endofunctors?

~~~
dwohnitmok
There are many many categories in Haskell. All you need for a category is to
identify _something_ as the objects, _something_ as the arrows, and make sure
your identifications result in something that matches the category laws.

As a simple example, every monoid is a category with a single object (hence
the mono- prefix). Hence every monoid in Haskell is a category.

Now it is true that all `Functor`s in the sense of the Haskell typeclass are
endofunctors in category theory, because a higher-kinded type can be thought
of as a mapping between types in Haskell (e.g. `Maybe` maps `Int` to the new
type `Maybe Int`) and `fmap` can be thought of as a mapping of functions
between types to their remapped types (we take `a -> b` and replace it with
`Maybe a -> Maybe b`), which results in a functor from Hask to Hask.
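The "replace `a -> b` with `Maybe a -> Maybe b`" step can be mimicked outside Haskell. A loose Python sketch, assuming `None` stands in for `Nothing` (which is only an approximation of Haskell's `Maybe`), with `fmap_maybe` as a hypothetical name:

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def fmap_maybe(f: Callable[[A], B]) -> Callable[[Optional[A]], Optional[B]]:
    """Lift a function on plain values to a function on optional values."""
    def lifted(m: Optional[A]) -> Optional[B]:
        return None if m is None else f(m)
    return lifted

maybe_len = fmap_maybe(len)   # str -> int becomes Optional[str] -> Optional[int]
```

Both the input and the output of the lifted function live in the same world of ordinary types, which is the sense in which the functor goes from one category back to the same category.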

However, it is not true that the only functors (in the category theory sense)
are instances of the `Functor` typeclass.

------
still_grokking
Very nice blog post.

While we're at it, have a look at "Functors, Applicatives, And Monads In
Pictures"[1]. No silly comparisons to burritos or anything like that, only a
few pictures that anybody can easily remember.

[1] http://adit.io/posts/2013-04-17-functors,_applicatives,_and_monads_in_pictures.html

------
Stwerner
Shot in the dark here, but I vaguely remember a blog post that used birds in
an explanation for monads and I can't seem to find it again. It had hand-drawn
birds, and I think it started out with the identity monad, represented by a
bird that could only say its own name.

Does anyone remember that post? I'd love to find it again, I got really
excited when I saw the title, thinking that it had finally come around again.

~~~
pm-mk
Are you referring to the bird songs from Raymond Smullyan's "To Mock a
Mockingbird"?

I found this breakdown after a cursory search:
http://dkeenan.com/Lambda/

~~~
Stwerner
Ah you know what, I think I am. I guess I was stuck thinking about Monads
rather than the lambda calculus. Thank you so much!

