
IO Monad Considered Harmful - ryantrinkle
http://blog.jle.im/entry/io-monad-considered-harmful
======
JoshTriplett
I've been writing Haskell code for years, and this is one of the most
singularly _useful_ articles I've read on it. I've certainly found the
proliferation of "monad tutorials" and "learn this so you can print something"
guides obnoxious, but it never occurred to me that IO really could be taught
completely separately from monads.

Here's the outline of what the middle of a tutorial could look like, following
this advice:

Here's how to print a string (putStr, putStrLn). Here's how to print something
other than a string (show, print). Here's how to print two somethings in a row
(>>). Here's how to read something and print that (>>=). Here's how to print
something in a function other than main (:: IO ()). Here's how to read
something in a function other than main (:: IO a, >>= again, return). Oh by
the way, it turns out that pattern (return, >>=) is so common that Haskell
gives it a special syntax (do notation). For instance, it works with the
previously introduced Maybe. And with List. Remember typeclasses? You can
write functions that work with IO, Maybe, _and_ List (Monad m => ...).
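
A rough sketch of that progression, one definition per step. All of the names here (`printBoth`, `echo`, `readShouted`, `pairUp`) are made up for illustration and don't come from the article or any particular tutorial.

```haskell
-- Print a string, then something that is not a string.
printBoth :: IO ()
printBoth = putStrLn "hello" >> print (42 :: Int)

-- Read a line and print it back.
echo :: IO ()
echo = getLine >>= putStrLn

-- Read in a function other than main, handing a result back with return.
readShouted :: IO String
readShouted = getLine >>= \line -> return (line ++ "!")

-- The same action in do notation.
readShouted' :: IO String
readShouted' = do
  line <- getLine
  return (line ++ "!")

-- One function that works for IO, Maybe, and lists alike.
pairUp :: Monad m => m a -> m b -> m (a, b)
pairUp ma mb = do
  a <- ma
  b <- mb
  return (a, b)

main :: IO ()
main = do
  printBoth
  print (pairUp (Just 1) (Just 'x'))        -- Just (1,'x')
  print (pairUp [1, 2] "ab")                -- [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
  pairUp (return 1) (return 'x') >>= print  -- (1,'x')
```

Only the last definition needs the word "Monad" at all, and by then the reader has already used the pattern several times.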

Even when actually talking about monads, the explanation should come first,
and the term should come later. "It turns out these operations are a common
enough pattern that Haskell gives them special syntax..." comes across much
more sanely than "now we're going to introduce something [complicated] called
a monad".

~~~
millstone
I think your suggestion is a relative improvement over most tutorials. But
it's still unmanageably hard.

"Here's how to read something and then print that: >>=". Sure, like `getLine
>>= putStrLn`. Technically speaking, that's correct. But this doesn't
generalize at all.

Maybe I want to greet the user by name. How do I read the name, prepend
"Hello, " to it, and then print that? This is a totally natural thing to want
to do, and is very straightforward in Java. But our Haskell newbie hits a wall
- none of your cases cover it!

Here's how I might write that today, without using do notation: `getLine >>=
(return . (++) "Hello, ") >>= putStrLn`. Look at that - it's totally nuts, and
I would not attempt to explain that to a Haskell neophyte.

Using do-notation, you can make things nicer:

    
    
        do
         name <- getLine
         putStrLn $ "Hello, " ++ name
    

but this introduces a bunch of new syntax to learn: the difference between
let-in and let and <-, how do-notation works, etc.
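
To make that difference concrete, here is a small sketch of the same greeting written both ways. The names `greeting`, `greetUser` and `greetUser'` are hypothetical; the point is only where `<-`, the do-style `let`, and the ordinary `let ... in` each show up.

```haskell
greeting :: String -> String
greeting name = "Hello, " ++ name

-- do-notation version: `<-` names the result of running an action,
-- while `let` (with no `in`) names an ordinary pure value.
greetUser :: IO ()
greetUser = do
  name <- getLine
  let message = greeting name
  putStrLn message

-- The same action written with >>= and an ordinary let ... in.
greetUser' :: IO ()
greetUser' =
  getLine >>= \name ->
    let message = greeting name
    in putStrLn message

main :: IO ()
main = putStrLn (greeting "World")  -- replace with `greetUser` to read a name from stdin
```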

I don't have a better approach than what you suggest. I suspect that this
stuff simply can't be made easy.

~~~
JoshTriplett
> Here's how I might write that today, without using do notation: `getLine >>=
> (return . (++) "Hello, ") >>= putStrLn`.

Don't write point-free code here. Here's a fairly simple version:

    
    
        getLine >>= \name -> putStrLn ("Hello, " ++ name)
    

The only novel thing there is the >>= operator itself. You don't even have to
introduce return there; you can introduce return later.

~~~
rapala
Which can then be made to look exactly like the

    
    
      getLine >>= putStrLn
    

case by doing

    
    
      greet name = putStrLn ("Hello, " ++ name)
      main = getLine >>= greet
    

(Edited to include the definition of main)

~~~
millstone
If this is meant to be under main, you need a "let":

    
    
        main = do
          let greet name = putStrLn ("Hello, " ++ name)
          getLine >>= greet
    

For whatever reason, I didn't learn about the magic let syntax in do-notation
until relatively late. Anyways this is very elegant, but not something I could
have generated early on, especially because it mixes two different sequencing
syntaxes (do-notation and >>=).

~~~
wz1000
The 'do' is superfluous.

It would be much better as

    
    
        main = let greet name = putStrLn ("Hello, " ++ name)
                    in getLine >>= greet

------
millstone
I read the article, and it annoyed me, because the rebuttal is obvious and
wasn't addressed.

The rebuttal is do-notation. The second program you write after Hello World is
going to use two IO actions instead of one, and so you need a way to sequence
them, and every tutorial is going to do that with do-notation. And suddenly
all of the other syntax you learned, like how to declare a variable with
let-in, or how arrows go -> that way, or how to call a function, is tossed out the
window. And you'll spend ten minutes before you realize your indentation is
wrong and then ten more fighting with how you declare a variable and then
apply a function to it before it dawns on you, this is monads.

You don't have to use do-notation, sure. You can use >>= as an infix operator.
It's still totally confusing.

The reality is that monads hit you hard and fast as soon as you want to do any
IO. Not the monad laws or functors or whatever the tutorials focus on, but the
nitty-gritty syntax. This is where I got stuck and gave up (twice), before I
finally got it.

~~~
olavk
Yeah, no way around learning do-notation even for the simplest of programs.
And do-notation is kind of a DSL with a totally different feel than the rest
of the language. So getting started writing even simple toy programs in
Haskell requires you to learn two languages. No denying that Haskell is a
language with a very steep learning curve. (Or is it a very shallow learning
curve? Never understood that metaphor. But you have to learn a lot to get
started.)

However it might make sense in a tutorial or course to teach do-notation first
as a DSL for IO in order to get started. And only later, when the student has
a good understanding of the core language and type system, introduce monads
and show how IO is a monad and how do-notation can be used with other monads.
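
As a sketch of what that later lesson might look like, here is do-notation used with `Maybe` instead of IO. `addStrings` is a hypothetical helper; `readMaybe` comes from `Text.Read` in base.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: add two numbers given as text,
-- giving Nothing if either one fails to parse.
addStrings :: String -> String -> Maybe Int
addStrings a b = do
  x <- readMaybe a
  y <- readMaybe b
  return (x + y)

main :: IO ()
main = do
  print (addStrings "2" "3")      -- Just 5
  print (addStrings "2" "three")  -- Nothing
```

The syntax is the same as in an IO block, but nothing here touches the outside world; the "sequencing" is just stopping at the first `Nothing`.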

The problem with the much maligned monad tutorials is that the audience
typically is beginners who get stuck on the IO monad and do-notation
immediately after writing "hello world". But the tutorials try to explain
monads as a general concept, which requires a pretty thorough understanding of
things like higher kinded types and so on. An understanding the newbie getting
stuck at "hello world" obviously doesn't have yet. Then the tutorials try to
be accessible by using metaphors like a "monad is a spacesuit", a "monad is a
burrito" etc. But knowing that a monad is a burrito doesn't help the newbie
who gets inscrutable compile errors after adding a second putStrLn to "hello
world".

~~~
bmh100
In operations management, a "learning curve" is an efficiency curve plotted
over time. I have most often seen it as a time-per-task-completed curve. The
way it works is this:

The time it takes to produce widget 2 ^ t is BASE_TIME * LEARNING_RATE ^ t.

Example with 120 minute starting time and 90% learning rate:

Widget 1 (2 ^ 0) takes 120 (120 * 0.9 ^ 0) minutes to produce.

Widget 2 (2 ^ 1) takes 108 (120 * 0.9 ^ 1) minutes to produce.

Widget 4 (2 ^ 2) takes 97.2 (120 * 0.9 ^ 2) minutes to produce.

Widget 8 (2 ^ 3) takes 87.48 (120 * 0.9 ^ 3) minutes to produce.

Example with 120 minute starting time and 70% learning rate:

Widget 1 (2 ^ 0) takes 120 (120 * 0.7 ^ 0) minutes to produce.

Widget 2 (2 ^ 1) takes 84 (120 * 0.7 ^ 1) minutes to produce.

Widget 4 (2 ^ 2) takes 58.8 (120 * 0.7 ^ 2) minutes to produce.

Widget 8 (2 ^ 3) takes 41.16 (120 * 0.7 ^ 3) minutes to produce.

If you plotted a curve of minutes per widget over time, you would note that
the graph for the 70% learning rate would have a much steeper slope than that
of the 90% learning rate.
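
The formula above is easy to check mechanically. A minimal sketch, with `timeForWidget` as a made-up name for BASE_TIME * LEARNING_RATE ^ t:

```haskell
-- Minutes needed to produce widget number 2^t, per the formula above.
timeForWidget :: Double -> Double -> Int -> Double
timeForWidget baseTime learningRate t = baseTime * learningRate ^ t

main :: IO ()
main = do
  -- Widgets 1, 2, 4 and 8 at a 90% and then a 70% learning rate.
  print (map (timeForWidget 120 0.9) [0 .. 3])
  print (map (timeForWidget 120 0.7) [0 .. 3])
```

Up to floating-point rounding, the two lists should match the figures in the two examples above.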

Looking at that you might say, "ah, high learning rates are great!" All else
being equal, you would be correct. However, real world tasks with high
learning rates often imply high base times and a proportionately large amount
of time until proficiency or mastery is reached. That's why "steep learning
curves" are "bad things", because it is going to take "a lot of effort" just
to "get good" at the skill.

To bring the point home, here are some contrived examples of "90%" and "70%"
skills:

Version control:

90% - copying and pasting files

70% - git

Word processing:

90% - Microsoft Word

70% - vim and latex

Databases:

90% - Excel documents in a shared folder

70% - PostgreSQL

Web design:

90% - your web host's website builder

70% - Ruby on Rails

70% - hand coding in notepad

This is just an introduction to the concept as I learned it. Two skills might
have the same learning rate, but one might be much more productive than the
other. Still this should help you understand the origin of "steep learning
curve" and why that's usually considered a "bad thing."

------
wyager
Here's how I like to explain Monads to people. I've been told it's a decent
explanation. (It ignores the monad laws and such, but it's a decent conceptual
overview.)

OK, so you know Java interfaces? Haskell has something just like that. They're
called "type classes", though. But it works pretty much the same way. When you
write out a class definition, you write A) the name of the class B) all the
functions the class has to support and C) the types of those functions. (Like
a Java interface definition.)

Haskell has a type class called "Monad". To "implement the monad interface",
in Java parlance, you have to implement two functions. The first is called
"bind", but it's written ">>=". Here's its type signature:

    
    
        (>>=) :: m a -> (a -> m b) -> m b
    

Let's say our Monad was called `Foo`. In Java-land, this means the function
takes a `Foo<a>`, a function that takes an `a` and returns a `Foo<b>`, and it
uses those two things to return a `Foo<b>`.

The other function is "return", which is defined as

    
    
        return :: a -> m a
    

Which, in Java terminology, means it takes an `a` and returns a `Foo<a>`.

If you implement those two functions, you now have a Monad.

If you've been told that Monads do anything in particular, you've been told
wrong. Monads don't necessarily have anything to do with IO, or sequencing, or
state, or anything like that.

It _just so happens_ that those two functions (">>=" and "return") are _super_
useful for representing a whole lot of common things, which happen to include
things like IO and stateful algorithms. Monads don't _have_ to do either of
those things, but they certainly can.
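
For the curious, a minimal sketch of "implementing the interface" for a type called `Foo`. Here `Foo` is assumed to be nothing more than a wrapper around one value. One detail the description above skips: current GHC also wants `Functor` and `Applicative` instances before it accepts a `Monad` instance, and `return` then defaults to `pure`.

```haskell
-- `Foo` is the placeholder name from the explanation above;
-- in this sketch it simply wraps a single value.
newtype Foo a = Foo a deriving (Show, Eq)

instance Functor Foo where
  fmap f (Foo x) = Foo (f x)

instance Applicative Foo where
  pure = Foo
  Foo f <*> Foo x = Foo (f x)

instance Monad Foo where
  -- Here (>>=) :: Foo a -> (a -> Foo b) -> Foo b
  Foo x >>= f = f x

main :: IO ()
main = print (Foo 3 >>= \n -> return (n * 2))  -- Foo 6
```

Note that this `Foo` does no IO, holds no state, and sequences nothing interesting, yet it is still a monad.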

~~~
tjradcliffe
Every explanation of every Haskell feature I encounter reads the same way as
this. You define "return" as `return :: a -> m a` and say this
means it takes "an a" (by which I assume you mean an object of type a?) and
returns a `Foo<a>` (an object of type `Foo<a>`?).

The problem I have is: what on earth does this have to do with 'm'?

Is 'm' equivalent to 'Foo'?

If so, why don't you mention that? It isn't obvious to a newb who is lost in
the welter of syntax.

Does 'm a' mean the same thing as `Foo<a>`? If so, why not use
'Foo' in your Haskell rather than 'm', which is simply introduced without any
description, mention or explanation? Is there some significance to the naming?
This is the kind of question that a newb specifically cannot answer, and
Haskell tutorials are notoriously lax about. They have a strong tendency to
assume you know the syntax of the language, which they use to explain the
syntax of the language.

You've given a nice, detailed explanation, but have failed to describe what
'm' is, and that means your average newb (me, or possibly even someone much,
much smarter than me, as I'm told such people exist in profusion) is left
making guesses. I'm pretty sure I know what the correct guesses are, but I
really want to know: why did you use a completely different, unrelated, and
opaque name for 'Foo' in your Haskell example, and then talk exclusively about
'Foo' in your explanation?

This isn't really a knock at you personally: as I said, every Haskell tutorial
I've encountered has similar issues. I'm just deeply curious as to why this is
the case (and painfully aware that my own explanations regarding other
languages are likely as opaque to newbs as these Haskell explanations are to
me.)

~~~
kazagistar
The fundamental problem is higher-kinded types. When you define an interface in
Java or C#, it has no way of referring to itself. Something like "compareTo"
cannot say "the other type must be exactly equal to this". So let's invent such
a syntax (using a keyword "as") and write the Monad interface:

    
    
        interface Monad as m {
            const m<b> bind<a,b>(m<a>, Function<a,m<b>>);
            const m<a> return<a>(a);
        }
    

If you fill in the implementing class for m (like IO or List or Maybe), you
get the actual type signatures that would maybe be valid in C# or something.
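
In actual Haskell, "filling in m" looks roughly like this. The names on the left (`bindMaybe`, `bindList`, `bindIO`, `returnMaybe`) are made up for this sketch; each one is just `>>=` or `return` with `m` replaced by a concrete type.

```haskell
-- (>>=) with m = Maybe
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe = (>>=)

-- (>>=) with m = [] (lists)
bindList :: [a] -> (a -> [b]) -> [b]
bindList = (>>=)

-- (>>=) with m = IO
bindIO :: IO a -> (a -> IO b) -> IO b
bindIO = (>>=)

-- return with m = Maybe
returnMaybe :: a -> Maybe a
returnMaybe = return

main :: IO ()
main = do
  print (Just 3 `bindMaybe` \n -> returnMaybe (n + 1))  -- Just 4
  print ([1, 2] `bindList` \n -> [n, n * 10])           -- [1,10,2,20]
  return "from IO" `bindIO` putStrLn                    -- from IO
```

So `m` is a type variable standing for whichever type implements the class, which is the role `Foo` played in the Java-flavoured explanation.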

------
barosl
I've also experienced similar failures. I've tried to learn Haskell several
times, and every time I meet the Monad section in the tutorial I lose my
concentration. One of my friends who is proficient in Haskell told me to just
start coding without trying to understand the advanced concepts, but I wasn't
inclined to do that as Monads were hanging over my head, and I really wanted
to understand what they are before writing some piece of code.

Funnily enough, things changed after I learned Rust. (I know I sound
like a Rust zealot, but it's true!) After using Rust's `Option` type, and
realizing its `.map()` and `.and_then()` are similar to `fmap` and `>>=` in
Haskell, I finally began to grasp what functors and monads are. And that led
me to understand the "implementation details" of Haskell. Before that, I
thought `seq` was purely written in Haskell. But it wasn't. It is basically
compiler magic, and GHC specializes `seq` to allow eager evaluation. The same
thing goes with the IO type; the monad itself has nothing to do with making
I/O be done. I/O is possible solely thanks to the implementation details of
the IO type, which resides in GHC. Plus, I can also guess that that part of
GHC is probably not written in Haskell, because it is not possible to make
side effects in "real" Haskell.
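
A small sketch of the `Option` correspondence on the Haskell side, using `Maybe`. `halve` is a hypothetical helper, and the Rust calls in the comments are only rough analogues.

```haskell
-- Hypothetical helper: halve a number only if it is even.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  print (fmap (+ 1) (Just 4))  -- roughly opt.map(|n| n + 1); prints Just 5
  print (Just 4 >>= halve)     -- roughly opt.and_then(halve); prints Just 2
  print (Just 3 >>= halve)     -- Nothing
  print (Nothing >>= halve)    -- Nothing
```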

So, Rust was my admirable Haskell teacher. (Thanks, Mozilla!) It's like a
stepping stone between the imperative world and the functional world. The
funny thing again is, after having fun with Rust for months, now I have fewer
reasons to go back to learn Haskell... Rust is already _too_ powerful,
performant, and easy to understand, so Haskell feels less appealing to me.
Haskell had been enviable to me in the past, but now it isn't anymore,
(un)fortunately.

~~~
DanWaterworth
There are lots of things in Haskell that I miss when I use Rust, so I'd still
encourage you to learn it. For example, concurrency in Rust is OK, certainly
nicer than other imperative languages, but Haskell has easily an order of
magnitude more awesome to offer.

I recently wrote a library for testing concurrent code. It's basically fuzz
testing of scheduler behaviour to root out race conditions. In most languages,
you'd need a special compiler or runtime to do this, in Haskell it's a
library.

------
hobs
I can definitely say that my short sidequest to Haskell began and ended with
trying to figure out what a monad was, and concluding that this was a silly
first topic as well. I am glad I wasn't the only one.

------
platz
The thing that makes an 'IO t' _data structure_ different from some arbitrary
Java bytecode, is that bytecode is an opaque black box, and represents
commands that will be executed immediately, whereas an 'IO t' just "describes"
the intended action, remains composable and is a first-class value (i.e. _data
structure_ ) in your program.

It's necessary to draw a distinction between the "runtime" and "evaluation" in
Haskell; it's not necessary afaik to draw this distinction in other languages,
which I think is where the confusion comes from.

In Haskell _evaluation is pure_ , the runtime is not, but evaluation is what
allows you to reason about your program.
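
One way to see the evaluation/runtime split, as a sketch: build a list of IO values, inspect the list, and only later hand the actions to the runtime. `bumps` is a made-up name; the `IORef` counter is there only to make "did anything run yet?" observable.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Three actions that would each bump a counter; building the list runs none of them.
bumps :: IORef Int -> [IO ()]
bumps counter = replicate 3 (modifyIORef counter (+ 1))

main :: IO ()
main = do
  counter <- newIORef 0
  let actions = bumps counter
  print (length actions)       -- 3: the list is evaluated, nothing is executed
  readIORef counter >>= print  -- 0
  sequence_ (reverse actions)  -- now the actions are actually performed
  readIORef counter >>= print  -- 3
```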

~~~
millstone
IO t is a black box too. If it's a data structure, it's a crappy one: you
can't pick it apart like a list, inspect it, or do anything except sequence
it.

I didn't follow what you meant about "runtime vs evaluation." To me they seem
hopelessly intertwined: if you evaluate head [], you get a runtime error.

~~~
pestaa
It's not a black box in the sense you can move it around, evaluate later, not
evaluate at all, etc. There is fundamentally nothing you can inspect about a
general IO operation: it's a handle to a computation dealing with the external
world.

~~~
millstone
I'm pretty sure you can move black boxes around in the real world too :)

There are definitely sensible things to do with "general IO operations." For
example, consider `withArgs`, which replaces argv for the duration of an
action. A natural approach is to pick through the received action, and replace
all calls to `getArgs` with `return tempArgs`. But `withArgs` doesn't work
this way because Haskell can't pick apart IO actions. Instead it uses a nasty
FFI to modify a global. (It doesn't even look thread-safe: I'll bet withArgs
"leaks" into other threads.)

It's also overbroad: IO encapsulates "real world" computations like deleting
files, but also basic operations like getting argv! I'd argue it's a failure
of Haskell that the type system does not distinguish between a function that
erases your hard drive and a function that gets your command line arguments.
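
For readers who haven't met `withArgs`, a minimal sketch of its observable behaviour, using `getArgs` and `withArgs` from `System.Environment`. It shows only what a caller sees; it says nothing about how the function is implemented or how it behaves across threads. The argument strings are arbitrary.

```haskell
import System.Environment (getArgs, withArgs)

main :: IO ()
main = do
  -- Inside withArgs, getArgs reports the substituted arguments.
  inner <- withArgs ["--verbose", "input.txt"] getArgs
  print inner
  -- Afterwards, getArgs reports the program's real arguments again.
  outer <- getArgs
  print outer
```

Both `getArgs` and the wrapped action have plain `IO` types, which is the "overbroad" point: nothing in the types says that this code only looks at argv.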

~~~
pestaa
You either enforce 100% purity, or your language is not pure. Haskell chose to
walk the former path. In that case reading global state needs to happen in the
IO monad even if it is reading a couple of command line arguments. Though I
completely agree it is nasty to change them at runtime, it is also assumed you
know what you're doing when you tinker with them (cross-thread implications
included.)

To a Haskell program, command line arguments are exactly the same outer world
as your super important files.

------
lclarkmichalek
Ok, so we can't use the word monad because that's bad apparently. How do we
answer "How does Haskell, a FP language where composition is a very important
concept, compose IO actions?". Because that's almost always done using monadic
actions.

Aside, I agree on the List monad thing. I never ever use the list monad, why
would I. But if I'm managing state (i.e. IO, Reader/Writer, Conduit etc), then
why wouldn't I use the name of the language construct that Haskell provides to
manage state?

~~~
tome
> How do we answer "How does Haskell, a FP language where composition is a
> very important concept, compose IO actions?".

You can answer "By using the combinators provided for composing IO actions".

The fact that the structure encoded by (some of) these operations happens to
correspond to a monad is something that can be left unmentioned until later.

------
jitl
I wish the author had explained what the difference between using the IO type
and using the IO type via the monadic interface looked like. As a Haskell
beginner, this article just makes me more confused. Can someone share an
example or clarify?

~~~
kazagistar
If you are starting Haskell, there is a wonderful little function you can use
called `interact`, which has a type `(String -> String) -> IO ()`. This takes
a String to String function, and turns it into an IO type that reads from
STDIN, runs the function, and writes to STDOUT, lazily. Thus, you can just
write `main = interact myFunction` and get everything you need to build
complex Unix pipeline utilities without needing to worry about any impurity or
composition. I suggest you stick to this until you are comfortable with the
basic syntax, types, laziness, and typeclasses of Haskell.
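
A minimal sketch of such a program. `shout` is a hypothetical transformation; any `String -> String` function would do in its place.

```haskell
import Data.Char (toUpper)

-- Hypothetical transformation: upper-case everything on standard input.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = interact shout
```

Compiled to an executable named, say, `shout`, this could be used in a pipeline like `echo hello | ./shout`. All the logic lives in a pure function, and `main` is one line.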

The problem with learning about the monad typeclass first is that it is a very
very high level abstraction, and abstractions are near impossible to
understand without a working understanding of examples that they might
abstract.

~~~
jitl
This sounds amazing. I was one of those beginners who thought I had to learn
monads right away to get anything done, became frustrated, and never
progressed.

------
chris_wot
"Only a Sith deals in absolutes" - I see what you did there, you Sith Lord
you!

------
jkot
"Avoid success at all costs" comes to mind.

~~~
tel
As per usual, it was meant to be read as "Avoid: (Success at all costs)" not
"(avoid success) (at all costs)".

~~~
dllthomas
My understanding is more that it was meant to be read both ways, but with the
former understood to be serious and the latter a joke.

------
tjradcliffe
This is a fair reflection on one of the biggest barriers to newbs learning
Haskell (which I've tried to do a couple of times and failed, in part because
the syntax is weird and poorly explained, in part because the tutorials suck,
and mostly because I'm not very smart, or so I'm told.)

The Haskell newb gets told two things very early on:

1) Haskell is pure pure pure!

2) Haskell can do IO via monads!

These statements are contradictory. Haskell is either pure, or can do IO (via
monads or anything else.) No pure language can do IO. Period. End of story.
Any claim to the contrary is simply, utterly, completely false. I don't care
how much you wish it was true, or think it's possible to finesse the issue.
You can't. And the worst thing you can possibly do is introduce a newb to the
new language via a contradiction.

It's like idiots who introduce "negative resistance" with the up-front claim
that a device has "negative resistance" without first introducing the
generalized concept of resistance that the notion of "negative generalized
resistance" depends on. Ordinary, unqualified "resistance" means V/I, not
dV/dI, and if you use the unqualified term "resistance" in any other way you
are deliberately and with malice aforethought creating confusion. Same goes
for "negative temperature", doubled.

Want to teach Haskell? Start like this: "Haskell is an effectively pure
language that hides side-effects behind opaque interfaces that return new
objects that represent the modified state of the world, rather than old
objects that have been modified. This trickery lets us reason about Haskell
code in most cases as if it was pure _even though it is not_."

Claiming purity where there is none is terrible pedagogy. The newb will be
confused and have a much harder time getting their head around the language.
They will realize that the insiders are working in a different conceptual
universe than they are, but won't see any way into that universe and will
rightly assume that the insiders are being deliberately obtuse for the sole
purpose of excluding newbs.

I am, as stated above, extremely stupid. As such, I am pretty ordinary in this
regard. If you want people to learn the damned language, you have to cater to
stupid people like me. So discard, play-down, reject and ignore the myth of
purity, because that's what it is. A myth. Haskell is not pure. In a pure
language the return value of a function depends only on its arguments, but any
function that does IO can return any value whatsoever, regardless of what its
arguments might be.

I'm sure Haskell purists will respond to this insisting that no, Haskell is
really, truly, purely pure... for a certain definition of "purity". It isn't.
Monads are a hack to cover necessary impurities. They are a good hack, but to
redefine "pure" in such a way that accommodates monads--and this is what you
have to do--is just like redefining "resistance" to accommodate "negative
resistance". It's perfectly reasonable to generalize the definition of common
terms, and perfectly confusing to use the generalization to imply that you
mean "just perfectly ordinary resistance" when talking about "negative
resistance," or "just perfectly ordinary purity" when talking about "purity
with side effects hidden behind monads."

There is no particular value, use or virtue in claiming Haskell is "a truly
pure functional language". Claiming the language is "effectively pure for most
practical purposes, so you get the provability of pure functional programming
by hiding side effects behind monads" seems far more useful. Why the Haskell
community is so wedded to the myth of purity--and therefore so insistent about
putting monads front and center, as the article rightly decries--is not clear.
But the language and the community would be better for putting the myth and
the monads behind them.

~~~
tel
To go ahead and be that Haskell purist... in normal Haskell code there are no
functions which execute side effects in the process of computing their
results. Even a function like

    
    
        putStrLn :: String -> IO ()
    

merely produces an IO value. It is only when such a value is sequenced into
something presented to the RTS as `main` that the effects are realized.

This seems like pedantry, but it allows you to construct and compose IO values
in many ways (besides just sequentially evaluating them). This ends up giving
you "macro like" powers.
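
A small sketch of what those "macro like" powers can look like in practice. `twice` and `onlyIf` are hypothetical combinators written as ordinary functions; they work because their IO arguments are values that have not been run yet.

```haskell
-- Run the same action two times in a row.
twice :: IO () -> IO ()
twice action = action >> action

-- A home-made `if` statement: run the action only when the condition holds.
onlyIf :: Bool -> IO () -> IO ()
onlyIf condition action = if condition then action else return ()

main :: IO ()
main = do
  let hello = putStrLn "hello"  -- nothing is printed by this line
  twice hello
  onlyIf False (putStrLn "never printed")
  onlyIf True (twice hello)
```

In a language where calling `putStrLn` printed immediately, `onlyIf` could not be an ordinary function, because its argument would already have run before the condition was checked.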

~~~
tjradcliffe
Thanks for this answer. I think it points clearly to the source of confusion.

To be pedantic back at you, I'm pretty sure your explanation translates to:
"So long as you never run a program, Haskell is a pure language."

Pedantically speaking, you can't have it both ways. Either the "Haskell" of
the claim "Haskell is pure" includes the RTS or it does not. If it does, it is
not pure but can run programs. If it does not, it is pure but cannot run
programs.

A programming language that you can't run programs in is not very useful or
interesting, so I'd really think it was a better move to define "Haskell" in
such a way that programs can be run, and accept that under that definition
Haskell is impure, although it retains a kind of effectively pure formalism
that allows provability and such, which is awesome.

Either way, it is really important to emphasize in any description of the
language that "At runtime Haskell is not pure". Otherwise newbs like me will
simply be confused by the contradiction between the claim "Haskell [considered
as an unexecutable language] is pure" and "Haskell [considered as a set of
instructions to the RTS] is capable of doing IO". The two completely different
meanings of "Haskell" must be made clear if profound confusion is to be
avoided.

~~~
tel
Eh, I don't care much about names. Purity-up-until-RTS being taken seriously
makes it easy to think of computations both pure and impure. I get to choose
whether to interpret IO as an imperative, impure language subset of Haskell or
as a pure type. Both have their advantages.

If your language is impure then you usually can _only_ choose the former.

