Why Haskell is suited to IO

ankrgyl · on July 4, 2011

This rant strives to dispel two myths:

1.) Haskell is "a bad language to write IO intensive programs in"

2.) "maintenance is more expensive as Haskell programmers are thin on the ground"

I was convinced of neither.

First, just because your own (potentially biased) benchmark shows that GHC performs "very favorably" doesn't mean this is true in general. For anyone to buy that argument, you need to provide code or at least a precise description of the experiment. These guys did a good job of that: http://www.yesodweb.com/blog/2011/02/warp-speed-ahead. Also, just because iteratees are fast, doesn't mean they're "good" for writing I/O intensive programs in. They're really difficult to understand, so most people end up using lazy I/O. Part of not being bad for I/O programs is making the task approachable, and I'm not convinced of that by this article.

The argument to dispel #2 is vague too.

> If you agree that programs that are more correct are also of higher quality then the you should also agree that the cost of maintenance for Haskell programs is lower than for imperative programs.

Okay... You can still write errors in Haskell, they're just less frequent (at least that's what this statement suggests). But how can you maintain them if no one is willing to actually do it? If it's N times easier to maintain a Haskell programmer, there need to be at least 1/N as many Haskell programmers as there are imperative programmers, and the article doesn't address this at all. As someone who likes Haskell, I'm actually curious to know if this is true.

tmorgan · on July 4, 2011

Yes, the argument presented for point one is pretty weak, but I do agree with it. Having the lazy I/O, and especially lazy bytestring option is great IMO. Although Oleg style left-fold enumerators as used by warp are superior in many respects, typically as a user you don't need to interact at that level; iirc you don't in Yesod, which is built on warp. As an example of approachability LBS approach, checkout this two line "wc -l" implementation, which beats "wc -l" by 64x, apples and pears no doubt, but shows fast enough,(from http://www.mail-archive.com/haskell@haskell.org/msg18878.htm...):

> import qualified Data.ByteString.Lazy.Char8 as L

> main = L.getContents >>= print . L.count '\n'

edit: actually it beats "wc", and is roughly same speed as "wc -l".

ankrgyl · on July 4, 2011

That's really cool, and is the kind of good benchmark I was referring to. I'm curious to see how it compares to an iteratee approach. I suspect in this case that there wouldn't be much of a difference, because iteratees are suited especially well for handling multiple concurrent streams, and this deals with only one.

chc · on July 4, 2011

Actually, there merely need to be 1/N programmers capable of using Haskell. There's no separate species known as a "Haskell programmer."

eru · on July 4, 2011

There are. Some people, who after having tasted the forbidden fruit, refuse to program in lesser languages.

DanWaterworth · on July 5, 2011

Or just get irritated with lesser languages. I know I do this.

MostAwesomeDude · on July 5, 2011

What I always found interesting was the implication that programs proven correct are also free of logic errors more of the time, and worse, that they are free of algorithmic complications. To be fair, the latter point is slowly being addressed, although it rips massive holes in the purity of Haskell and Scheme. I mean, who wants to see explicit tail recursion all the time, or mutable large data structures? But the alternative is slow code.

I think that Haskell's forced purity is roughly as good at catching logic errors as the strict typing of C++ or the clean syntax of Python: It catches some stuff, some of the time, but never everything, and never reliably. At some point, the onus is on the programmer to write good code.

orangecat · on July 4, 2011

People's ravings about Haskell are very effective at revealing my inner Blub-ness:

The pattern boils down to creating an expert system that is pure and encodes a state machine and creating a system within the IO monad that operates by querying the expert system. Both parts would be written in a synchronous style, but would use coroutines to mask the passing of control flow.

Um, awesome. You do that, and I'll continue to "print 'Hello world'".

kemiller · on July 4, 2011

I think the basic issue most people have with Haskell (or rather, with Haskellers) is the fetishization of 100% correctness. For most tasks, it's just not that important, and so the (considerable) extra cognitive load that Haskell imposes is a net negative.

I mean, precise joinery may be superior to hammers and nails when building a house, but it's an expensive proposition, and there are 5 people in the world who can do it properly.

Most of us monkeys, when coding, are aiming to accomplish some task in the notoriously incorrect real world. It is often much better to spend 1/5th the engery get to a 80-90% correct solution very fast, and then patch as necessary, for the very important reason that you don't even know if what you're attempting to do will work. In that mode, like carpenters, sometimes it's helpful to just bang it with a hammer.

If what you do is write compilers, well, OK. The description you're quoting sounds pretty good. But the world doesn't need tons of compiler writers, and those it does need typically have pretty stable specs in front of them, so maybe joinery is the right approach.

I think this guy gets it about right: http://alarmingdevelopment.org/?p=440

tmorgan · on July 4, 2011

An amusing analogy, sure, but I can't help be reminded of this xkcd http://xkcd.com/568/ - "you'll never find a programing language that frees you from the burden of clarifying your ideas". Haskell just makes that trade-off more 'up front'. Obviously it's not the right tool for every job mind, of course sometimes quick and dirty scripts win.

kemiller · on July 5, 2011

Well, sure, but frequently I don't want to clarify all my ideas up front. I want to dip my toe in and see if my ideas have any legs at all. (Wow, tortured metaphor.)

I actually like Haskell. But I do think it's the wrong tool for a lot of the work that people here on HN tend to do.

eru · on July 4, 2011

Yes.

In Haskell, you pay upfront.

For example, expressing any control flow as a monad may be harder than just muddling through, but your components are guaranteed to compose afterwards.

mattdeboard · on July 4, 2011

Here's what I don't get about Haskell's.. philosophy I guess.

(Disclaimer: I've only spent about a week learning me a Haskell for great good, and that was a couple weeks ago, so I'm not an authority by any measure.)

The Haskell community uses the word "pure" a lot, and I don't get it whatsoever. Even this musing says it's "easier to write correct pure code than correct imperative code." What does that mean? Is "pure" code the opposite of "imperative" code?

mitchellh · on July 4, 2011

So, "pure" is the opposite of "impure" and "lazy" is the opposite of "strict" (but often cited as "imperative). Haskell is a pure, lazy functional programming language. I'll talk about purity here, and not about laziness.

Pure means that functions are not allowed to have side effects. More formally, a pure function can only use its inputs (parameters) to generate its output. Therefore, a pure function can't modify a global variable, print to the screen etc. because those are all side effects (more on this later). The benefit of this is threefold:

* Testing is much safer and generally easier. You just have to test the various inputs and edge cases and you can pretty much guarantee the function is 100% correct. You can't get this guarantee in impure languages.

* Inherently concurrent. Since functions don't modify any global state, you're free to make this as concurrent as you want without worrying about "thread safety" and all that.

* Increased modularity. Pure functions are highly portable, since they don't modify state they can be used for any library if needed, really. In an impure language, you often have to modify functions for use in a more general sense as a library.

Of course, the real world has side effects. Almost every program we run is a side effect, since we're giving it input from the world and it uses that to print to a screen. So how can a program be pure at all???

Without going into too much detail, there are ways in Haskell to isolate impurity to a small part of the code which is clearly marked as impure. This impure code is responsible for say, reading a string from the user input. This string can then be used to make function calls to pure functions.

So the way you do things is to put all your business logic and such in pure functions, then the impure functions just deal with shuttling values to and from the real world. You can test pure functions with theorem checkers and you test impure functions with unit tests and integration tests.

Haskell has a steep learning curve since it has so many concepts which still simply haven't made it into more industry-standard languages. This is getting better with language such as Scala becoming more mainstream.

I really love Haskell and even if I don't use it on my day-to-day, it is great to hack out a Haskell program every once in awhile.

Hope this helps!

- Mitchell

ankrgyl · on July 4, 2011

Great explanation. Just a small point though,

> "lazy" is the opposite of "imperative."

should be: "lazy" is the opposite of "strict" or "eager". "Imperative", in this context, is the opposite of "functional".

chc · on July 4, 2011

It's basically impossible to have a lazy imperative language, though, because "a sequence of statements" is the definition of an imperative program, and laziness leaves the ordering of that sequence undefined.

ankrgyl · on July 4, 2011

Scala pulls it off. Here is one example: http://www.bluishcoder.co.nz/2006/07/scala-futures-and-lazy-...

chc · on July 5, 2011

I don't mean to split hairs, but Scala isn't a lazily evaluated language — it's a strict language that allows optional lazy evaluation, and AFAIK it only really works well if you're programming Scala in a more or less functional style. I'm pretty sure Scala's lazy values still lead to unpredictable behavior if you lazy up something that has side effects.

michaelschade · on July 4, 2011

Essentially implied by what Mitchell has said, but it's good to point out that pure functions have referential transparency, which means that for a given input, the function will always give the same output. (So, `myFunc 7' always yields the same answer), which makes it easier to reason about your code and have more reliable tests against these functions.

More reading here: http://www.haskell.org/haskellwiki/Functional_programming#Pu...

mattdeboard · on July 4, 2011

Great explanation, actually. But one question. I thought functional languages were "pure" throughout. I mean, I thought the whole point of a functional language is that there are no side effects and every expression returns some value. How is "pure" distinct from "functional"?

edit: BTW, Haskell DOES have a steep learning curve, and the inherent recursion breaks my brain. Even simple things like:

`combos = []` `combos = combos + 'foo'`

hurt, because counterintuitively (to me, anyway) this winds up in infinite recursion. I'm very used to OOP so this sort of thing is hard on me. I'm old though.

edit: I'm a little bummed out by the downvotes on my original question. It was an honest academic question I had and was answered splendidly. I was not attacking Haskell whatsoever. I apologize if anyone took it as anything other than an honest question.

mitchellh · on July 4, 2011

  Great explanation, actually. But one question. I thought functional languages were "pure" throughout. I mean, I thought the whole point of a functional language is that there are no side effects and every expression returns some value. How is "pure" distinct from "functional"?

Oh, no no. Pure is not required for functional programming. Functional programming is mostly defined as the ability to pass around functions as 1st class data types. In fact, I noticed on your GitHub you do some Python. Here is functional programming in Python:

  # Sum a list in Python using FP
  items = [1,2,3,4,5]
  function = lambda x, y: x + y
  reduce(function, items) # => 15

  # The equivalent in haskell:
  foldr1 (+) [1,2,3,4,5]

Pure is a stricter form of functional programming, which Haskell enforces.

  edit: BTW, Haskell DOES have a steep learning curve, and the inherent recursion breaks my brain. Even simple things like:
  `combos = []` `combos = combos + 'foo'`
  hurt, because counterintuitively (to me, anyway) this winds up in infinite recursion. I'm very used to OOP so this sort of thing is hard on me. I'm old though.

You're stepping now into the "lazy" aspect of Haskell. You're allowed to have infinite recursion in Haskell because it is only evaluated when needed. I don't really want to get into it here since there are many resources online about it :) But this also isn't inherent to functional programming or pure function programming.

mattdeboard · on July 4, 2011

Interesting, thanks for taking the time for in-depth explanations.

eru · on July 4, 2011

Also you don't need to use recursion too much, when writing Haskell code. You should stick to using combinators over your (recursive) data-structures. Think fold, map, filter, and friends.

Alas, you still have to be able to understand recursion to get anywhere.

kemiller · on July 4, 2011

"Pure" in this context is basically short for "purely functional." Most functional languages have exceptions -- notably for I/O. Achieving 100% functional behavior is, it turns out, really really hard, and that's what makes Haskell notable.

yummyfajitas · on July 4, 2011

Pure means that functions have no side effects - a function is purely a mathematical object defined by it's return values.

For example, f(x) = 2*(x+1) is pure. f(3) == 8, and it is completely safe (in terms of yielding the proper return value) to replace f(3) with 8 anywhere in your program.

Similarly, print world "foo" is referentially transparent - print world_in_which_foo_wasnt_printed "foo" == world_in_which_foo_was_printed. It's completely safe to make the substitution (restOfProgram (print world_in_which_foo_wasnt_printed "foo")) -> (restOfProgram world_in_which_foo_was_printed).

Referential transparency makes it easy to reason about programs. A function of type Int -> Int will never touch a variable of type OutsideWorld, and can always be replaced by it's output.

For those wondering what this has to do with monads, it's fairly simple. A pure program without monads requires a lot of plumbing concerning the world variable:

    f: World, Int
        world, result <- action1 world 3
        world2, result <- action2 world
    ...
    return (worldN, result)

Monads are merely syntactic sugar to suppress the world variable.

[edit: fixed formatting.]

dons · on July 4, 2011

Pure code is code with no side effects. The result of a pure function only depends on its input parameters, and not the state of the world. Such code is by-default thread-safe, composable and tends to be simpler to understand than code with multiple interactions with external state.

It also admits many optimizations in the compiler, garbage collector and runtime system that are not possible for code without such safety guarantees.

It is this IO and thread throughput and scaling results that are being hinted at in this post, I think. And projects like Snap and Yesod are taking advantage. http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-...

neilk · on July 4, 2011

An interesting argument, but how about some examples to back it up?

DanWaterworth · on July 5, 2011

All in due time.