
High-Performance Web Applications in Haskell - fogus
http://qconlondon.com/dl/qcon-london-2011/slides/GregoryCollins_HighPerformanceWebApplicationsInHaskell.pdf
======
bobfunk
Can attest to Haskell and Snap being excellent for small web services where
you need a boost - especially when concurrency is involved.

Wrote two small services for <http://www.webpop.com> in Haskell, both based on
Snap. One for dynamically resizing images on the fly and one for handling
uploads of large files and streaming them straight to Cloudfiles.

In both cases I started by trying node.js (both cases deal with either
fetching or sending files from Cloudfiles, so IO operations are what take time)
since the problems seemed very suited for node's eventloop model. In both
cases node.js really disappointed.

Ended up giving Snap a try and got really surprised at just how well it
worked. Performance is amazing, it's not much more code than the node.js
version and there's no endless nesting of callbacks. It handles high levels of
concurrency incredibly well and both services ended up using around 8MB of
memory each - even under load...

Main problem is the lack of libraries that you tend to take for granted in
more webby languages, and obviously the learning curve is pretty steep (but
also very rewarding).

------
pjscott
I had a good experience writing a little web service in Haskell a couple days
ago, an autocomplete server. You give it a prefix, and it returns some number
of strings in its database starting with that prefix. I wanted it to be fast,
and lighter on memory than the approach some people have been taking with
Redis, where they store all prefixes in a sorted set:

<http://antirez.com/post/autocomplete-with-redis.html>

Sounds like a fun toy project, right? I figured I would store everything in a
trie, so I could do really fast prefix searches, and then bolt on a REST
interface that would send back JSON or JSONP. So I wrote some C code to handle
the prefix searching in a JudySL trie (see <http://judy.sf.net/> for details;
it's a nice library) and bolted that onto Haskell and wrote some bridge code.
What I ended up with was a really nice, easy server that took me only a few
hours to write, and was every bit as memory-efficient as I'd hoped. Source
code here, if anybody's curious:

<https://github.com/PeterScott/acdaemon>
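
For anyone who wants the shape of the idea without the C bridge, here is a minimal pure-Haskell sketch of a trie with prefix lookup. It is not the acdaemon code and does not use JudySL; the names `Trie`, `insertWord`, `fromList` and `complete` are made up for illustration, and a `Data.Map`-based trie like this will use far more memory than a Judy array.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')

-- Hypothetical illustration type: each node records whether a stored word
-- ends here, plus its children keyed by the next character.
data Trie = Trie
  { isEnd    :: Bool
  , children :: Map.Map Char Trie
  }

emptyTrie :: Trie
emptyTrie = Trie False Map.empty

-- Insert one word, creating nodes along the path as needed.
insertWord :: String -> Trie -> Trie
insertWord []     t = t { isEnd = True }
insertWord (c:cs) t =
  t { children = Map.insert c (insertWord cs child) (children t) }
  where
    child = Map.findWithDefault emptyTrie c (children t)

fromList :: [String] -> Trie
fromList = foldl' (flip insertWord) emptyTrie

-- Walk down the trie along the prefix; Nothing if the path does not exist.
subtrie :: String -> Trie -> Maybe Trie
subtrie []     t = Just t
subtrie (c:cs) t = Map.lookup c (children t) >>= subtrie cs

-- All word suffixes stored below a node, in ascending order.
toWords :: Trie -> [String]
toWords t =
  [ [] | isEnd t ] ++
  [ c : w | (c, child) <- Map.toAscList (children t), w <- toWords child ]

-- Up to n stored words that start with the given prefix.
complete :: Int -> String -> Trie -> [String]
complete n prefix t =
  take n (maybe [] (map (prefix ++) . toWords) (subtrie prefix t))
```

In GHCi, `complete 2 "ha" (fromList ["haskell", "hask", "erlang", "happy"])` gives `["happy","hask"]`. Because `toWords` is lazy, `take n` stops walking the subtree once enough completions have been produced.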

The learning curve on Haskell, though, is really something. If you really want
to see something ridiculous, try looking at the auto-generated library
documentation for anything involving regular expressions. The type
declarations in Text.Regex.TDFA are like something H. P. Lovecraft might write
about.

~~~
mjcmirr
It's a fair point about Haskell's learning curve. Haskell is different enough
from the languages most people have grown up using that--at least in my
experience--there are times that simply understanding a few lines of code can
take minutes.

On the other hand, that fact has always seemed to me a mark of why it's
worthwhile to persevere: as when you are learning mathematics, things can seem
entirely opaque until they 'click', but then you've really made an advance in
your understanding.

Probably most people are aware of it, but the O'Reilly book _Real World
Haskell_ by Bryan O'Sullivan, Don Stewart, and John Goerzen, is a great way to
get started. The book is on-line at

<http://book.realworldhaskell.org/>

Chapter 8 discusses (one of) the Haskell regexp implementation(s), and in
particular addresses one of the puzzling aspects: polymorphism in return type.
The whole book is excellent.
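
To make "polymorphism in return type" a bit more concrete, here is a toy sketch of the mechanism using only base. It is not the Text.Regex API: the class `MatchResult`, the function `matchLiteral` and the literal-substring matching are all invented for illustration. The point is only that the same call can produce a `Bool` or an `Int` depending on the type the caller asks for.

```haskell
import Data.List (isPrefixOf, tails)

-- Count (possibly overlapping) occurrences of a literal needle.
-- A real regex library would do pattern matching here instead.
occurrences :: String -> String -> Int
occurrences needle haystack =
  length (filter (needle `isPrefixOf`) (tails haystack))

-- Hypothetical class: each instance says how to present a match count
-- as a particular result type.
class MatchResult r where
  fromCount :: Int -> r

instance MatchResult Bool where
  fromCount = (> 0)      -- "did it match at all?"

instance MatchResult Int where
  fromCount = id         -- "how many times did it match?"

-- The result type r is chosen by the caller, not by the arguments.
matchLiteral :: MatchResult r => String -> String -> r
matchLiteral haystack needle = fromCount (occurrences needle haystack)
```

In GHCi, `matchLiteral "banana" "an" :: Bool` is `True` and `matchLiteral "banana" "an" :: Int` is `2`. Without an annotation (or enough surrounding context) the compiler cannot pick an instance, which is roughly the puzzle the chapter walks through.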

(I should also point out that another book, _Learn You A Haskell_, has just
been published, and is well-regarded. I think it assumes rather less
programming experience than RWH, and also covers less of the "real-world"
aspects. It also can be read on-line, at

<http://learnyouahaskell.com/>

and in fact there is a coupon code on that page now for 40% off.)

Also, the Haskell community is, in my experience, extremely helpful and
welcoming. People should come and join!

~~~
pjscott
The place where I've seen Haskell really shine is parsing. Parsers are
remarkably easy to write in Haskell, thanks to Parsec and its various spinoff
libraries. For example, here's a minimal HTTP request parser that Bryan
O'Sullivan wrote by essentially translating part of the RFC into Haskell:

[https://bitbucket.org/bos/attoparsec/src/tip/examples/RFC261...](https://bitbucket.org/bos/attoparsec/src/tip/examples/RFC2616.hs)

It's 54 lines long, pretty trivial, and here's a really cool part: it parses
incrementally on strict bytestrings, so you can just do raw socket recv calls
and pass the chunks of input to the parser. Similarly, I wrote a parser for
the subset of YAML that beanstalkd uses (as part of the hbeanstalk client
library), and it was just 11 lines of code. It's very impressive.
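
To give a flavour of what "translating the RFC into Haskell" looks like, here is a small sketch that parses just the request line. It uses `ReadP` from base rather than attoparsec so that it has no dependencies, which means it works on a whole `String` and does not parse incrementally the way the linked example does. The grammar is simplified (the method is assumed to be upper-case letters only), and the names `RequestLine` and `parseRequestLine` are made up for illustration.

```haskell
import Data.Char (isDigit, isUpper)
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, munch1, readP_to_S, string)

-- Hypothetical result type for this sketch.
data RequestLine = RequestLine
  { method  :: String
  , target  :: String
  , version :: (Int, Int)
  } deriving (Eq, Show)

-- RFC 2616: Request-Line = Method SP Request-URI SP HTTP-Version CRLF
requestLine :: ReadP RequestLine
requestLine = do
  m   <- munch1 isUpper                    -- simplified: upper-case methods only
  _   <- char ' '
  uri <- munch1 (`notElem` " \r\n")        -- anything up to the next space
  _   <- char ' '
  v   <- httpVersion
  _   <- string "\r\n"
  return (RequestLine m uri v)

-- RFC 2616: HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
httpVersion :: ReadP (Int, Int)
httpVersion = do
  _     <- string "HTTP/"
  major <- munch1 isDigit
  _     <- char '.'
  minor <- munch1 isDigit
  return (read major, read minor)

-- Succeeds only if the whole input is exactly one request line.
parseRequestLine :: String -> Maybe RequestLine
parseRequestLine s =
  case readP_to_S (requestLine <* eof) s of
    [(r, "")] -> Just r
    _         -> Nothing
```

Each grammar rule becomes one small parser, and the `do` block reads in the same order as the rule it implements.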

(The concurrency and parallelism stuff is also notably slick, but I haven't
had as much occasion to use it. Haskell's aversion to side-effects really pays
off sometimes.)

By the way, you're right about the Haskell community being friendly and
welcoming -- as you yourself demonstrate. :-)

~~~
alavrik
The HTTP request parser doesn't look readable or trivial to me at all.

I'm not familiar with Haskell, so to me the code looks like a mix of high-
level declarative and low-level specialized constructs (e.g. skipWhile,
takeWhile) interleaved with syntax noise. And it seems to be using quite a lot
of external libraries. Also, correspondence of the code with the HTTP spec is
completely non-obvious.

In contrast, here's an HTTP response parser I wrote in OCaml using just one
library:
[https://github.com/alavrik/piqi/blob/master/piqi-tools/piqi_...](https://github.com/alavrik/piqi/blob/master/piqi-tools/piqi_http.ml)
(see lines 26 - 218).

I've just noticed that the code you are referring to is an example. Well,
looking at the example, I can hardly come to a conclusion that Haskell shines
for parsing.

~~~
pjscott
If you don't know Haskell, and you've never used attoparsec, then I'm not
surprised that you don't find that code to be particularly clear or readable.
Haskell has a steep learning curve, as I said earlier. This is not a very
damning criticism of Haskell's suitability for writing parsing code.

By the way, that long import list is actually just importing some basic stuff
from the standard library, and the attoparsec parsing library. One of the
persistent minor annoyances of Haskell is writing long lists of module
imports.

~~~
alavrik
Fair enough. I didn't mean to criticize Haskell. In fact, I'm planning to
learn it.

I was just surprised that this code was presented as a good example of
Haskell's fit for the parsing domain. It would be interesting to see if my
perception of this code changes once I get more familiar with the language.

~~~
jamii
I write a lot of ocaml code (for money, even!) and haven't touched haskell for
about four years. I still find the haskell version here easier to read. I
imagine what's tripping you up is the operator soup. Whilst ugly (one of the
things that turned me off haskell) it is a lot easier to read once you are
familiar with basic haskell typeclasses (applicative, functor etc). That said,
the ocaml version could definitely be a lot nicer. Check out eg

Yojson (search for 'let positive_int')
<http://forge.ocamlcore.org/scm/browser.php?group_id=153>

Mirage (using MPL for zero copy parsing)
[https://github.com/avsm/mirage/blob/master/lib/net/direct/mp...](https://github.com/avsm/mirage/blob/master/lib/net/direct/mpl/protocols/ethernet.mpl)

~~~
alavrik
Hi there! I do quite a bit of both OCaml and Erlang.

In fact, I borrowed Yojson's json parser for my project (piqi.org). Also, I'm
familiar with MPL. It looks very nice, but from what I heard it is fairly
immature. And you definitely can't parse HTTP this way :)

------
acconrad
Could someone comment on Haskell in comparison to Erlang, Clojure or Scala? I
felt like when I started choosing between Ruby on Rails and Python/Django,
they were so similar that choosing either one would be a fine choice. Now that
I'm interested in functional languages too, it seems that comparing Haskell to
Erlang is just not the same as Python to Ruby.

~~~
jamwt
They're so different and yet so similar it's probably best taking them one
topic at a time. I've written nontrivial things in all of those except clojure,
but I think I still have a pretty good idea what it's about from a bit of
playing around, reading the notes of others, time spent in PLT Scheme etc. I
hope this is helpful for some people looking to dive in--remember, all of this
"IMO/IME".

Type Systems:

Haskell and Scala are closest in that they're strongly, statically typed with
type inference (Hindley-Milner-style in Haskell, more local in Scala). Erlang
is dynamically typed, as is clojure. This tends to break the
advantage/disadvantage scheme along the same lines as
Java/Ruby wrt typing: the former languages generally give you more speed and
safety, and the latter languages can do cool tricks with deciding typing at
runtime, which can aid in expressiveness.

Type System Complexity/Power:

Haskell has the most sophisticated type system--by far, I think it's fair to
say. This adds power but at the cost of a steeper learning curve for people
unused to thinking so deeply about the contracts/promises code and data
structures are making. Scala's is also fairly involved since it's basically a
subset of Haskell's (well, plus traits and inheritance blah)--but not quite as
consistent or clear. Clojure and Erlang have somewhat simpler type systems that
may be more tractable out of the gate--Erlang in particular is pretty
straightforward.

"functional-ness", how sharp a break from OOP/imperative is imposed:

Scala is by far the least dogmatic about doing things in a "functional" as
opposed to an OOP or imperative way. Scala's heavy emphasis on seamless
interop with Java makes it sort of up to the programmer to place their style
on the continuum between something like Java and something like ML. Haskell,
with its purity-by-default, is probably the most opinionated of the group. You
will be forced to approach problems very differently. I'd say clojure follows
just behind, also having a strong bias toward doing functional-only styles.
Erlang is also pretty interested in doing things in a particular way that
could be construed as functional (not rebinding names, focus on immutability,
for example), but its "big idea" is more tailored around its flagship library
and virtual machine.

Syntax:

Erlang will probably feel the most foreign to a 2011 programmer. Erlang
started as a modified Prolog, and the syntactic legacy of Prolog is not
widespread in common contemporary languages. Haskell's syntax is rather clean
and restricted--though the fondness of veteran Haskellers for custom operators
can sometimes be overwhelming for a new programmer. Scala, while a hybrid OOP
language that shares many idioms with Java etc, has dipped so heavily into the
operator well, even for standard language constructs, that the syntax can be a
bit crowded. Imagine Java with twice the non-alphanumeric density. Clojure is a
LISP, so it's very clean and clear syntactically, provided you are a
parenthesis master or make use of SLIME, etc.

Speed (average job, single core, not scalability etc etc):

Haskell, Scala, Clojure, Erlang--descending. Though on certain workloads these
players will switch places, as a general rule of thumb that's my aggregated
observed performance order in "real world-ish" code scenarios. Clojure is
sorta interpolated based on experiences others have relayed to me and online.
From a memory standpoint, Haskell is substantially stingier than the rest.
This impacts startup times and the practicality of shell tools, etc.

Compilation/VM + Concurrency:

Haskell is alone in this group in that it compiles to native code--well, at
least, the most commonly-used configuration does: the compiler ghc. GHC is a
beauty of a project. Very good native code generation, aggressive improvement
projects (new I/O manager, LLVM backend), incredibly advanced concurrency
capabilities (sparks, first class green threads, seamless async IO). Erlang
uses a VM, beam, which also uses green threads (which erlang calls processes)
and transparent async I/O--but, erlang goes even further to provide syntactic
support for super-efficient message-passing between these processes, and
facilities to transparently pass those messages between machines in a cluster.
Clojure and Scala both use the JVM. It is what it is--mature, fast, but
primarily designed for Java and things a lot like Java. Cool features like
tail call optimization and green threads are simply not possible b/c the JVM
does not (currently) support them. (I know Scala has Actors, Akka, etc, but
JVM-based actors are not in the same league as what Erlang and Haskell have.)
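
As a rough sketch of what Haskell's green threads look like in practice, the following uses only `forkIO` and `MVar` from base to run a list of actions concurrently and collect the results in order. The helper names `concurrently` and `sumOfSquares` are made up for this example (the `async` package offers more robust versions of the same idea), and the sketch ignores exceptions: if an action throws, its result is never delivered.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Run each action in its own lightweight thread and collect the results
-- in the original order. Hypothetical helper, no exception handling.
concurrently :: [IO a] -> IO [a]
concurrently actions = do
  boxes <- forM actions $ \act -> do
    box <- newEmptyMVar
    _   <- forkIO (act >>= putMVar box)   -- one green thread per action
    return box
  mapM takeMVar boxes                     -- block until each result arrives

-- Toy workload: one thread per number, each computing a square.
sumOfSquares :: Int -> IO Int
sumOfSquares n = sum <$> concurrently [ return $! i * i | i <- [1 .. n] ]
```

`sumOfSquares 1000` forks a thousand threads and returns `333833500`. Spawning that many threads is cheap because they are scheduled by the GHC runtime rather than mapped one-to-one onto OS threads.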

All of these languages have a REPL, thank god.

Breadth of Libraries:

The JVM languages here obviously have a HUGE wealth of libraries available.
The only concession is that calling them can be somewhat awkward and "downgrade"
you out of the functional idioms you'd rather be using, depending on the
library. Haskell's library situation is pretty good (Hackage has libraries for
most everything you need and is growing quickly every day), and Erlang is
probably bringing up the rear, here. OTP is excellent, and there are many
libraries related to large networked systems etc, but not as much "general"
coverage.

Key Strength:

If you want to write high-uptime distributed systems, Erlang/OTP is the bee's
knees. Everything about the language and runtime is tailored to that
environment--supervisor trees, mnesia, good scheduling of hundreds of
thousands of lightweight processes.

Haskell is great at correctness. Like its non-lazy cousin ML, if you satisfy
the type system, a shockingly high percentage of the time your program is just
flat out correct the first time you run it. Haskell is also pretty darn good
at single-machine multicore concurrency.

Clojure is a great "modern Lisp". Many of the concerns, fairly or unfairly,
leveled at Lisps in the past (no libraries, stagnant platform, closed
development, etc) are non issues. Building on the JVM enables a very small
developer core to make progress on the language while leveraging a mature
platform that evolves independently and at scale.

Scala is a "better Java". While interop with large Java-oriented frameworks is
_possible_ in Clojure, it's pretty much trivial in Scala. You can dip a toe
incrementally into the functional world while retaining your existing
projects, favorite libraries, 3rd party frameworks, etc--transitioning as fast
and far as appropriate.

Summary:

The good news is, all of these languages are pretty damn good, and all of them
actually have healthy, thriving developer communities and razor-sharp
maintainers.

If someone has never done functional programming before, I'd personally
recommend trying something like clojure or Racket first to get the
fundamentals down before digging into something truly mind-bending like
Haskell. Scala is probably not a big enough leap to discipline yourself to
grok functional just b/c it's so easy to relapse into non-functional patterns
and Scala will happily comply.

~~~
chalst
For correctness, both Haskell and Erlang have Quick Check, which is simply
fantastic. Quick Check is somewhat easier to use with stateless code, which
favours Haskell over Erlang.

<http://www.cse.chalmers.se/~rjmh/QuickCheck/>
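
To show why stateless code is so convenient to test this way, here is a tiny base-only sketch of the idea. It is not QuickCheck: instead of random generation and shrinking it simply tries every small list exhaustively, and the names `smallLists`, `counterexamples` and the two properties are made up for illustration. A property of a pure function is just a function returning `Bool`, so checking it needs no setup or teardown.

```haskell
import Control.Monad (replicateM)

-- Every list of length 0..n drawn from the given alphabet.
smallLists :: Int -> [a] -> [[a]]
smallLists n alphabet = concatMap (\k -> replicateM k alphabet) [0 .. n]

-- Inputs (lists of up to 4 elements from {0,1,2}) that falsify a property.
counterexamples :: ([Int] -> Bool) -> [[Int]]
counterexamples prop = filter (not . prop) (smallLists 4 [0, 1, 2])

-- A property that holds: reversing twice is the identity.
propReverseTwice :: [Int] -> Bool
propReverseTwice xs = reverse (reverse xs) == xs

-- A deliberately false property: reversing once changes nothing.
propReverseOnce :: [Int] -> Bool
propReverseOnce xs = reverse xs == xs
```

`counterexamples propReverseTwice` is empty, while `head (counterexamples propReverseOnce)` is `[0,1]`. With the real library the properties would be written the same way and handed to its checker, which generates the inputs for you.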

~~~
gtani
For more correctness, scala has had interop hiccups

<http://lampsvn.epfl.ch/trac/scala/ticket/2991>

[http://stackoverflow.com/questions/3313929/how-do-i-disambig...](http://stackoverflow.com/questions/3313929/how-do-i-disambiguate-in-scala-between-methods-with-vararg-and-without)

(And does Orig. Commenter have examples of clojure hiccups? Interop is _very_
important to both, and to F#.) Maybe a more meaningful compare/contrast is
haskell, scala and F# (which I can't do).

<http://matt.might.net/articles/best-programming-languages/>

---------------

<http://james-iry.blogspot.com/2010/05/types-la-chart.html>

(his league table of "Most Powerful Type Systems": Agda, epigram, scala,
haskell)

------
astrofinch
Hm, this isn't such a good recommendation.

<http://snapframework.com/>

(Right now there's an internal server error.)

~~~
Argorak
Maybe you sent it the wrong type of request :).

