

Parsing with Haskell and Attoparsec - bhoflack
http://newartisans.com/2012/08/parsing-with-haskell-and-attoparsec/

======
lmm
So the Haskell version isn't actually fully parsing the data; it's returning a lazy
value that will do some of the work later? It's a perfectly good approach, but
it means the performance comparison isn't really fair - the Haskell approach
would slow down the rest of your program if you were actually using the value
you parsed.

Comparing the line count between a program that uses a library and one that
parses by hand is also somewhat unfair.

~~~
danieldk
He is only parsing the headers in the final Haskell version, and skipping the
body. In the C++ version, he also seems to check the MD5 and/or SHA1 hash (if
a hash of the body and OpenSSL with SHA1/MD5 support are both available).

So, it's not exactly clear to me whether the C++ and the Haskell versions are
still doing the same thing. Also, the C++ version parses everything by hand; it
would probably be a lot faster with a proper lexer (you usually can't beat
an optimized finite state machine ;)) and parser.

Unfortunately, the emphasis on performance distracts from the main point:
parsers written with attoparsec or parsec are usually understandable,
type-safe, and pretty, while offering performance close to that of a C/C++
version.
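
To make that concrete, here is a rough sketch of what such a parser tends to
look like. It is not the article's code: the `Name: value` header format, the
`Header` type and the `headers` parser are all made up for illustration, and
only the attoparsec combinators themselves are real.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Attoparsec.ByteString.Char8
  (Parser, char, endOfLine, many', parseOnly, skipWhile, takeTill, takeWhile1)
import Data.ByteString.Char8 (ByteString)

-- Hypothetical record for one "Name: value" header line.
data Header = Header ByteString ByteString
  deriving (Eq, Show)

-- One header: a non-empty name, a colon, optional spaces, the value,
-- then a line ending.
header :: Parser Header
header = do
  name  <- takeWhile1 (\c -> c /= ':' && c /= '\r' && c /= '\n')
  _     <- char ':'
  skipWhile (== ' ')
  value <- takeTill (\c -> c == '\r' || c == '\n')
  endOfLine
  return (Header name value)

-- Zero or more headers, terminated by an empty line.
-- Whatever follows (the body) is left unparsed.
headers :: Parser [Header]
headers = many' header <* endOfLine

main :: IO ()
main = print (parseOnly headers "Host: example.org\nLength: 42\n\nbody")
```

The type of `headers` already says what comes out (`[Header]`), and
`parseOnly` returns an `Either String [Header]`, so a malformed input shows up
as a `Left` rather than a crash.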

In other words, if you have enough time, you can probably come up with a C or
C++ parser that is as fast as or faster than a Haskell-based parser (assuming
some overhead from boxing/unboxing, garbage collection, harder optimization,
...), but would you care if the Haskell implementation had 80% of the
performance and was far more maintainable?

~~~
Evbn
You can't just make up a number, though. What if the Haskell version has 10%
of the performance and uses 10x the RAM?

------
batgaijin
If you have non-length-variable ASCII, would you still want to use Attoparsec
or Data.Binary?

