

Adventures in Parsec - Making it go faster with attoparsec - shintoist
http://variadic.me/posts/2012-02-25-adventures-in-parsec-attoparsec.html

======
joeyh
shintoist, I'm glad to read these. Your target audience is not just people who
know Haskell but have not built anything yet. Plenty of stuff can be built
without using a parser, and of course that tends to evolve into code that
should probably be using a parser here and there, but isn't. :) Add to that
RWH documenting only parsec, and barely being published in time to show
applicative use of it, and there's some need for newer examples.

I wish your types were more expressive. Using "" to cover up a failure, rather
than using Maybe, is somewhat of a code smell for me. Similarly, if I were
parsing these logs I'd want numeric byte sizes, not strings.
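
A minimal sketch of what that could look like, assuming the attoparsec and
bytestring packages. The field shape (a decimal size, or "-" when no size was
logged) is an assumption borrowed from common access-log formats rather than
taken from the article, and the names byteSize and parseByteSize are
hypothetical.

```haskell
import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8 (Parser, char, decimal, endOfInput, parseOnly)
import qualified Data.ByteString.Char8 as B

-- Hypothetical field: the response size in bytes, or "-" when the server
-- logged no size. A missing size becomes Nothing instead of an empty string,
-- and a present size becomes an Int instead of a string of digits.
byteSize :: Parser (Maybe Int)
byteSize = (Nothing <$ char '-') <|> (Just <$> decimal)

-- Run the parser on a whole field, rejecting trailing garbage.
parseByteSize :: B.ByteString -> Either String (Maybe Int)
parseByteSize = parseOnly (byteSize <* endOfInput)
```

With this, parseByteSize (B.pack "-") gives Right Nothing, and a malformed
field surfaces as a Left rather than silently turning into "".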

Finally, I was personally thrown for a loop when you talked about using
Data.Map, but threw in a Data.HashMap, apparently for performance reasons.
Particularly because you continually insert values into the map, which is bad
news for Data.Map (should use fromListWith when using it), but is apparently
ok with HashMap, although no better than just using Data.HashMap.fromListWith.
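
For reference, the two styles being compared can be sketched as follows,
using Data.Map.Strict from the containers package. The Hit type and both
function names are hypothetical, and the sketch only shows that the two
approaches build the same map; it does not measure which one is faster.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Hypothetical input: (key, byte count) pairs already extracted from log lines.
type Hit = (String, Int)

-- One insertWith per element, threading the map through a strict left fold.
totalsByInsert :: [Hit] -> M.Map String Int
totalsByInsert = foldl' (\m (k, n) -> M.insertWith (+) k n m) M.empty

-- The same aggregation expressed as a single fromListWith call.
totalsByFromList :: [Hit] -> M.Map String Int
totalsByFromList = M.fromListWith (+)
```

Data.HashMap.Strict exposes insertWith and fromListWith under the same names,
so the same two definitions should carry over with only the import changed.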

~~~
carterschonwald
It's probably worth augmenting the tutorial to mention that you're using the
HashMap (Data.HashMap.Strict) from the unordered-containers package, rather
than any of the others, as it's the most performant of the versions available
and thus the one folks should use (amongst the hashmap implementations).

~~~
shintoist
Absolutely, I did experiment with
<http://hackage.haskell.org/package/hashtables>, but did not notice enough of
a difference in performance to decide it was worth using in this case,
especially with the extra visual overhead of the state monad. I had expected
it to be much faster, but I would probably chalk that up to me making mistakes
in how to use it (iirc I wrote my own insertWith).

~~~
carterschonwald
I think the hashtables package has a somewhat different performance and use
case profile than the functional hash map data structure. Also, I believe you
need to spend more effort tuning which type of hashtable and which hash
function you choose, based on your use case and blend of operations. Some of
the hashtable options regress to linear time insertions and lookups under
certain use cases and load factors.

------
tikhonj
On some of the simpler functional programming articles, people are always
complaining that the examples are trivial. It's nice to see a nontrivial
article on the front page :).

------
msds
Ah, attoparsec is such a nice package! Some minor quibbles:

Why aren't you using the various combinators from Control.Applicative to build
your parsers?

Could your log files possibly contain non-ASCII characters? In that case, use
the attoparsec module specialized for Data.Text values.
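
A minimal sketch of the Text variant, assuming the attoparsec and text
packages. The Char8 ByteString parsers treat each byte as one character, so
multi-byte UTF-8 input is not decoded there, whereas Data.Attoparsec.Text
works on already-decoded Text. The line shape (a user name, a space, a byte
count) and the name userAndSize are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text (Parser, char, decimal, endOfInput, parseOnly, takeWhile1)
import Data.Text (Text)

-- Hypothetical line shape: a user name, a single space, then a byte count.
-- The name may contain any characters other than a space, ASCII or not.
userAndSize :: Parser (Text, Int)
userAndSize = do
  name <- takeWhile1 (/= ' ')
  _ <- char ' '
  size <- decimal
  endOfInput
  pure (name, size)
```

The parser is run with parseOnly userAndSize on a Text value, typically one
obtained by decoding the raw bytes of the log file as UTF-8 first.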

~~~
shintoist
I didn't use applicative because I wasn't comfortable with it at the time. You
can write some pretty nice looking code with applicative, but you also have to
get used to it first. My hope is that by sticking to do-notation more people
will be able to read it =)
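
For readers weighing the two styles, here is the same small parser written
both ways, as a sketch assuming the attoparsec and bytestring packages. The
Entry record and its fields are hypothetical and are not the types used in
the article.

```haskell
import Data.Attoparsec.ByteString.Char8 (Parser, char, decimal, parseOnly)
import qualified Data.ByteString.Char8 as B

-- Hypothetical record: an HTTP status code and a byte count, space separated.
data Entry = Entry { status :: Int, bytes :: Int } deriving (Eq, Show)

-- do-notation: each intermediate result is named and sequenced explicitly.
entryDo :: Parser Entry
entryDo = do
  s <- decimal
  _ <- char ' '
  b <- decimal
  return (Entry s b)

-- Applicative style: the same parser built with <$>, <*> and <*.
-- The result of char ' ' is discarded by <*.
entryAp :: Parser Entry
entryAp = Entry <$> decimal <* char ' ' <*> decimal

-- Sample input that both parsers accept.
sample :: B.ByteString
sample = B.pack "200 2326"
```

Both parseOnly entryDo sample and parseOnly entryAp sample produce the same
Entry, so the choice between them is one of readability rather than behavior.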

