
Translating a C++ parser to Haskell - yomritoyj
http://www.haskellforall.com/2017/06/translating-c-parser-to-haskell.html
======
Veedrac
The C++ code here is exactly the kind of thing that game developers complain
about when they say C++ encourages inefficiency. Calling Haskell fast because
it merely draws level with that code doesn't really make sense in this
context. Yes, it's as fast as C++ code that doesn't care in the slightest
about efficiency, but people use C++ for the times when they _do_.

E: As some quick examples, peeking an istream in a loop is extremely
inefficient[1], yet is done pervasively. The Derivation type has a bunch of
separately allocated strings, and a lot of types like std::map<string,
DerivationOutput> are pointer filled. To check whether a string starts with
"r:", they allocate a _new string_ and compare it, rather than just doing a
normal string comparison.

[1] [https://godbolt.org/g/27o9od](https://godbolt.org/g/27o9od)
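
The prefix-check pattern described above can be sketched as follows (function
names are illustrative, not from the Nix source):

```cpp
#include <cassert>
#include <string>

// The wasteful pattern: build a temporary std::string just to test a prefix.
bool startsWithAlloc(const std::string& s, const std::string& prefix) {
    return s.substr(0, prefix.size()) == prefix;  // allocates a new string
}

// Allocation-free alternative: compare the prefix in place.
bool startsWithCompare(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}
```

Both return the same answer; the second just avoids constructing and
destroying a heap-allocated temporary on every call.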

~~~
Gabriel439
What is the reason that `peek` is inefficient? Is it just because it is a
function call that is not inlined or is there another reason?

~~~
Veedrac
Really it's the other way around: peek is complex so the compiler didn't
inline it. In essence, you do a virtual call which I think winds you up in

[https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/libstdc%2B%2B-v3/include/bits/istream.tcc#L620](https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/libstdc%2B%2B-v3/include/bits/istream.tcc#L620)

which itself redirects to a whole bunch of stuff.

The virtual call alone is a lot more expensive than a simple pointer
dereference, and though it's possible proper inlining might have alleviated
some of this problem, you're not particularly likely to see it come close to
something that's efficient by design.
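
To make the cost concrete, here is a minimal sketch contrasting the two
access patterns (function names are illustrative):

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// istream::peek() goes through rdbuf()->sgetc(), which may hit a virtual
// underflow() call per character, plus sentry construction in peek() itself.
int countDigitsStream(std::istream& in) {
    int n = 0;
    while (std::isdigit(in.peek())) { in.get(); ++n; }
    return n;
}

// Scanning a pre-read buffer is a plain array access per character, which
// the compiler can inline and vectorize.
int countDigitsBuffer(const std::string& buf) {
    int n = 0;
    while (n < (int)buf.size() && std::isdigit((unsigned char)buf[n])) ++n;
    return n;
}
```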

~~~
pjmlp
What does the profiler say regarding overall program execution time?

~~~
Veedrac
No idea, I haven't timed it.

~~~
pjmlp
So not relevant.

------
pjmlp
Nice article, just one small pedantic remark.

> Note that Haskell type synonyms reverse the order of the types compared to
> C++.

This is true in the context of the article when compared with the presented
_typedef_ definitions, however the modern way to do type synonyms in C++ is
via _alias declarations_ (`using`), which are very similar to the Haskell ones.

    
    
    using Path = string;
    using PathSet = set<Path>;

~~~
lordvon
"Using" also allows you to have templated aliases, unlike typedefs. There's
really no reason to use typedefs.
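
A templated alias is only expressible with `using`; with `typedef` you need
the old wrapper-struct workaround. A quick sketch (names are illustrative):

```cpp
#include <map>
#include <string>

// Alias template: only possible with `using` (C++11 and later).
template <typename V>
using StringMap = std::map<std::string, V>;

// The pre-C++11 typedef workaround: wrap the typedef in a struct.
template <typename V>
struct StringMapT { typedef std::map<std::string, V> type; };

StringMap<int> direct;           // used directly
StringMapT<int>::type indirect;  // needs the ::type indirection
```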

~~~
jchw
Well, of course, there is ONE reason: Having to use old compilers or old
standards for other reasons.

I imagine a lot of people are in that place with C++. Personally I've followed
C++ for a long time as a hobby and it seems like it's finally getting to where
it wants to be. But I'll talk to someone who works writing C++ and the idea of
being able to use C++17 is a distant dream.

For Open Source developers and hobbyists though, you might as well take
advantage of whatever features Clang+GCC+MSVC offer. I wish someone would
standardize `#pragma once` so I don't have to feel guilty using it.
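
For anyone stuck on old compilers or avoiding the non-standard pragma, the
portable equivalent is the classic include guard; a minimal sketch (the guard
macro name is illustrative):

```cpp
// Contents a header would carry with a traditional include guard; with
// `#pragma once` the three guard lines collapse into a single pragma.
#ifndef MYLIB_H
#define MYLIB_H
inline int answer() { return 42; }
#endif

// A second inclusion of the same header is harmless: MYLIB_H is already
// defined, so this whole block is skipped by the preprocessor.
#ifndef MYLIB_H
#define MYLIB_H
inline int answer() { return 0; }  // never compiled
#endif
```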

~~~
lordvon
Yeah so the only reason you wouldn't use 'using' is when you can't :)

Modern C++ is awesome. I definitely appreciate being able to use C++14 where
I work (as well as 'pragma once'). I really look forward to being able to use
C++17.

------
Peaker
Attoparsec is fast, but used to have bad error messages.

How do the two parsers compare w.r.t. error messages?

Haskell has Parsec/Megaparsec which have better error messages, but are
extremely slow.

~~~
kccqzy
I don't think Parsec is "extremely slow." It's not as fast as attoparsec, but
not bad either.

Trifecta has the best error messages out of the box, but it takes some more
work to make them truly great.

~~~
mrkgnao
Trifecta has poor documentation. I've been thinking about reading the Idris
parser to learn trifecta/parsers.

~~~
axman6
Trifecta is best used via the parsers library, which is fairly well
documented. This also (in theory) allows you to switch the underlying parser
between trifecta, parsec, attoparsec and even ReadP. I say in theory because
the semantics of these libraries differ.

------
logicchains
Note to the author, I found this post extremely hard to read on mobile. I had
to scroll horizontally as the text wouldn't all fit on screen, and zooming
seemed broken. When attempting to zoom or horizontally scroll it would
sometimes randomly click some invisible popout that would take me to another
page.

~~~
mrkgnao
Blogger is actively hostile to phone browsers. For horizontal scrolling, you
can try dragging at an angle (to fool the page-change JS) and then vertically
scrolling to negate the vertical component of the first motion. It becomes
pretty natural after a while :/

