
What's in a Parser Combinator? (2016) - pythux
https://remusao.github.io/posts/whats-in-a-parser-combinator.html
======
ckok
Every few years I look at the latest parser combinators, parser generators, and
compiler compilers. And every time they seem to be lacking in some huge, vital
way (not always all of the following, but always at least one):

* Error handling always ends up being non-existent or of the quality of "begin, for, if, while, repeat, identifier, number, float expected" with no good way to override what happens

* Recovery is usually impossible

* Parser generator generates a full model that doesn't match what we need

* Working around the quirks of the input language ends up being trickier than hand-writing a parser (almost every language has some ambiguity)

* Slow: with ANTLR it's _really_ easy to make it do a gigantic amount of lookahead in complex languages, which isn't even really needed

I always end up going back to a simple hand crafted parser which is easier to
read and write.

~~~
bmichel
By curiosity, did you try Lezer?

[https://marijnhaverbeke.nl/blog/lezer.html](https://marijnhaverbeke.nl/blog/lezer.html)

~~~
carapace
That is keen!

------
salimmadjd
Graham Hutton recently did a tutorial on functional parsing via the
Computerphile YT channel [0].

He starts from scratch and explains the process of creating your own parsing
library.

The YT video description links to the full version of the library that he
starts building in his tutorial. It's the kind of elegant Haskell programming
that I don't think I'll ever achieve. [1]

[0][https://www.youtube.com/watch?v=dDtZLm7HIJs](https://www.youtube.com/watch?v=dDtZLm7HIJs)

[1][http://www.cs.nott.ac.uk/~pszgmh/Parsing.hs](http://www.cs.nott.ac.uk/~pszgmh/Parsing.hs)

~~~
jmeister
If you want more such elegant Haskell/math, check out Conal Elliott (conal.net)

------
anarchyrucks
Parsing from first principles by Saša Jurić [0] is a good video on writing a
parser combinator in Elixir.

[0] [https://youtu.be/xNzoerDljjo](https://youtu.be/xNzoerDljjo)

------
pubby
Parser combinators are just recursive descent. That's it. They're just
recursive descent - an idea that's been around since the 1960's.

A parser combinator library is just fancy marketing speak for "recursive
descent utility functions". All it is is a bag of commonly used patterns
wrapped up in generic functions.

The hoopla is overblown.
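The claim is easy to check concretely: a hand-written recursive-descent rule and its combinator version do the same work. A small sketch in Python (all names illustrative, not from any particular library):

```python
# Hand-written recursive descent: one or more 'a's followed by 'b'.
# Returns (matched_text, next_index) or None on failure.
def parse_hand(s, i=0):
    start = i
    while i < len(s) and s[i] == 'a':
        i += 1
    if i == start or i >= len(s) or s[i] != 'b':
        return None
    return (s[start:i] + 'b', i + 1)

# The same rule with "recursive descent utility functions": each
# combinator is just a pattern factored out of hand-written code.
def lit(c):
    def p(s, i):
        return (c, i + 1) if i < len(s) and s[i] == c else None
    return p

def plus(p):  # one or more repetitions of p
    def q(s, i):
        r = p(s, i)
        if r is None:
            return None
        vals = []
        while r is not None:
            v, i = r
            vals.append(v)
            r = p(s, i)
        return (''.join(vals), i)
    return q

def seq(p, q):  # p then q, concatenating results
    def r(s, i):
        a = p(s, i)
        if a is None:
            return None
        v1, i = a
        b = q(s, i)
        if b is None:
            return None
        v2, i = b
        return (v1 + v2, i)
    return r

parse_comb = seq(plus(lit('a')), lit('b'))
```

Both versions accept and reject the same strings; the combinator form just names the patterns (repetition, sequencing) instead of inlining them.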

~~~
fanf2
And Pratt parsers are just precedence climbing, but I’ve rewritten unifdef’s
precedence climbing expression evaluator in Pratt style and it has turned out
much better, because of the way Pratt structures the technique so neatly. So
the key feature of parser combinators is a neat and tidy way to structure a
recursive descent parser, and the key feature of monadic parser combinators is
to get extra neatness from Haskell’s categorical types.
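For readers unfamiliar with the Pratt structure: each operator carries a binding power, and a single loop plus one recursion handles every precedence level. A minimal sketch in Python (binding powers and token format are illustrative):

```python
# Minimal Pratt-style expression evaluator over a token list of
# numbers and operator strings. Binding powers are illustrative.
BINDING = {'+': 10, '-': 10, '*': 20, '/': 20}

def evaluate(tokens, pos=0, min_bp=0):
    """Parse and evaluate starting at tokens[pos], consuming only
    operators with binding power >= min_bp. Returns (value, next_pos)."""
    value = tokens[pos]  # "null denotation": here, just a number
    pos += 1
    while pos < len(tokens):
        op = tokens[pos]
        if BINDING[op] < min_bp:
            break
        # "left denotation": consume the operator, then a right operand
        # that binds tighter (bp + 1 makes the operator left-associative).
        rhs, pos = evaluate(tokens, pos + 1, BINDING[op] + 1)
        if op == '+':
            value += rhs
        elif op == '-':
            value -= rhs
        elif op == '*':
            value *= rhs
        else:
            value /= rhs
    return (value, pos)
```

The neatness fanf2 describes is visible here: adding an operator means adding one table entry, not another layer of mutually recursive functions.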

~~~
pubby
I wish more tutorials started with "Parser combinators are a neat and tidy way
to structure a recursive descent parser", because I 100% agree. Mostly, my
disagreements are with how parser combinators are presented and taught,
obscuring the core ideas with a focus on language features and trivialities.

To me, the monad/applicative stuff is a red herring. It's mostly used to
simulate imperative sequencing; e.g. the Haskell code `Person <$> parseName
<*> parseAddress` would be `return Person { parseName(), parseAddress() };` in
C. There are a few tricks, but it's not crucial to the parser combinator idea
and doesn't help readability.
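That correspondence can be made concrete: the applicative operators just thread the leftover input from one parser to the next, and the result combiner is an ordinary function. A Python sketch (all names illustrative):

```python
# Each parser: str -> (value, rest_of_input) or None. Names illustrative.
def word(w):
    def p(s):
        return (w, s[len(w):]) if s.startswith(w) else None
    return p

def lift2(f, p, q):
    """The applicative pattern, i.e. `f <$> p <*> q`: run p, feed its
    leftover input to q, combine both results with f."""
    def r(s):
        a = p(s)
        if a is None:
            return None
        v1, s1 = a
        b = q(s1)
        if b is None:
            return None
        v2, s2 = b
        return (f(v1, v2), s2)
    return r

parse_name = word("ada")
parse_age = word("36")
person = lift2(lambda n, a: {"name": n, "age": a}, parse_name, parse_age)
```

The hidden state-threading (`s` to `s1` to `s2`) is exactly what imperative sequencing gives you for free, which is pubby's point.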

------
fuckface123
Gracefully handling errors and giving useful output is the hardest part of
writing parsers. Why does everybody skip over that part?? :)

~~~
lalaithion
You can get parsing stacktraces by changing the definition of Parser to

        data Parser a c = Parser { runParser :: String -> Either [c] (a, String) }

and adding a combinator

        withContext c p = Parser $ \s -> case runParser p s of
          Left stackTrace -> Left (c : stackTrace)
          Right x         -> Right x

This allows you to write stuff like

        number   = withContext "parsing a number" $ ...
        addition = withContext "parsing addition expression" $ ...
        expr     = withContext "parsing a mathematical expression" $ ...

Combine that with a technique that keeps track of where you are in the string
when failure occurs, and you can pretty-print that to something like:

Failed to parse!

        2 + 34.0O4
                ^

While: parsing a number

While: parsing addition expression

While: parsing a mathematical expression
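The same context-stack idea, with position tracking folded in, can be sketched as runnable Python (exceptions stand in for the `Either`; all names illustrative):

```python
# Parsers take (s, pos) and return (value, next_pos), or raise
# ParseError carrying the failure position and a context stack.
class ParseError(Exception):
    def __init__(self, pos, context):
        super().__init__()
        self.pos = pos
        self.context = context

def with_context(label, p):
    """Tag failures inside p with a human-readable label."""
    def q(s, pos):
        try:
            return p(s, pos)
        except ParseError as e:
            e.context.append(label)  # innermost label ends up first
            raise
    return q

def char(c):
    def p(s, pos):
        if pos < len(s) and s[pos] == c:
            return (c, pos + 1)
        raise ParseError(pos, [])
    return p

def pretty(s, err):
    """Point at the failure position and list the context stack."""
    lines = ["Failed to parse!", "",
             "    " + s,
             "    " + " " * err.pos + "^", ""]
    lines += ["While: " + c for c in err.context]
    return "\n".join(lines)
```

As the error propagates outward, each `with_context` layer appends its label, so the printed trace reads from the innermost rule to the outermost, as above.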

~~~
mathgladiator
So, one of the key things I see is that the constructed abstract syntax tree
must contain the raw tokens, so all other layers have full awareness of the
source.

I feel the criticism is valid because most parsers in production are written
by hand, without tooling or fancy techniques.

~~~
Quekid5
> most parsers in production are written by hand, without tooling or fancy
> techniques.

What? Do you have any data you'd like to share?

I'm given to understand that e.g. the C++ compilers usually have a hand-coded
parser, but AFAIUI that's mostly due to the complexity of actually parsing C++
(and fitting that into anything other than just raw code).

~~~
mathgladiator
Only anecdotal observations over the years.

[https://www.drdobbs.com/architecture-and-design/so-you-want-to-write-your-own-language/240165488](https://www.drdobbs.com/architecture-and-design/so-you-want-to-write-your-own-language/240165488)

A common theme is that the parser generator does not provide the tools to
write high-quality error messages. Having used ANTLR and other tools, I
believe it now that I'm trying to ship a real language.

~~~
Quekid5
You're talking about parser _generators_... the article is about parser
_combinators_. Fully agreed that parser generators are often very limited and
require hacks to do anything non-trivial.

(IME Megaparsec basically solves most of these issues.)

