
LALRPOP, an LR(1) parser generator for Rust - brson
http://smallcultfollowing.com/babysteps/blog/2015/09/14/lalrpop/
======
moomin
Is there any particular reason these days to favour a parser generator over a
straight-up backtracking parser combinator library like attoparsec?

Seems like the whole reason for LALR(1) parsers was performance, and parsing
is no longer as significant an amount of time compared to the old days, while
helpful feedback, which any parser generator approach tends to be bad at, is
at a premium.

~~~
jimrandomh
Parsing speed is no longer important inside of compilers, but it still matters
inside of text editors and IDEs. In those contexts, you have to choose between
the extra complexity of managing parser generation and the extra complexity of
avoiding too many reparses.

~~~
wtetzner
Unfortunately, you can't really use a parser generator for editors and IDEs,
unless they're structured editors. If they're not structured editors, then the
parser has to be able to gracefully handle broken parses.

~~~
ltratt
Incremental parsing is designed for just this use case. The last major work in
the area that I know of is Tim Wagner's PhD thesis
[http://www.cs.berkeley.edu/Research/Projects/harmonia/papers...](http://www.cs.berkeley.edu/Research/Projects/harmonia/papers/twagner-thesis.pdf)

------
Coding_Cat
It might be nice for us less formally-trained persons to include a small
description of what an LR(1) parser is. I tried looking it up on Wikipedia, but
the article isn't that clear either IMHO.

~~~
ChuckMcM
Here is a reasonable link:
[http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demyst...](http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html)

~~~
e12e
Thank you. I found that to be a perfect refresher before reading the article
in this story.

Does anyone know if there's a plan to move Rust's own parser to a Rust-native
tool? Right now it apparently uses ANTLR4 along with some custom Rust code:

[https://github.com/rust-lang/rust/tree/master/src/grammar](https://github.com/rust-lang/rust/tree/master/src/grammar)

~~~
kzrdude
That's just an alternative implementation of Rust's grammar for verification
purposes. Rust's parser is entirely hand-coded inside rustc (in Rust).

~~~
e12e
Ah, I thought that was something new. I remember reading something about a
manual parser a while back. Thank you for clarifying.

Would there be a benefit to moving rustc from a hand-coded parser to something
like the project in this story? IIRC the rationale for hand-coding the parser
was mostly speed (and a wish to avoid dependencies external to Rust)?

~~~
kzrdude
Seeing this bug filed today (rather interesting/weird breaking change needed
to fix it), it seems plain sanity should prefer a generated parser.

[https://github.com/rust-lang/rust/issues/28777](https://github.com/rust-lang/rust/issues/28777)

I'm not a compiler hacker, so I don't know how to weigh it really though.

------
simplify
How do LR(1) parsers compare to PEG?

~~~
wcrichton
LR(1)-parseable context free grammars are more convenient to write than
parsing expression grammars in my experience, partially because PEGs are
completely unambiguous. PEG parsers often exist just because it's easier to
implement them. Also, LR(1) parsers and LALRPOP generally operate on token
streams and not plain strings, whereas PEG parsers need to encode lexing
within the grammar, which is a pain. So if you were writing a compiler, I
would recommend using an LR(1) parser over a PEG parser.
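
As a rough illustration of the token-stream point, here is a minimal Python sketch of a separate lexing pass. The token names (`NUM`, `IDENT`, `OP`, `SKIP`) and the `tokenize` helper are made up for this example and are not LALRPOP's API. The idea it shows is that whitespace is dropped once, up front, so the grammar that consumes the tokens never has to mention it.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/()=]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a string into a list of (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("x = 40 +  2"))
# [('IDENT', 'x'), ('OP', '='), ('NUM', '40'), ('OP', '+'), ('NUM', '2')]
```

A scannerless PEG would instead need an explicit whitespace rule threaded between the elements of every production.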

~~~
chrisseaton
Your points aren't incorrect, but I'd consider many of your negative points to
be positives.

I'd say PEGs are easier to write as they're unambiguous. It's easy to
understand what they do. I work on languages for a living, and I still have
trouble wrapping my head around what an LR parser is doing.

LR parsers often exist just because that's how we've done things for decades
and it's what everyone knows from their university language courses.

LR parsers operate on a stream of tokens, so you have to force a distinction
between lexing and parsing, which isn't always natural and just seems like
unnecessary complication.

If you were writing a compiler, I'd recommend a PEG.

~~~
haberman
> I'd say PEGs are easier to write as they're unambiguous.

While that is technically true, it doesn't solve the actual issue of
ambiguity, it just defines it away.

If you have a case in your language where two syntax rules both match, it's
confusing to users because you have to arbitrarily decide that one of them is
correct. The best-known example of this is the dangling else ambiguity:
[https://en.wikipedia.org/wiki/Dangling_else](https://en.wikipedia.org/wiki/Dangling_else)
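
To make the dangling else concrete, here is a small Python sketch that enumerates every parse of a deliberately ambiguous toy grammar. The grammar, the `parse_stmt` name, and the tuple-shaped trees are assumptions for illustration only, not taken from any real language.

```python
def parse_stmt(tokens, pos):
    """Yield every (tree, next_pos) parse of a statement starting at pos.

    Toy grammar (deliberately ambiguous):
        stmt -> 'if' NAME 'then' stmt
              | 'if' NAME 'then' stmt 'else' stmt
              | NAME
    """
    if pos >= len(tokens):
        return
    if tokens[pos] == "if":
        if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
            return
        cond = tokens[pos + 1]
        for then_branch, after_then in parse_stmt(tokens, pos + 3):
            # Alternative 1: the 'if' has no else clause.
            yield ("if", cond, then_branch), after_then
            # Alternative 2: the 'if' claims the following else clause.
            if after_then < len(tokens) and tokens[after_then] == "else":
                for else_branch, end in parse_stmt(tokens, after_then + 1):
                    yield ("if", cond, then_branch, else_branch), end
    elif tokens[pos] not in ("then", "else"):
        yield tokens[pos], pos + 1


tokens = "if a then if b then x else y".split()
full_parses = [tree for tree, end in parse_stmt(tokens, 0) if end == len(tokens)]
for tree in full_parses:
    print(tree)
# ('if', 'a', ('if', 'b', 'x'), 'y')    <- else attached to the outer if
# ('if', 'a', ('if', 'b', 'x', 'y'))    <- else attached to the inner if
```

Both trees consume the whole input, so the grammar alone does not say which `if` owns the `else`.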

Sure, PEGs by definition are unambiguous, but only because they arbitrarily
decide that the first option always "wins." The language itself might still be
ambiguous, but you aren't aware of it because the parser always chooses the
first option.

Put another way, with PEGs you never know if:

    a -> b / c;

is equivalent to:

    a -> c / b;

You also don't know if there are rules that are entirely unreachable.
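
A minimal Python sketch of ordered choice shows both effects. The `literal` and `choice` helpers are toy combinators written for this example, and the choice of `b = "a"` and `c = "ab"` is an assumption picked to make the difference visible.

```python
# Toy PEG matchers: each takes (text, pos) and returns the new position on
# success or None on failure. All names here are illustrative.

def literal(s):
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match


def choice(*alternatives):
    # PEG ordered choice: commit to the first alternative that succeeds.
    def match(text, pos):
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return match


b = literal("a")
c = literal("ab")

a_bc = choice(b, c)   # a -> b / c
a_cb = choice(c, b)   # a -> c / b

print(a_bc("ab", 0))  # 1: b matches first, so c is never tried
print(a_cb("ab", 0))  # 2: c matches first and consumes both characters
```

In `a -> b / c` the alternative `c` is unreachable, because every input that `c` accepts is already accepted by `b`, and nothing in the tool reports that.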

Besides this, packrat parsing (the PEG parsing algorithm) is significantly
more expensive than LL/LR parsing. Packrat parsing takes O(input length)
memory -- significantly more than the O(tree depth) space of LL/LR.
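
The memory difference can be sketched with a toy memoizing recognizer in Python. This is not a full packrat parser: it handles one rule, `S <- ('(' S ')')*`, and the `parse_balanced` name is made up for this example. The memo table stands in for packrat's per-position table, and the recursion depth stands in, loosely, for the stack an LL/LR parser would keep.

```python
def parse_balanced(text):
    """Recognise S <- ('(' S ')')* and report (matched_all, memo_size, max_depth)."""
    memo = {}                    # packrat-style table: start position -> end position
    stats = {"max_depth": 0}

    def s(pos, depth):
        stats["max_depth"] = max(stats["max_depth"], depth)
        if pos in memo:
            return memo[pos]
        start = pos
        while pos < len(text) and text[pos] == "(":
            inner = s(pos + 1, depth + 1)
            if inner >= len(text) or text[inner] != ")":
                break            # this repetition fails; stop before its '('
            pos = inner + 1
        memo[start] = pos
        return pos

    end = s(0, 1)
    return end == len(text), len(memo), stats["max_depth"]


print(parse_balanced("()" * 1000))   # (True, 1001, 2)
print(parse_balanced("((()))"))      # (True, 4, 4)
```

For the flat input the nesting depth stays at 2 while the memo table grows to 1001 entries, one per position where `S` was attempted.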

I talk more about these issues in my blog article:
[http://blog.reverberate.org/2013/09/ll-and-lr-in-context-why...](http://blog.reverberate.org/2013/09/ll-and-lr-in-context-why-parsing-tools.html)

~~~
craftkiller
Hey, thanks for the great blog posts but your blogging platform is frustrating
to use on mobile because you have the swipe left/right to change articles in
addition to code blocks that don't wrap. I'm having about a 10% success rate
on horizontal scrolling in the code blocks on Chrome on Android, and the other
90% of the time I'm unintentionally loading a different article. Just something to
consider. Personally I'd drop the swipe navigation since I'm far more likely
to use a search bar or index to find posts rather than flip through like a
magazine. Thanks again, you did a better job explaining LL and LR than any of
my professors did.

~~~
haberman
Thanks for the feedback and I'm so sorry it's so frustrating. I just tweaked
my Blogger theme to just use the desktop theme on mobile -- hopefully this
will be an improved experience!

~~~
craftkiller
thanks!

------
xigency
Sounds fun.

