
LL and LR Parsing Demystified (2013) - bleakgadfly
http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html
======
AceJohnny2
What about Earley's algorithm?

 _Unfortunately, given the extremely limited hardware of 1960s computers (not
helped by the lack of an efficient algorithm), the parsing of an arbitrary CFG
was too slow to be practical. Parsing algorithms such as LL, LR, and LALR
identified subsets of the full class of CFGs that could be efficiently parsed.
Later, relatively practical algorithms for parsing any CFG appeared, most
notably Earley's 1973 parsing algorithm. It is easy to overlook the relative
difference in performance between then and now: the fastest computer in the
world from 1964-1969 was the CDC6600 which executed at around 10 MIPS; my 2010
mobile phone has a processor which runs at over 2000 MIPS. By the time
computers had become fast enough for Earley's algorithm, LL, LR, and friends
had established a cultural dominance which is only now being seriously
challenged - many of the most widely used tools still use those algorithms (or
variants) for parsing. Nevertheless in tools such as ACCENT / ENTIRE and
recent versions of bison, one has access to performant parsers which can parse
any CFG, if that is needed._

from:
[http://tratt.net/laurie/blog/entries/parsing_the_solved_problem_that_isnt.html](http://tratt.net/laurie/blog/entries/parsing_the_solved_problem_that_isnt.html)

Featured here 6 years ago:
[https://news.ycombinator.com/item?id=2327313](https://news.ycombinator.com/item?id=2327313)

~~~
fernly
Thanks for the reference. For those like me who'd not heard of it, see [1] for
a description and links to many implementations.

[1]
[https://en.wikipedia.org/wiki/Earley_parser](https://en.wikipedia.org/wiki/Earley_parser)

~~~
AceJohnny2
I missed that haberman (author of this article) had also commented in that
thread extensively:

[https://news.ycombinator.com/item?id=2328627](https://news.ycombinator.com/item?id=2328627)

------
haberman
Happy to see this show up here again. It's one of my articles that I'm most
proud of.

I'm happy to answer any questions about it.

~~~
Shawnecy
> A planned future article will break open the black box for more details
> about the inner workings of these algorithms.

Was this article ever written?

~~~
haberman
Good question. Not so far. I can't quite see how to do it without getting
bogged down in details. The "traditional" algorithms and grammar classes
(Strong LL, Full LL, LR(0), etc.) are complicated, and yet few people use
these. The ones people actually do use (LL(*), ALL(*), GLR, IELR, LALR,
etc.) tend to be even more complicated.

Maybe the thing to do is to illustrate a few of the most basic algorithms by
example, just to give people a taste of what they look like.

~~~
devty
Out of curiosity, where would one find an explanation (potentially dry, long,
and complicated) of the algos you mention here?

In other words - how did you get to know them?

~~~
haberman
By far my favorite survey book on the subject is "Parsing Techniques: A
Practical Guide" by Grune and Jacobs.

Here is my Amazon review of the book:
[https://www.amazon.com/gp/customer-reviews/R17E19PSPM2UO9](https://www.amazon.com/gp/customer-reviews/R17E19PSPM2UO9)

~~~
jbn
This book seems to be available at :
[http://dickgrune.com/Books/PTAPG_1st_Edition/](http://dickgrune.com/Books/PTAPG_1st_Edition/)

~~~
wolfgke
This is the first edition. The current edition is the second edition.

------
nickpsecurity
People interested in these things should look up GLR and GLL. Pretty powerful
when I studied them.

~~~
wfunction
Funny, I knew GLR but I had never heard of GLL. Thanks for writing this!

~~~
nickpsecurity
Welcome. :) Here's the link to the first paper I saw, I think:

[http://dotat.at/tmp/gll.pdf](http://dotat.at/tmp/gll.pdf)

------
CalChris
tl;dr use ANTLR

But actually, I did read this excellent article for the second time. However,
unless you are skilled in the art, you should be using ANTLR4, Honey Badger.
Terence Parr deserves a Turing award.

For parsing, most upper-div compiler classes start off with CFGs, dip briefly
into LL recursive descent, and then conclude with LALR. If people remember
anything, it's SHIFT/REDUCE. Unfortunately, there don't seem to be any
standard tools for the LL(1) equivalent, FIRST/FOLLOW tables. And as the
article shows, they aren't equivalent. Each has strengths, but LALR has tools.

Except for ANTLR. Which started out as recursive descent ...
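
The FIRST-set construction behind those LL(1) FIRST/FOLLOW tables is small
enough to sketch by hand. Here's a rough Python version for a tiny, made-up
expression grammar (the grammar and names are illustrative, not from the
article); FOLLOW sets are built by a similar fixed-point pass over FIRST:

```python
# Sketch of the FIRST-set computation underlying LL(1) FIRST/FOLLOW
# tables. Keys of `grammar` are nonterminals; every other symbol is
# treated as a terminal. This toy grammar is a made-up example.

EPS = "eps"  # stands for the empty production

grammar = {
    "E":  [["T", "E'"]],              # E  -> T E'
    "E'": [["+", "T", "E'"], [EPS]],  # E' -> + T E' | eps
    "T":  [["id"]],                   # T  -> id
}

def first_sets(grammar):
    """Compute FIRST sets by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for sym in prod:
                    if sym == EPS or sym not in grammar:
                        # Terminal (or eps): it begins this production.
                        if sym not in first[nt]:
                            first[nt].add(sym)
                            changed = True
                        break
                    # Nonterminal: add its FIRST set (minus eps) ...
                    new = first[sym] - {EPS}
                    if not new <= first[nt]:
                        first[nt] |= new
                        changed = True
                    # ... and keep scanning only if it can derive eps.
                    if EPS not in first[sym]:
                        break
                else:
                    # Every symbol was nullable, so the whole
                    # production can derive eps.
                    if EPS not in first[nt]:
                        first[nt].add(EPS)
                        changed = True
    return first
```

For this grammar, FIRST(E) = {id}, FIRST(E') = {+, eps}, FIRST(T) = {id}.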

Parsing Techniques doesn't really have any competition. The compiler books,
like the compiler classes, can only give a little attention to parsing. Given
how strong the available tools are and how hard the remaining subjects are
(SSA, code generation, ...) maybe they have a point.

~~~
exDM69
There are plenty of parser generators other than ANTLR, and interesting
parsing techniques that fit the bill. Not that there's anything wrong with
ANTLR.

The last few times I've been working on programming language prototypes,
I've written the parser using Parsec parser combinators in Haskell. It's
super fast and easy to use; however, it's more like syntactic sugar for
recursive descent parsers than a rigorous parser generator that works for a
certain grammar class.
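
Combinators really are recursive descent made first-class. A toy sketch in
Python conveys the idea (the helper names `lit`, `seq`, and `alt` are
invented for this illustration; Parsec's real API differs):

```python
# Toy parser combinators: each combinator returns a function that takes
# the input text and returns (value, remaining_text) on success, or
# None on failure -- i.e., a hand-rolled recursive descent step.

def lit(s):
    """Match a literal string."""
    def parse(text):
        return (s, text[len(s):]) if text.startswith(s) else None
    return parse

def seq(*parsers):
    """Run parsers in order; succeed only if all succeed."""
    def parse(text):
        values = []
        for p in parsers:
            result = p(text)
            if result is None:
                return None
            value, text = result
            values.append(value)
        return values, text
    return parse

def alt(*parsers):
    """Try parsers in order, returning the first success."""
    def parse(text):
        for p in parsers:
            result = p(text)
            if result is not None:
                return result
        return None
    return parse

# Grammar: greeting -> ("hello" | "hi") " world"
greeting = seq(alt(lit("hello"), lit("hi")), lit(" world"))
```

`greeting("hi world")` succeeds with the matched pieces and empty remainder;
`greeting("bye world")` returns None. The "sugar" is that the grammar is
written as ordinary expressions instead of hand-written parse functions.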

------
cjhdev
I've learned a lot about LALR and GLR using Bison in combination with Ruby.

Bison and Flex are available as apt packages on Ubuntu, and there's heaps of
documentation.

The Ruby angle means you don't have to roll your own AST structures. Also,
Ruby's 'VALUE' type is easily passed around in a Bison grammar.

example:

[https://github.com/cjhdev/slow_blink/blob/master/etc/slow_blink/ext_schema_parser/parser.y](https://github.com/cjhdev/slow_blink/blob/master/etc/slow_blink/ext_schema_parser/parser.y)

------
viebel
I've copy/pasted the code of your reverse polish evaluator in Python into
this interactive blog post:
[http://blog.klipse.tech/python/2016/09/22/python-reverse-polish-evaluator.html](http://blog.klipse.tech/python/2016/09/22/python-reverse-polish-evaluator.html)
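
For readers who haven't seen the article, a reverse-polish evaluator is
short enough to sketch inline. This is a minimal Python version in the same
spirit, not the article's exact code:

```python
# Minimal reverse-polish (postfix) evaluator: operands are pushed on a
# stack; each operator pops its two arguments and pushes the result.

def eval_rpn(tokens):
    """Evaluate a list of RPN tokens, e.g. ["3", "4", "+"]."""
    ops = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b,
           "/": lambda a, b: a / b}
    stack = []
    for tok in tokens:
        if tok in ops:
            b = stack.pop()  # right operand was pushed last
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    (result,) = stack  # a well-formed expression leaves exactly one value
    return result
```

For example, `eval_rpn("3 4 + 2 *".split())` evaluates (3 + 4) * 2.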

------
dream-on
I fail to see the mystery. In my opinion, state machines are the most
straightforward parts of programming.

There is a learning curve with lex and yacc, but if you cannot ever use
these programs, can you really call yourself a programmer?

It's the endless search for a "new" programming language and the layers upon
layers of needless abstraction created by today's "programmers" that I find
mystifying.

~~~
UK-AL
Using lex and yacc is not the same thing as knowing how lex and yacc work.

~~~
dream-on
Very true. But for mere mortals, one thing at a time.

