
Packrat Parsing (2002) - tosh
https://pdos.csail.mit.edu/~baford/packrat/thesis/
======
fovc
Alan Kay's group used packrat parsing with a slight modification to handle
left recursion in their STEPS project. It lives on as Ohm (1).

At some point I tried to use a similar idea to build an incremental parser
generator for Emacs (2). It's currently abandonware, but I think it has
potential!

(1) [https://ohmlang.github.io/](https://ohmlang.github.io/)

(2) [https://github.com/felipeochoa/mole](https://github.com/felipeochoa/mole)

~~~
breatheoften
Ohm is really a nice system — I’m using it to process text extracted from pdf
documents.

Lots of regular expressions were used before and without control of the
documents (which change over time) this kind of code was just unmanageably
difficult to maintain.

Being able to use negative look ahead with lists of grammar symbols provides
an incredibly large amount of power to resolve ambiguities and makes it
relatively easy to arrive at an explainable grammar capable of handling “every
sample we’ve seen so far drawn from an only partially observable language” ...
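The negative-lookahead idea above can be sketched with a few toy combinators. This is a hypothetical illustration, not Ohm's actual API: the classic PEG rule "an identifier is any word that is *not* on a list of keywords", where the excluded list of symbols resolves the ambiguity.

```python
def lit(s):
    # Match the literal string s at pos; return the new position or None.
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*ps):
    # Match each parser in order, threading the position through.
    def run(text, pos):
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return run

def choice(*ps):
    # PEG ordered choice: first alternative that matches wins.
    def run(text, pos):
        for p in ps:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return run

def neg(p):
    # PEG negative lookahead "!p": succeeds, consuming nothing, iff p fails.
    return lambda text, pos: pos if p(text, pos) is None else None

def letter(text, pos):
    return pos + 1 if pos < len(text) and text[pos].isalpha() else None

def word(text, pos):
    end = pos
    while end < len(text) and text[end].isalpha():
        end += 1
    return end if end > pos else None

# Keyword <- ('if' / 'while' / 'return') !Letter
keyword = seq(choice(lit("if"), lit("while"), lit("return")), neg(letter))

# Identifier <- !Keyword Word
identifier = seq(neg(keyword), word)
```

Here `identifier` accepts "iffy" (the trailing `!Letter` keeps "if" from matching inside it) but rejects "while" outright.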

------
jws
As someone who learned parsing in the ‘80s I find Packrat Parsing to be an
interesting example of invalidated assumptions.

Working memory was on the megabyte scale in the mid '80s, plus or minus an
order of magnitude. Any parsing algorithm that could require memory
proportional to input size (like packrat parsing with memoization) was
nothing more than a thought experiment.

Slide forward to the 2000’s and Packrat Parsing can be a practical system for
some scenarios.

Now, from the ~2020 perspective: packrat parsing and parser combinator
libraries are trivial to use, possibly consuming uncounted mountains of RAM.
But our machines have 10,000 times more RAM than in the '80s, and it just
doesn't matter whether you can prove that your parsing algorithm's upper
bound on memory use is a function of parse tree depth or of input length.
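The memory cost being described is the memo table: packrat parsing memoizes every (rule, position) pair, so worst-case memory grows with input length times the number of rules. A minimal sketch, using a made-up toy grammar (Expr <- Num ('+' Expr)?, Num <- [0-9]+):

```python
def parse(text):
    memo = {}  # (rule_name, pos) -> (value, next_pos) or None

    def rule(name, fn, pos):
        # Packrat core: each rule is tried at most once per position;
        # the result (success or failure) is cached in the memo table.
        key = (name, pos)
        if key not in memo:
            memo[key] = fn(pos)
        return memo[key]

    def num(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    def expr(pos):
        left = rule("num", num, pos)
        if left is None:
            return None
        val, pos2 = left
        if pos2 < len(text) and text[pos2] == "+":
            rest = rule("expr", expr, pos2 + 1)
            if rest is not None:
                return (val + rest[0], rest[1])
        return (val, pos2)

    result = rule("expr", expr, 0)
    return result[0] if result and result[1] == len(text) else None
```

The memoization is what buys the linear-time guarantee even under heavy backtracking, and the memo table is exactly the input-proportional memory that ruled the technique out on '80s hardware.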

~~~
vidarh
From my perspective, the main objection I have to packrat parsing is that
it's just not necessary for most common parsing problems. Parser combinators
are a completely different issue - parser combinators work just fine on tiny
systems.

But PEGs for most languages translate trivially into recursive descent
parsers with limited lookahead. In most languages where there's a risk of
significant backtracking, it can usually be minimized by relatively simple
restructuring of the grammar, without any need to resort to packrat parsing.
I'm sure there are cases tricky enough that a packrat parser simplifies
things substantially, but I just haven't come across one where it's
particularly compelling.
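The translation being described can be sketched directly. Assume a made-up grammar where the alternatives of `Value <- List / Number` are distinguished by their first character; then no backtracking is needed at all, and the PEG becomes plain recursive descent with one character of lookahead:

```python
def parse_value(text, pos):
    # Value <- List / Number -- decided by one character of lookahead,
    # so the ordered choice never has to backtrack.
    if pos < len(text) and text[pos] == "[":
        return parse_list(text, pos)
    return parse_number(text, pos)

def parse_number(text, pos):
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None

def parse_list(text, pos):
    items = []
    pos += 1  # consume '['
    while pos < len(text) and text[pos] != "]":
        item = parse_value(text, pos)
        if item is None:
            return None
        items.append(item[0])
        pos = item[1]
        if pos < len(text) and text[pos] == ",":
            pos += 1
    if pos >= len(text):
        return None  # unterminated list
    return (items, pos + 1)  # consume ']'
```

When alternatives aren't prefix-disjoint like this, left-factoring the grammar usually restores the property, which is the "simple restructuring" mentioned above.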

~~~
blihp
While you are correct that packrat parsers aren't often strictly necessary,
they are nice to have in the sense that you don't need to worry about
restructuring/optimizing your grammar. Just write it in the most
straightforward way and use it. This is very helpful when prototyping and you
are trying out different ideas. You can worry about optimizing it later when
you finalize your approach rather than up-front on a parser that will very
likely be significantly altered or thrown away. So the argument in favor of
packrat parsers can also be about trading execution efficiency for developer
productivity, rather than strict necessity.

~~~
wahern
PEGs are _already_ so simple and easy to use that as a practical matter you
don't gain that much from relying on packrat parsing. LPeg, the standard PEG
library in Lua, doesn't implement packrat and I don't recall anybody ever
complaining about that. In fact, LPeg provides exceptionally useful
extensions, like match-time captures, that I think might be incompatible with
packrat memoization. Those features make it possible to write grammars for
things like ASN.1 BER (with a TLV syntax), which otherwise would be impossible
(or at least impractical) for PEGs to parse.
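To make the TLV point concrete, here is a Python analogue of what a match-time capture enables (this is an illustrative toy encoding, not real BER and not LPeg's actual API): a record is a tag byte, a length byte, then exactly `length` value bytes, so the parser must inspect a value it just matched to decide how much more input to consume.

```python
def parse_tlv(data, pos=0):
    """Parse one toy TLV record; return ((tag, value), next_pos) or None."""
    if pos + 2 > len(data):
        return None
    tag, length = data[pos], data[pos + 1]
    # "Match-time" decision: the already-matched length byte determines
    # how many value bytes follow. The match at this position depends on
    # captured data, not just on (rule, position), which is why features
    # like this sit awkwardly with packrat memoization.
    if pos + 2 + length > len(data):
        return None
    value = data[pos + 2 : pos + 2 + length]
    return ((tag, value), pos + 2 + length)

def parse_tlv_stream(data):
    records, pos = [], 0
    while pos < len(data):
        r = parse_tlv(data, pos)
        if r is None:
            return None
        records.append(r[0])
        pos = r[1]
    return records
```

In LPeg the same effect is achieved inside the grammar itself via match-time captures, rather than in a hand-written loop as here.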

