Hacker News new | past | comments | ask | show | jobs | submit login
Pika parsing: reformulating packrat parsing as a dynamic programming algorithm (arxiv.org)
40 points by danny00 30 days ago | hide | past | favorite | 6 comments



Abstract makes this seem interesting, but the title is odd. packrat parsing is already a dynamic programming algorithm. The reformulation is to parse bottom-to-top, right-to-left.

bottom-to-top is fairly common for parsing (e.g. pretty much any LR based parser will do so), but right-to-left is not something I've encountered before. The very right-most terminal is the first thing matched with this technique.


I'm also a little unnerved by the author's dismissal of Earley as not practical, when Aycock & Horspool (who are big in this field) literally have a paper called "Practical Earley Parsing", which does exactly what it says on the tin.

The other thing I wasn't a fan of is he doesn't give an recursive descent (RD) algorithm, even though it's trivial to describe RD using a pair of functions (parse-alt; parse-rule) over a grammer. When RD is expressed as a grammar, almost every parsing algorithm falls out as a "memoization" hack for one part of the algorithm, or another.


I legit think that diagonalization is like the only mathematical trick in existence. Dynamic programming? Diagonalization. Reinforcement learning? Diagonalization. Ycombinator? Diagonalization.

https://twitter.com/shachaf/status/1183934586775957504


I'm not sure I understand diagonalization, but it does seem that the incompleteness theorem, halting problem, and Russell's paradox are essentially restatements of the liar paradox (which dates to antiquity.)

Which always puzzled me, because I was like "so what, is it supposed to be somehow surprising that circular dependencies can create the liar paradox (or some other infinite inference loop?)"


Cantor's diagonalization argument shows that uncountable sets exist, and they are different from countable sets. That was sufficiently counterintuitive that some mathematicians rejected it at first, until they warmed to the idea.

Your circular dependencies are metacircular rather than circular. That corresponds to circular if you can prove that truths in the system and its metasystem are equivalent, but that's often not so easy to show rigourously. (In other words, so you can be really really sure you haven't missed a subtle loophole). It's tempting to assume they are equivalent by intuition, but that assumption isn't correct for all logic systems.

Like many good theorems, the core idea is simple once explained.

G├Ádel's incompleteness theorem required the axioms and logical deduction rules of the arithmetical logic used at the time to be encoded using that very same arithmetic, and to prove that's possible, that it works, and it's consistent. That takes some rigour. Until it was done, it wasn't obvious you could definitely prove that it works, because system-metasystem relationships often have some subtle twist.

My favourite example of a subtle twist like that is Skolem's Paradox. Which relates back to diagonalization, and shows that maybe countability and uncountability aren't so simple after all.


I like the grammar, which makes it much easier than PEG. But there really needs to be a C implementation sooner or later.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: