Hacker News new | past | comments | ask | show | jobs | submit login
Principled Procedural Parsing (2019) [pdf] (norswap.com)
18 points by todsacerdoti on May 14, 2021 | hide | past | favorite | 12 comments



Takes a bit before there is any mention of the thesis's contributions. Jump to section 1.3 on page 10 for a list of goals, then section 1.4 on page 13 for the approach to achieve them.


There is even an explicit section 1.6 "Overview & Contributions", but it's still difficult to discern what specific, scientific contributions were made in this dissertation. The most fruitful place to find clues is in the last paragraph on page 21 which mentions the previous publications “Parsing Expression Grammars Made Practical”, “Taming Context-sensitive Languages with Principled Stateful Parsing” and "Procedural Shift-Reduce Parsing”. So we would probably have to read these papers as well to find out what contribution is supposed to have been made here.


The tool, Autumn, is on GitHub. It is described as "A Java parser combinator library written with an unmatched feature set".

https://github.com/norswap/autumn


Unmatched in the land of Java. AFAICT from the examples it looks comparable to LPeg in usage and feature set. Unsurprisingly, its based on PEGs according to the documentation.

AFAICT, Autumn seems to support some kinds of memoization as a performance enhancement, and has a utility routine to synthesize left-associative rules. But LPeg has match-time captures which let you build context-sensitive grammars inline, such as for ubiquitous TLV (tag-value-length) formats. In terms of sheer utility the latter is far more powerful, IMO.

Nonetheless, Autumn is definitely one of the better parser combinators I've seen. Both Autumn and LPeg stand out because of how easy they make it to specify inline transformations. Most parser combinators make it easy to match and generate an AST, but you're literally given a raw match tree that you then need to transform into a data structure usable in your program. Not difficult, but neither is hand-rolling a parser. The point of combinators is to make this stuff easier and more declarative. Autumn (AFAICT) and LPeg actually help to simplify the other 50% of the work, not just the first 50%.


There also appears to be a rust version.

https://github.com/xurtis/autumn


> Autumn is a parser combinator framework written in Java that can be seen as an extensible superset of PEG. Autumn’s set of built-in parsers can be extended by the user2 using arbitrary Java code.

The user manual is here: https://github.com/norswap/autumn/blob/master/doc/README.md


Looking at the bibliography used it surprises me that CocoR https://ssw.jku.at/Research/Projects/Coco/ is missing, for me so far CocoR is the Lua equivalent for parsers, simply, small and relatively easy to understand, generates recursive descent parsers mostly LR(1) but we can have LR(k) when needed, it uses first and follow sets to detect ambiguities and has a simple EBNF grammar description.


Did you mean LL(1/k) and not LR? Looking at the site it was mentioning ll parsers


Yes, Coco/R generates parsers for LL(1) grammars. The focus of the present publication is apparently on parsing expression grammars and parser combinator frameworks instead.


Thank you for correcting my mistake, indeed LL(1/k).


He said recursive descent, so he meant LL I guess.


+1. I actually missed that in the comment. Also yeah I really hope it didn't sound like I was nitpicking as I was worried I might not have paid too much attention to detail to the page.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: