Hacker News new | past | comments | ask | show | jobs | submit login

In a PEG you operate under the expectation that all rules will be tried exhaustively, so if one fails you just move on to the next, and if they all fail you drop back a level of recursion.

Thus you can have rules like:

"if <if-expr>" where if-expr can contain "<statements> else" or "<statements> if" or "then <statements>", where statements is a set of predefined keywords, another set of PEG rules, or a regular expression matcher.

And then you can just expect that they'll all be tried at some point.

The main issue with this, of course, is the one you raised, with endings and creating rules for delimiting new blocks. Although it's possible to define a good set of rules, it's still hard to distinguish between a continuation of a statement and the start of a new one, and in most cases you won't get a useful syntax error message using a naive PEG system like the one in PEG.js, because it'll just drop all the way to the top level and then fail at the first character without telling you how far/deep it got before the syntax broke.

So in practice having end symbols is still useful to make the grammar operate sanely. PEG can also be used to execute a relatively flat pass that mostly produces tokens and simple expressions, while other methods are applied to the resulting tree to get the complete structure. This is an approach which I find a little less mind-bending than trying to cram everything into PEG rules, since you can use varying methods of parsing as the situation dictates.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: