Hacker News new | past | comments | ask | show | jobs | submit login

Your description of these two algorithms sounds accurate; I think your point of confusion is with the term "deterministic."

The fact that packrat parsing is based on search-and-backtrack by definition means it is not deterministic. Deterministic parsers never backtrack, because they are designed have enough information to always make the correct choice, by using lookahead.

One important distinction between the two is that a parser based on search returns the first parse tree it finds, but that might not be the only possible parse tree. If you have a well-defined order for the alternatives, then it will always return the same parse tree, but the grammar could still be ambiguous and capable of returning other parse trees also. PEGs sidestep this issue by defining away this ambiguity, and saying that the first parse tree is the ONLY parse tree -- this is the difference between PEGs and context-free grammars.

With deterministic parsers, on the other hand, the process they use to build their tables ensures that the grammar is not ambiguous. So when you get a parse tree back from a deterministic parser, you know it's the only possible parse tree for that input text.




Interesting, thank you for clearing that up for me.

Just to make sure I understand: you could view a PEG as having an ambiguity that is resolved by implementation.

It's a small and subtle difference that both LL(* ) and PEG grammars are by definition unambiguous and produce only one tree; they just accomplish it in different ways.


> Just to make sure I understand: you could view a PEG as having an ambiguity that is resolved by implementation.

I wouldn't quite put it that way, because PEGs are defined abstractly, not in terms of any particular parsing method. For example, lpeg (http://www.inf.puc-rio.br/~roberto/docs/peg.pdf) parses PEGs also, but does not use packrat parsing to do so.

> It's a small and subtle difference that both LL(* ) and PEG grammars are by definition unambiguous and produce only one tree; they just accomplish it in different ways.

Yes, though I will editorialize a bit here and say that PEG's method of defining away ambiguity makes it impossible to know when there are true ambiguities in your grammar. With PEGs, if you have:

a -> b / c;

...it is undecidable whether this is the same as:

a -> c / b;

Take a common ambiguity like the if-then-else problem (http://en.wikipedia.org/wiki/Dangling_else). This is a real ambiguity that will affect how programs in your language are interpreted. But a PEG based tool will never tell you that an ambiguity exists, because to a PEG there is no ambiguity. It gives you no help in understanding whether any of your language's productions are ambiguous to a user.


> I wouldn't quite put it that way, because PEGs are defined abstractly, not in terms of any particular parsing method.

I thought one of the core things in PEGs was ordered choice, which implies execution order. Packrat parsing is just a space vs. speed tradeoff, converting super-linear time (thanks to backtracking) into linear time with linear space (thanks to memoization). Are there PEGs that have unordered choice? (lpeg doesn't)

> It gives you no help in understanding whether any of your language's productions are ambiguous to a user.

Point taken.


> Are there PEGs that have unordered choice? (lpeg doesn't)

No, PEGs are defined in terms of ordered choice. All I'm saying is that they are not defined in terms of one particular algorithm that implements ordered choice. That's why I objected to the characterization that PEGs have their ambiguity resolved "by implementation." That's inaccurate; their ambiguity is resolved by definition, not by implementation. PEGs are unambiguous even if there is no parsing implementation.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: