> PEG in Python is the latest, [...] The popularity of PEG baffles me. I guess i...

kd5bjo · on Nov 26, 2020

> Is there even a single benefit to using PEG?

In a formal sense, there are some non-context-free languages that PEG can recognize (A^n B^n C^n is the classic example). It’s an open question whether there are any context-free languages PEG isn’t capable of recognizing.
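
To make the A^n B^n C^n claim concrete, here is a minimal sketch of the usual textbook PEG for it, hand-translated into Python. The grammar is S <- &(A 'c') 'a'+ B !. with A <- 'a' A? 'b' and B <- 'b' B? 'c'. The function names `match_pair` and `recognize` are made up for illustration and do not come from any library.

```python
from typing import Optional


def match_pair(s: str, i: int, left: str, right: str) -> Optional[int]:
    """PEG rule P <- left P? right. Returns the end position, or None on failure."""
    if i < len(s) and s[i] == left:
        inner = match_pair(s, i + 1, left, right)
        # P? is greedy: once the inner P matches, we never retry without it.
        k = inner if inner is not None else i + 1
        if k < len(s) and s[k] == right:
            return k + 1
    return None


def recognize(s: str) -> bool:
    """S <- &(A 'c') 'a'+ B !."""
    # And-predicate: check a^n b^n followed by 'c' without consuming input.
    a_end = match_pair(s, 0, "a", "b")
    if a_end is None or a_end >= len(s) or s[a_end] != "c":
        return False
    # 'a'+ : skip the run of a's (the predicate guarantees at least one).
    i = 0
    while i < len(s) and s[i] == "a":
        i += 1
    # B must match b^n c^n and reach the end of input (the "!." part).
    return match_pair(s, i, "b", "c") == len(s)
```

The lookahead checks that the a's and b's balance, and B then checks that the b's and c's balance, which is what a single context-free grammar cannot do.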

Fundamentally, PEG is a formalization of the common practice of writing a recursive-descent parser: It defines languages based on an exemplar algorithm that parses them instead of the generative approach taken by BNF, which defines how you can enumerate the legal strings of the language. They’re both as mathematically rigorous as each other, but approach the problem space from opposite sides.

For BNF, ambiguity is a question of whether the generation process is reversible: An ambiguous grammar has multiple paths that produce the same string, which means that there’s no way to recover the path used to generate that string.
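
As a small illustration of "multiple paths that produce the same string", the sketch below counts the distinct derivation trees for the toy grammar E -> E '+' E | 'n'. Both the grammar and the helper name `count_parses` are assumptions chosen for the example.

```python
from functools import lru_cache


def count_parses(tokens: tuple) -> int:
    """Count derivation trees of `tokens` under E -> E '+' E | 'n'."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Base production E -> 'n' covers exactly one token.
        total = 1 if j - i == 1 and tokens[i] == "n" else 0
        # E -> E '+' E: try every '+' as the top-level split point.
        for k in range(i + 1, j - 1):
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parses(tuple("n+n")))    # one tree
print(count_parses(tuple("n+n+n")))  # two trees: (n+n)+n and n+(n+n)
```

Any string with more than one tree cannot be mapped back to a unique generation path, which is the sense of ambiguity described above.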

PEG is “unambiguous” because it’s not based on a generative model at all, so this definition is nonsensical. With PEG, we know, by definition, how any given string will parse but it can be hard to enumerate the strings that will parse in a given way. If PEG has an interesting notion of ambiguity (which it probably does), that’s the place to look for it.
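
A rough sketch of why the parse is fixed "by definition": PEG's choice operator is ordered and commits to the first alternative that succeeds. The tiny combinators below (`lit`, `choice`, `seq`, `end`) are hypothetical helpers written for this example, not any particular library's API.

```python
from typing import Callable, Optional

# A parser takes (input, position) and returns the new position or None.
Parser = Callable[[str, int], Optional[int]]


def lit(text: str) -> Parser:
    return lambda s, i: i + len(text) if s.startswith(text, i) else None


def choice(*alts: Parser) -> Parser:
    def parse(s: str, i: int) -> Optional[int]:
        for alt in alts:
            j = alt(s, i)
            if j is not None:
                return j  # commit: later alternatives are never tried
        return None
    return parse


def seq(*parts: Parser) -> Parser:
    def parse(s: str, i: int) -> Optional[int]:
        pos: Optional[int] = i
        for part in parts:
            pos = part(s, pos)
            if pos is None:
                return None
        return pos
    return parse


def end(s: str, i: int) -> Optional[int]:
    return i if i == len(s) else None


short_first = seq(choice(lit("a"), lit("ab")), end)  # ('a' / 'ab') !.
long_first = seq(choice(lit("ab"), lit("a")), end)   # ('ab' / 'a') !.

print(short_first("ab", 0))  # None: 'a' wins the choice, then end-of-input fails
print(long_first("ab", 0))   # 2
```

Every input has exactly one outcome, but the set of strings a rule accepts depends on the order of alternatives, which is why enumerating that set can be hard.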

PaulHoule · on Nov 26, 2020

PEG, for one thing, offers the possibility of composability.

For a language like Python or Java, adding an "unless(X) {Y}" equivalent to "if(!X) {Y}" shouldn't be much harder than it is in LISP, provided the compiler is structured in such a way that you can take "Java Compiler A" and specialize it to "Java Compiler B" by adding one term to the grammar and one AST transformation that rewrites the term.

Exclusive of POM files, import statements and other packing material, that should be less than 50 lines of code if we built compilers to be extensible.
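
To give a feel for how small the AST half of that could be, here is a toy sketch of the "unless" rewrite. The node classes (`Name`, `Not`, `Call`, `If`, `Unless`) and the `desugar` function are invented for illustration and do not correspond to any real Java or Python compiler's internals.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Name:
    id: str


@dataclass
class Not:
    operand: "Expr"


@dataclass
class Call:
    func: str


@dataclass
class If:
    test: "Expr"
    body: List["Stmt"]


@dataclass
class Unless:  # the one new term added to the grammar
    test: "Expr"
    body: List["Stmt"]


Expr = Union[Name, Not]
Stmt = Union[Call, If, Unless]


def desugar(node: Stmt) -> Stmt:
    """Rewrite every Unless(X, Y) into If(Not(X), Y), recursing into bodies."""
    if isinstance(node, Unless):
        return If(Not(node.test), [desugar(s) for s in node.body])
    if isinstance(node, If):
        return If(node.test, [desugar(s) for s in node.body])
    return node


print(desugar(Unless(Name("x"), [Call("y")])))
```

After this pass the rest of the hypothetical compiler never sees an `Unless` node, so nothing downstream has to change.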

The ArangoDB database has a kind of SELECT statement that returns a list of unordered dictionaries, which is frustrating if you want to plot the result in a pandas DataFrame. I wrote something in Python using PEG that parses an adb query, statically analyzes it, and determines the intended field order when it is definite.

I am up for any compiler technology that makes it easy to do that even if it means I don't get compilation as fast as Go.

Another issue that matters is error handling. Drools is a great rules engine on paper, but because it is composed with an external Java compiler, you can often get into cases where the error messages don't make sense at all. It isn't like the Jena rules engine, which is simple enough that I could get quick results looking at it in the debugger. That's a complicated example, but I think language users deserve better error handling and compiler authors need better tools to provide it.