
Language.js - A fast PEG parser written in JavaScript - ryannielsen
https://github.com/tolmasky/language
======
tolmasky
This just got pushed literally today as I gave a presentation on it at
CappCon, wasn't expecting it to get posted here, README and so forth are
coming.

EDIT: OK, there is a README up, but again, this is not officially "launched"
or anything, I was planning on putting the docs and website together this
week, so check back later.

~~~
ryannielsen
Sorry for the pre-announce, then! I was intrigued by the tweets of others at
CappCon, and this is the first I've ever heard of a PEG... it felt like
something others on HN would like to know about. I'm very interested in
language.js and Obj-J 2.0, and look forward to hearing more about both.

------
samstokes
The "naughty OR" operator, and placing syntax error nodes in the syntax tree,
are awesome ideas. I hope someone steals them and adds them to Parsec :)

------
geuis
Can someone summarize what PEG is? I read through the Wikipedia page but can't
make heads or tails of it. What is this used for?

~~~
tptacek
PEG is a different way of specifying grammars and a different process for
parsing them. If you're used to Bison/Yacc or LALR parsers modeled on Yacc
(like Racc), the big things you notice with any PEG parser are:

* You automatically get "EBNF"-style operators like "zero-or-more" or "one-or-more" instead of having to specify recursive nonterminals and epsilons.

* You typically don't need a separate lexer; PEG parser productions resemble regular expressions.

* PEGs use prioritized choice to deal with ambiguities such as dangling-else. PEG grammars are unambiguous.

PEG parsers are much easier to build and work with than Yacc-style parsers.
The learning curve on PEG is also way, way shorter than for shift-reduce
parsers. You should probably be using PEG parsers whenever possible now.
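
To make the points above concrete, here is a minimal sketch of PEG-style combinators in Python. It is not taken from Language.js or any other library; the names `lit`, `seq`, `choice`, and `many` are hypothetical helpers invented for illustration. It shows the zero-or-more operator, matching directly on characters with no separate lexer, and prioritized choice, where the first alternative that matches wins.

```python
from typing import Callable, Optional

# A parser takes (text, pos) and returns the position after the match,
# or None if it does not match at pos. All names here are illustrative.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the literal string s (no separate lexer needed)."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Match each parser in order; fail if any of them fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers: Parser) -> Parser:
    """Prioritized choice: the first alternative that matches wins."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def many(p: Parser) -> Parser:
    """Zero-or-more: apply p greedily until it fails or stops consuming."""
    def parse(text: str, pos: int) -> Optional[int]:
        while True:
            end = p(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return parse


digit = choice(*[lit(c) for c in "0123456789"])
number = seq(digit, many(digit))  # one-or-more digits

# The order of alternatives decides the result, so there is no ambiguity:
short_first = choice(lit("a"), lit("ab"))
long_first = choice(lit("ab"), lit("a"))
```

On the input `"ab"`, `short_first` stops after `"a"` while `long_first` consumes both characters. The grammar author resolves the conflict by ordering the alternatives, and the parser never reports it.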

~~~
haberman
For a counterpoint:

* you can get "EBNF"-style operators using traditional LL parsers like ANTLR.

* Likewise, top-down parsers can have integrated lexing also.

* The downside of prioritized choice is that you don't get any insight into what inherent ambiguities your grammar has. The dangling-else ambiguity in C (for example) is a real ambiguity that you have to tell your users about. Prioritized choice hides ambiguities and gives you no hint when you develop your grammars where the ambiguities are.

* PEG-based parsers are less efficient than LL parsers. LL parsers are O(n) in time and O(d) in space (where n is the length of the input and d is the depth of the nesting). To parse a PEG in O(n) time the space complexity becomes O(n) (this is a packrat parser). To parse with O(d) space, time complexity becomes O(n^d) (as with lpeg). And the constant factor for LL parsing is much lower than with packrat parsing.

I think the future is LL. When the tooling gets good enough with LL, I think
that the primary motivation for PEGs (that they are easy to use) will
evaporate.
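
The time/space trade-off for packrat parsing can be sketched as follows. This is an illustrative toy in Python, not how lpeg or any real packrat parser is implemented; the grammar, `make_parser`, and the call counter are assumptions made up for the example. The grammar is `Expr <- Term '+' Expr / Term` and `Term <- '(' Expr ')' / digit`. Without memoization, the second alternative of `Expr` re-parses the same `Term` from scratch. With a memo table keyed by (rule, position), each rule runs at most once per position, at the cost of a table that grows with the input.

```python
from typing import Callable, Dict, Optional, Tuple


def make_parser(text: str, memoize: bool):
    """Build a toy PEG parser for:
         Expr <- Term '+' Expr / Term
         Term <- '(' Expr ')' / digit
    Returns (expr, calls, memo). Every name here is hypothetical."""
    memo: Dict[Tuple[str, int], Optional[int]] = {}
    calls = {"term": 0}  # how many times the Term rule body actually runs

    def rule(name: str, fn: Callable[[int], Optional[int]]):
        def wrapped(pos: int) -> Optional[int]:
            if not memoize:
                return fn(pos)
            key = (name, pos)
            if key not in memo:  # packrat: one result per (rule, position)
                memo[key] = fn(pos)
            return memo[key]
        return wrapped

    def expr_body(pos: int) -> Optional[int]:
        # First alternative: Term '+' Expr
        end = term(pos)
        if end is not None and text.startswith("+", end):
            rest = expr(end + 1)
            if rest is not None:
                return rest
        # Second alternative: Term (backtracks and parses Term again)
        return term(pos)

    def term_body(pos: int) -> Optional[int]:
        calls["term"] += 1
        if text.startswith("(", pos):
            inner = expr(pos + 1)
            if inner is not None and text.startswith(")", inner):
                return inner + 1
            return None  # the digit alternative cannot match "(" either
        if pos < len(text) and text[pos].isdigit():
            return pos + 1
        return None

    expr = rule("expr", expr_body)
    term = rule("term", term_body)
    return expr, calls, memo


def count_term_calls(text: str, memoize: bool) -> int:
    """Parse text and report how often the Term rule body ran."""
    expr, calls, _ = make_parser(text, memoize)
    assert expr(0) == len(text), "input should parse completely"
    return calls["term"]


nested = "(" * 10 + "1" + ")" * 10
naive_calls = count_term_calls(nested, memoize=False)
packrat_calls = count_term_calls(nested, memoize=True)
```

For this particular grammar and input, the backtracking version's work doubles with every extra level of parentheses, while the memoized version runs the `Term` body once per position and keeps one table entry per rule and position it visits.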

~~~
tptacek
I knew someone would call me on this and chose my words carefully --- "LALR
parsers modeled on Yacc" --- but am glad for the response. ANTLR is a pain in
the ass to integrate; you can get a totally respectable PEG parser in 2k lines
of C. What would you use for LL parsing?

~~~
haberman
I'm working on making my ideal tool: <http://www.reverberate.org/gazelle/>

It's been a long time coming, but I believe very strongly in it.

"pain in the ass to integrate" -- I can sympathize. I'm working tirelessly to
make this as easy to integrate as regexes are in Perl, Python, Ruby, etc.

------
aeontech
Great idea, but... ungoogleable name choice. Definitely look forward to
hearing more about this though.

~~~
kree10
Nondescript names are de rigueur for JavaScript-related projects. See also:
prototype, node, underscore, backbone, processing, reflection, glow...

------
abraham
From Google Chrome after following the link on GitHub:

"The website at languagejs.com has been reported as a “phishing” site.
Phishing sites trick users into disclosing personal or financial information,
often by pretending to represent trusted institutions, such as banks."

~~~
tlrobinson
Odd, it's just a parked GoDaddy page.

------
notJim
Could really use a readme. Languagejs.com goes to a GoDaddy landing page.

I guess this is what PEG is:
<http://en.wikipedia.org/wiki/Parsing_expression_grammar>

------
sjs
Just a note: Francisco didn't release this. He just pushed it to GitHub and
others have posted it here. I don't think he even mentioned it on Twitter;
it's a glimpse of something he's hacking on.

------
ltamake
Pretty cool stuff. Shame I don't have much of a use for it. :(

------
aashay
No README?

