

Jison: Bison in JavaScript - mnemonik
http://zaa.ch/past/2009/12/28/jison_an_api_for_creating_parsers_in_javascript_/

======
scottdw2
I'm going to start a flame war here... I guess.

The thing about bottom up parser generators, is that they really aren't all
that useful.

There's a dirty little secret in compiler development: most "real world"
compilers use hand written top-down recursive decent parsers. They don't use
Bison.

The classical argument in favor of using bottom up parsing is that it can
recognize a larger class of grammars and is often more efficient than top down
parsing.

This, however, ignores one huge point. Parser generators are leaky
abstractions. They almost always suck at error recovery. This is particularly
true for bottom up parsers because they don't have a lot of contextual
information. Error recovery is extremely important for several reasons:

1\. Parsers with bad error recovery give shitty error messages.

2\. Things like intellisence, which always operate on syntactically incorrect
code, won't work without good error recovery.

Some people have done research on automatically injecting error recovery into
bottom-up parsers. These systems, however, are always based on heuristics, and
supporting one set of scenarios well always leaves another set of scenarios
not supported well. Those are exactly the kind of situations where you need
humans to tune things.

The problem with parser generators is that their input languages are not
turing complete (by obvious necessity). That means when the parser generator
does something wrong, you can't always fix it. Most abstractions in computer
science are not like this. They involve one turning complete programming
language being compiled down into a lower level turning complete programming
language. When abstractions leak, you can always code around the leakage.

But, with a parser generator, you can't do this. If you want to implement
robust error recovery, then you have to add explicit productions to the
grammar to handle them. That becomes extremely tedious extremely quickly.

That means to create a real parser, for use in interacting with humans, you
need one of two things:

1\. A customized parser generator, tuned for the exact parser you are
generating.

2\. A parser written by hand.

Item #1 isn't really practical. That leaves number 2 as the obvious best
choice.

Writing bottom up parsers by hand is a giant PITA. That's why people invented
parser generators, so they wouldn't have to spend an eternity hand crafting
tables.

Writing top down parsers by hand, however, is really easy and intuitive. You
also get full support of most code browsing tools and debuggers, so they are
easy to change, even as the grammar gets large. They are also easy to read.

Finally, because they are written in a Turing complete programming language,
you can easily (and with minimal code) inject what heuristics you need, in
exactly the place where you need them.

Because of this, hand written recursive decent parsers are almost always the
best engineering choice, despite all the other tradeoffs.

Despite all the proofs of the superiority of LR parsers over LL parsers, I
would take a human, armed with a specific grammar and a for loop, over Bison
any day.

~~~
Dobbs
You mention specifically that it this is true for compilers. What about
smaller applications. From my browsing of the FreeBSD tree most programs that
have a syntax use yacc/lex to parse the input. Awk for example, among others.

As you are much more experienced in this area, any opinions?

~~~
scottdw2
You should be weary of using Lex for unicode, because the tables can get
large.

As far as Bison goes, I guess that's up to you. My preference is against it,
mainly because I don't think it buys you much (if the grammar is small, the
code for the parser should be small to).

The biggest driver should be user experience. If you care a lot about how
errors are reported (i.e. you want to avoid cascading errors, while trying to
report as many errors as possible up front) , then I say you should definitely
not use bison.

If you don't care about error reporting much, then its really just based on
your personal preference.

------
pufuwozu
If only I found this yesterday! I was looking for a parser generator that
output JavaScript last night. I settled on:

<http://jscc.jmksf.com/>

And started writing a small Lisp programming language this morning. I like
this Jison grammar syntax a lot better so I think I'll have to start porting
it.

Thanks for posting this!

------
liuliu
The work here is actually useful in real world. For example, if a jison
frontend deployed as a grammar checker for code snippet share site or more
practical a SQL query web-interface, it can provide a more responsive
interactive input area which eliminates the communications to sever-side for
grammar errors.

