
Show HN: Chevrotain – Fault-Tolerant JavaScript Parsing DSL - bd82
https://github.com/SAP/chevrotain
======
IvanK_net
I am familiar with PEG.js, which is basically the same thing but has existed for
6 years now. Does Chevrotain have any advantages other than being about 2 times
faster?

~~~
bd82
Hello.

I am the author behind Chevrotain.

I would say the two major functional advantages (as already mentioned in this
thread) are Error Recovery and Performance.

An online playable demo of error recovery can be found here:
[http://sap.github.io/chevrotain/playground/?example=tutorial...](http://sap.github.io/chevrotain/playground/?example=tutorial%20fault%20tolerance)

Performance in the jsperf benchmark depends on the JavaScript engine used.
On the latest Chrome (V8), and thus Node.js, which is the most common runtime,
Chevrotain is roughly 3 times faster in the provided jsperf benchmark:
[http://jsperf.com/json-parsers-comparison/21](http://jsperf.com/json-parsers-comparison/21)

Shahar.

~~~
IvanK_net
What does "error recovery" mean in your software? You usually want the parser
to accept the input when it conforms to the grammar and reject it when it
does not.

I think the user should modify their grammar in order to accept errors (e.g. the
missing colon in your example). If you do it for them, then you must specify
what you mean by tolerable and intolerable errors. I think your specification
may get very large and confusing.

EDIT: I just found out what you mean by "error recovery". However, I still
think that the user should specify accepted errors inside the grammar, so they
have more control over it. Your "error recovery" can never be prepared for all
the kinds of errors that a user may want to recover from.
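
A minimal sketch of this grammar-level approach, assuming a hand-written recursive descent rule for a single `key : value` pair rather than PEG.js or Chevrotain. The names `Token`, `Diagnostic`, `PairResult` and `parsePair` are hypothetical and exist only for illustration. The colon is made optional in the rule itself, and its absence is recorded as a diagnostic instead of aborting the parse.

```typescript
// Hypothetical token and diagnostic shapes, not taken from any library.
type Token = { type: "string" | "colon" | "number"; text: string };
type Diagnostic = { message: string; index: number };

interface PairResult {
  key: string;
  value: number;
  diagnostics: Diagnostic[];
}

// Rule: pair := string colon? number
// The "?" is the deliberate grammar extension: a missing colon is accepted
// but reported, so the grammar author decides which errors are tolerable.
function parsePair(tokens: Token[]): PairResult {
  const diagnostics: Diagnostic[] = [];
  let pos = 0;

  const key = tokens[pos];
  if (key === undefined || key.type !== "string") {
    throw new Error("expected a string key at token " + pos);
  }
  pos++;

  const maybeColon = tokens[pos];
  if (maybeColon !== undefined && maybeColon.type === "colon") {
    pos++; // the well-formed case
  } else {
    diagnostics.push({ message: "missing ':' after key", index: pos });
  }

  const value = tokens[pos];
  if (value === undefined || value.type !== "number") {
    // Any error not listed in the grammar is still a hard failure.
    throw new Error("expected a number value at token " + pos);
  }

  return { key: key.text, value: Number(value.text), diagnostics };
}
```

Only the one error spelled out in the rule is tolerated here; every other malformed input still fails, which is the control over accepted errors argued for above.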

BTW. PEG.js can generate a parser "object", which is directly usable in the
same JS "process". I have never seen any parser code while working with
PEG.js.

BTW2. Generating the parser code can also be very useful when you don't want
to distribute your grammar alongside your app.

~~~
bd82
The use case of only handling valid input is a very common one, particularly
during compiler construction.

But what about the use case of building an editor (IDE)? In this context it
would be highly beneficial to be able to handle invalid inputs, as the user would
still expect outline, navigation, formatting, etc. to work even when there are
syntax errors.

Error recovery algorithms are heuristics; they can never handle all
possible kinds of errors. Automatic error recovery is also not mutually
exclusive with expanding the grammar to support some "tolerable" errors. Both
can be used to improve the "robustness" of the grammar when fault tolerance is
needed.

Also note that expanding the grammar will work best for simple common errors
(missing semicolon), but what about when you need to re-sync an unknown number
of tokens and unwind the parser call stack?
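
To make the re-sync case concrete, here is a minimal sketch of panic-mode recovery in a hand-written recursive descent parser. It is not Chevrotain's API or its actual algorithm; the toy grammar, the `Tok` and `Outcome` types and the `parseProgram` function are all assumptions made for illustration. A failed `expect` throws, which unwinds the call stack back to the statement loop, and the loop then skips an unknown number of tokens up to the next `;` before resuming.

```typescript
// Hypothetical token shape; tokens are plain strings to keep the sketch short.
type Tok = string;

interface Outcome {
  statements: string[]; // names declared by well-formed statements
  errors: string[];     // one message per statement that had to be skipped
}

// Toy grammar: program := statement* ; statement := "let" IDENT "=" NUMBER ";"
function parseProgram(tokens: Tok[]): Outcome {
  const out: Outcome = { statements: [], errors: [] };
  let pos = 0;

  // Consume one token matching `test`, or throw to unwind to the nearest catch.
  function expect(test: (t: Tok) => boolean, what: string): Tok {
    const t = tokens[pos];
    if (t === undefined || !test(t)) {
      throw new Error("expected " + what + " at token " + pos);
    }
    pos++;
    return t;
  }

  function statement(): string {
    expect(t => t === "let", "'let'");
    const name = expect(t => /^[a-z]+$/.test(t) && t !== "let", "identifier");
    expect(t => t === "=", "'='");
    expect(t => /^[0-9]+$/.test(t), "number");
    expect(t => t === ";", "';'");
    return name;
  }

  while (pos < tokens.length) {
    try {
      out.statements.push(statement());
    } catch (e) {
      // For brevity, every thrown value is treated as a syntax error.
      out.errors.push(e instanceof Error ? e.message : String(e));
      // Re-sync: skip tokens up to and including the next ";".
      while (pos < tokens.length && tokens[pos] !== ";") pos++;
      if (pos < tokens.length) pos++;
    }
  }
  return out;
}
```

With the input `let a = 1 ; let = 2 oops ; let c = 3 ;`, the second statement is reported and skipped while `a` and `c` are still recognized, which is the behavior an editor outline would rely on.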

Also note these "heuristics" are not "mine"; they are based on the heuristics
used in ANTLR 3, which are themselves based on academic studies on this topic.
Specifically:

Josef Grosch. Efficient and comfortable error recovery in recursive descent
parsers. Structured Programming, 11(3):129–140, 1990.

Rodney W. Topor. A note on error recovery in recursive descent parsers.
SIGPLAN Not., 17(2):37–40, 1982.

Niklaus Wirth. Algorithms + Data Structures = Programs. Prentice Hall PTR,
Upper Saddle River, NJ, USA, 1978.

------
ivan_ah
This looks very polished and potentially useful.

Could you point us to examples of different grammar files?

How hard would it be to make a LaTeX parser based on Chevrotain?

~~~
bd82
Thanks for the compliment.

I still need to produce more examples (contributions are welcome :) ). There is
one bigger (but incomplete) example of ECMAScript 5.1:
[https://github.com/SAP/chevrotain/blob/master/examples/types...](https://github.com/SAP/chevrotain/blob/master/examples/typescript_ecma5/src/ecmascript5_parser.ts)
It is written in TypeScript.

About LaTeX: I'm not familiar with LaTeX. However, if I understand correctly, it
is based on TeX, which is not a context-free language.

This means it can't be parsed with standard parsing tools.

See details here: [http://tex.stackexchange.com/questions/4201/is-there-a-bnf-g...](http://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-of-the-tex-language)

