
A quick intro to writing a parser with Treetop - aarongough
http://thingsaaronmade.com/blog/a-quick-intro-to-writing-a-parser-using-treetop.html
======
barrkel
This page would be more readable if there weren't a huge black bar overlaid
across the middle of it.

~~~
aarongough
Would you be able to tell me what your screen resolution is?

I decided to go with the bottom-attached menu because it makes navigation
painless. On the resolutions I've tested the site with, the difference between
the above-menu area and the menu area made reading seem fairly natural.

~~~
adbge
Even at 1680x1050, I find attached menus really grating. I think it's the
lost screen real estate because the larger they are, the more irritating I
find them. On the other hand, Twitter went ahead and implemented a top
attached menu, so maybe I'm in the minority.

Anyways, back on topic, do you (or any HNers) have any recommendations on how
to get started with formal grammars and parsers? Is there a canonical
introductory text?

~~~
silentbicycle
Niklaus Wirth's _Compiler Construction_ (free online,
<http://www-old.oberon.ethz.ch/WirthPubl/CBEAll.pdf>) has a good intro to parsing, though
for better or worse it's skewed towards recursive-descent parsing and has a
"hit the ground running / focus on theory later" style.

After that, you could dig deeper with Andrew Appel's _Modern Compiler
Implementation in {ML,C}_ books. The ML one is better, IMHO. Those cover other
methods (LL, LALR, SLR, etc.) in greater detail, and I'd also recommend either
for learning compilers in a heartbeat. (Appel's _Compiling with Continuations_
is also excellent, but doesn't cover parsing.)
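
For a feel of the recursive-descent style that the Wirth book leans on, here is a minimal sketch in Ruby (not taken from any of these books). The idea is one method per grammar rule, each calling the rules it depends on. The class name `TinyCalc` and the toy arithmetic grammar are made up for illustration, and the tokenizer silently skips characters it doesn't recognise, which a real parser shouldn't do.

```ruby
# A tiny recursive-descent parser/evaluator for integer arithmetic.
# Grammar (one private method per rule):
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := INTEGER | '(' expr ')'
class TinyCalc # hypothetical name, for illustration only
  def initialize(source)
    # Split the input into integer and single-character operator tokens.
    # Anything else (whitespace, letters) is skipped in this sketch.
    @tokens = source.scan(/\d+|[-+*\/()]/)
    @pos = 0
  end

  # Parse the whole input and return its integer value.
  def parse
    value = expr
    raise ArgumentError, "unexpected token #{peek.inspect}" if peek
    value
  end

  private

  # Look at the current token without consuming it (nil at end of input).
  def peek
    @tokens[@pos]
  end

  # Consume and return the current token.
  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  def expr
    value = term
    while %w[+ -].include?(peek)
      op = advance
      rhs = term
      value = op == "+" ? value + rhs : value - rhs
    end
    value
  end

  def term
    value = factor
    while %w[* /].include?(peek)
      op = advance
      rhs = factor
      value = op == "*" ? value * rhs : value / rhs # integer division
    end
    value
  end

  def factor
    token = advance
    case token
    when /\A\d+\z/
      token.to_i
    when "("
      value = expr
      raise ArgumentError, "expected ')'" unless advance == ")"
      value
    else
      raise ArgumentError, "unexpected token #{token.inspect}"
    end
  end
end

puts TinyCalc.new("2 + 3 * (4 - 1)").parse
```

Operator precedence falls out of the rule nesting: `expr` calls `term`, which calls `factor`, so `*` and `/` bind tighter than `+` and `-` without any precedence table.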

Following that, Dick Grune's _Parsing Techniques: A Practical Guide_
(<http://www.few.vu.nl/~dick/PTAPG.html>) is a good reference...and thorough.
If you're reading it to learn the basics, it might seem a bit dry, but I think
it's at a sweet spot between depth of coverage and deference to the extensive
bibliography. I have the second edition; the first is free online. Not sure
about the differences, but the coverage of fundamentals probably hasn't
changed much. Also, while the other two are compiler books with chapters on
parsing, this one is 100% parsing, and gets to a lot of interesting parsing
algorithms (e.g. Earley parsing) that don't usually get much love in compiler
texts.

Some people will also recommend the Dragon book, but I think those three will
be more helpful. I haven't read the new edition, but the old seems drier &
less thorough than the Grune book, less direct than the Wirth book, and less
modern than the Appel book.

Also: For learning lex and yacc, the intros in the _4.4BSD Programmer's
Supplementary Documents_ ("PSD", included with OpenBSD and probably the other
BSDs, and not hard to find online) are hard to beat. The O'Reilly _Lex and
Yacc_ book somehow manages to be roughly ten times as long yet less
informative.

And if you have full control over the syntax used, S-expressions (Lisp), RPN
(Forth), or Lua/JSON will let you dodge the issue of parsing entirely.
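
To give a rough sense of how little work that is for S-expressions, here is a sketch of a reader in plain Ruby. The method names `read_sexp` and `read_form` are made up for this example, and it only handles parentheses, integers, and bare symbols (no strings, quoting, or comments).

```ruby
# Read one S-expression from a string into nested Ruby arrays.
def read_sexp(source)
  # Pad parentheses with spaces so a plain whitespace split tokenizes them.
  tokens = source.gsub("(", " ( ").gsub(")", " ) ").split
  form = read_form(tokens)
  raise ArgumentError, "trailing input" unless tokens.empty?
  form
end

# Consume tokens for a single form (an atom or a parenthesized list).
def read_form(tokens)
  token = tokens.shift
  raise ArgumentError, "unexpected end of input" if token.nil?

  case token
  when "("
    list = []
    list << read_form(tokens) until tokens.first == ")"
    tokens.shift # discard the closing ")"
    list
  when ")"
    raise ArgumentError, "unexpected ')'"
  when /\A-?\d+\z/
    token.to_i
  else
    token.to_sym
  end
end

p read_sexp("(define (square x) (* x x))")
```

The nesting in the text maps directly onto nested arrays, so there is no grammar to design and nothing ambiguous to resolve.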

~~~
adbge
That is likely the most thorough, informative response I have ever received. I
almost qualified that with "on the internet", but then I realized, hell, it's
probably more informative than any response I've ever received _anywhere_.

I recently saw a news piece about a philanthropist helping people in poor
African villages by simply building wells. The tribesmen were so grateful, one
said "I wish he lives one hundred and fifty years so that he can continue
helping people like us."

Well, Scott, thank you. I hope you live one hundred and fifty years.

~~~
silentbicycle
There are some real gems in the archives here. For starters, see
<http://news.ycombinator.com/item?id=835020> . More generally, try googling
_site:news.ycombinator.com keywords_.

And, thanks. :)

------
febeling
This example uses Treetop. Citrus is the same variety of parser, PEG (parsing
expression grammar), but instead of generating code it defines Ruby code
dynamically. It was really a pleasure to work with. The difference is that you
don't need a preliminary compile step (in a Rake file, for example), which I
like better for a language as dynamic as Ruby.

<http://github.com/mjijackson/citrus>

~~~
aarongough
The intermediate compilation step is actually optional when using Treetop. You
can see in the code in the article that Treetop is directly loading and
interpreting the grammar at runtime. (Of course there is always going to be
some compilation 'behind the scenes', but there's no need for it to be
explicit.)

Citrus looks interesting! From what I can see the PEG syntax used by Citrus is
_very_ similar to Treetop. I'll definitely check it out more later; I'm
particularly interested in the performance difference between the two.

~~~
febeling
In fact, you're right, it doesn't necessarily write a source file. I missed
that last time when I looked. But Treetop will still create ruby code, write
it into a string and then eval that. I didn't find that approach really
elegant.
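
To make the distinction concrete, here is a toy sketch of the two approaches in plain Ruby. It is not how Treetop or Citrus are actually implemented; the rule table and the `match_*` method names are invented for illustration. The first version builds Ruby source as a string and evals it, the second defines the same methods directly with closures.

```ruby
# Hypothetical rule table: rule name => literal text the rule must match.
rules = { "hello" => "hello", "world" => "world" }

# Approach 1: generate Ruby source as a string, then eval it.
source = rules.map do |name, literal|
  <<~RUBY
    def match_#{name}(input)
      input.start_with?(#{literal.inspect})
    end
  RUBY
end.join("\n")

generated_parser = Class.new
generated_parser.class_eval(source)

# Approach 2: define the same methods directly, with no source text involved.
dynamic_parser = Class.new do
  rules.each do |name, literal|
    define_method("match_#{name}") do |input|
      input.start_with?(literal)
    end
  end
end

# Both classes end up with the same interface.
puts generated_parser.new.match_hello("hello there")
puts dynamic_parser.new.match_hello("hello there")
```

The string version can be dumped to a file and inspected, which is handy for debugging, while the closure version skips the round trip through source text.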

~~~
aarongough
I do agree to an extent. I'd be interested to see if Citrus is faster... I'll
write up a test in a week or two and we shall see!

------
chipsy
My favorite right now is PEG.js: <http://pegjs.majda.cz/>

------
RickHull
ctrl-f grammer

Really?

~~~
aarongough
Fixed. It must have been a late night!

