
Yacc is dead: An update - Autre
http://matt.might.net/articles/parsing-with-derivatives/
======
dasht
People should please take away from this paper the central claim that they are
offering up a fun, easy, and startlingly practical way to implement a parsing
library in your favorite language / environment. It is but a few hundred lines
in some functional languages. Scarcely importantly more in other languages. It
is (per their plausible claims) a nice alternative to trying to muck up some
dubious approximation using your regexp engine.

The "YACC is dead" claim of the original paper, and of this follow-up, is not
so much "this is so good, everyone will stop using YACC". Rather, its that
people don't use YACC when they ought to because it is inconvenient -- but
this derivative-based parsing hack is convenient in important ways that YACC
fails to be.

------
jorgeortiz85

        Over the past year, even in the little time we've had to
        work on the paper, we've learned a lot more about parsing
        with derivatives.
    
        In the week after the community found it, you all taught
        us ten times more than that. Thank you!
    

Why isn't all computer science research done like this?

~~~
mattmight
/article author

In short, there is a paranoid mentality in academia that if people let their
ideas go before formal publication, they'll be "scooped."

I'll admit to falling prey to that mentality too.

This experiment has made me rethink a lot of my assumptions.

I'd say that, in general, most papers wouldn't elicit a community response
like this.

I think the fact that this paper was on parsing, a topic accessible and
important to many, helped fuel the interest.

I bet if I arXived the rest of my rejected papers (all on static analysis of
dynamic languages), there wouldn't be much of a commotion.

But, I'm intrigued enough to give it another shot. I'll write a blog post
about another rejected paper in the near future, release the draft on arXiv,
post the reviews and see what happens.

If it happens again, maybe "naked science" is a realistic proposition.

~~~
knowtheory
You also picked a pretty troll worthy title as well (irrespective of content).

We could build a drudge report for CS, but i some how don't think that'd be
useful to the community :)

(i found the paper interesting and useful though! reading through the scala
has been instructive)

~~~
mattmight
Too true about the troll-worthy title! I'm sure that helped.

We never got around to picking a more "academic" title, so we put it on arXiv
under the working title.

------
hsmyers
Claim strikes me as a bit premature given the amount of Yacc based code there
is in the world busy producing away day in and day out. If the algorithm is
what it is said to be and I've no reason to doubt that, then at some point we
will all be blessed with a Yacc replacement. Since I've always found Yacc to
be a PIA, I certainly won't miss the old regime, but since there is nothing to
use at the moment, I'm going to hold off a bit...

~~~
marssaxman
I have found hand-written parsers to be easier to write and easier to maintain
than parsers generated by Yacc or its descendants.

------
silentbicycle
Discussion for the previous post:
<http://news.ycombinator.com/item?id=1948048>

------
erikb
I nearly don't understand anything of this. It sounds cool, though.There is so
much math and lisp...

But I want to learn and if possible even help. What do I have to learn to
understand this and maybe implement it in my own languages?

~~~
zohebv
You need to know

1\. Context Free Grammars (essentially regular expressions that can be
recursive)

2\. Two kinds of parsing techniques LL and LR. LR is much harder to implement.
LL is straightforward recursive descent. You don't have to actually study LR
parsing.

3\. Tools like yacc (LALR parser, weaker form of LR but more efficient) can be
used to generated to emit parser code in C or your favorite language. However
parser combinator libraries are preferred over code generators. These
libraries are typically easier to write in functional languages but tend to be
LL parsers that don't handle left recursion well.

4\. This paper proposes a technique that lets you use left recursion in a
parser combinator library. So, the only reason you would now use yacc is
parsing speed.

~~~
zerothehero
> So, the only reason you would now use yacc is parsing speed.

"speed" means a couple different things. Someone might say the only reason you
would use C++ rather than Python is speed -- i.e. you can always get within a
constant factor with Python. Ignoring other aspects of the languages, let's
call that true.

But Russ Cox is saying that you would also use yacc when you want a O() bound
on the parsing algorithm (a linear algorithm). That is, the technique they
advocate will produce parsers more than a constant factor slower than yacc.

Frankly I don't understand their rebuttal at all. The rebuttal is that it
doesn't matter for practical purposes? That doesn't sound like computer
science to me.

And practically speaking, it _does_ matter, because plenty of people write
exponential time regexps and don't know it, and it will blow up in their face
on large input.

