
Ask HN: Does anyone still use Lex/Yacc? - stepvhen
I often see suggestions to use lex/yacc (or flex/bison) when discussing
writing compilers or other tools that need parsing, but I don't see them in
much use in the wild. Are there any good reasons to not use these tools, and
instead roll your own? Or if people really are using these tools, what are
some legitimate projects that use them?

I am writing a toy ledger (a smaller, "suckless" version of
http://ledger-cli.org ), and I am curious if it is unwise to start working
seriously with flex/bison, like if I am going to run into problems as my
program becomes larger.
======
dalke
I do see lex/yacc used in the wild. It's in the RDKit (a cheminformatics
package) to parse SMILES and SLN. That's obscure, I know, but that's the field
I'm in. I can't find any use in any of the other packages I've downloaded,
except a copy of Boost from 2004 which is somehow hanging around.

FWIW, I prefer ANTLR, and for Python, I use PLY.

The usual issue is that it's not hard to write a parser by hand. Though on the
other hand, parser generators can tell you about ambiguities that aren't
obvious when hand-coding.

Personally, error-handling in yacc has always confused me. When I want
detailed error reporting, I often switch to a tokenizer (like Ragel) plus some
hand-written parser.
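
To make the hand-written route concrete, here is a minimal sketch in C of a
recursive-descent parser for integer expressions that records the column of
the first error. It is purely illustrative: the `Parser` struct, `eval`, and
the error messages are made-up names, the tokenizing is done inline rather
than with Ragel, and there is no overflow checking.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser state: input, cursor, and the first error seen. */
typedef struct {
    const char *src;
    size_t pos;
    const char *err;   /* NULL until something goes wrong */
    size_t err_pos;    /* 0-based offset of the first error */
} Parser;

static void skip_ws(Parser *p) {
    while (isspace((unsigned char)p->src[p->pos])) p->pos++;
}

/* Record only the first error, at the current cursor position. */
static void fail(Parser *p, const char *msg) {
    if (!p->err) { p->err = msg; p->err_pos = p->pos; }
}

static long parse_expr(Parser *p);

/* factor := NUMBER | '(' expr ')' */
static long parse_factor(Parser *p) {
    skip_ws(p);
    if (isdigit((unsigned char)p->src[p->pos])) {
        long v = 0;
        while (isdigit((unsigned char)p->src[p->pos]))
            v = v * 10 + (p->src[p->pos++] - '0');
        return v;
    }
    if (p->src[p->pos] == '(') {
        p->pos++;
        long v = parse_expr(p);
        skip_ws(p);
        if (p->src[p->pos] == ')') p->pos++;
        else fail(p, "expected ')'");
        return v;
    }
    fail(p, "expected number or '('");
    return 0;
}

/* term := factor ('*' factor)* */
static long parse_term(Parser *p) {
    long v = parse_factor(p);
    for (;;) {
        skip_ws(p);
        if (p->err || p->src[p->pos] != '*') return v;
        p->pos++;
        v *= parse_factor(p);
    }
}

/* expr := term ('+' term)* */
static long parse_expr(Parser *p) {
    long v = parse_term(p);
    for (;;) {
        skip_ws(p);
        if (p->err || p->src[p->pos] != '+') return v;
        p->pos++;
        v += parse_term(p);
    }
}

/* Returns 1 and stores the value in *out on success; on failure returns 0
 * and reports the message and 1-based column of the first error. */
int eval(const char *src, long *out, const char **err, size_t *err_col) {
    assert(src != NULL && out != NULL);
    Parser p = { src, 0, NULL, 0 };
    long v = parse_expr(&p);
    skip_ws(&p);
    if (!p.err && p.src[p.pos] != '\0') fail(&p, "unexpected character");
    if (p.err) {
        if (err) *err = p.err;
        if (err_col) *err_col = p.err_pos + 1;
        return 0;
    }
    *out = v;
    return 1;
}
```

With this sketch, calling `eval` on `"1 + * 3"` returns 0 and reports column 5
with the message `expected number or '('`, the kind of position-specific
message a hand-written parser makes easy.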

The ANTLR FAQ has the comment:

>> What do you think are the problems people will try to solve with ANTLR4?

> In my experience, almost no one uses parser generators to build commercial
> compilers. So, people are using ANTLR for their everyday work, building
> everything from configuration files to little scripting languages.

~~~
stepvhen
I say lex/yacc specifically because I am doing this in C, and I've learned
that it's important to stick to tools that are made to work together.

I am unfamiliar with file I/O and text processing in C overall, so writing
a parser by hand seems like more work than using BNF grammars, which I am
already pretty familiar with.

~~~
dalke
ANTLR3 can emit a parser in C
([http://www.antlr3.org/api/C/index.html](http://www.antlr3.org/api/C/index.html)).
ANTLR4, however, cannot. I have no experience with the ANTLR3 C backend.

Another example is LEMON, developed and used for SQLite, at
[http://www.hwaci.com/sw/lemon/](http://www.hwaci.com/sw/lemon/) . It only
generates the parser, so you supply the tokenizer: hand-written, generated by
lex, or built with an alternative like re2c or Ragel. I have seen LEMON
praised a few times on HN
([https://news.ycombinator.com/item?id=10295087](https://news.ycombinator.com/item?id=10295087)
). I have no experience with it.

------
noobiemcfoob
I've only rolled my own when working with such a basic grammar (or such basic
aspects of a grammar) that a simple regex felt faster than dealing with actual
parsing.

That said, most of that was at a time when I was much less comfortable with
lex/yacc, so maybe I'd be more keen on them now.
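
For a sense of what the regex-only approach can look like in C, here is a
minimal sketch using POSIX `regcomp`/`regexec` to pull apart one ledger-style
posting line. The line format (`<account> <amount> <currency>`), the `Posting`
struct, and `parse_posting` are assumptions made up for illustration, not
ledger-cli's actual syntax, and a real ledger would probably avoid `double`
for money.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one posting line. */
typedef struct {
    char account[64];
    double amount;      /* illustration only; not suitable for real money */
    char currency[8];
} Posting;

/* Copy one captured group into dst; returns 0 if it does not fit. */
static int copy_group(const char *line, regmatch_t g, char *dst, size_t cap) {
    size_t len = (size_t)(g.rm_eo - g.rm_so);
    if (len >= cap) return 0;
    memcpy(dst, line + g.rm_so, len);
    dst[len] = '\0';
    return 1;
}

/* Returns 1 and fills *out if line looks like "Expenses:Food  12.50 USD",
 * otherwise returns 0. */
int parse_posting(const char *line, Posting *out) {
    assert(line != NULL && out != NULL);
    /* Groups: 1 = account, 2 = amount, 3 = fraction (unused), 4 = currency */
    static const char *pattern =
        "^[[:space:]]*([A-Za-z][A-Za-z:]*)"
        "[[:space:]]+(-?[0-9]+(\\.[0-9]+)?)"
        "[[:space:]]+([A-Z]+)[[:space:]]*$";
    regex_t re;
    regmatch_t m[5];

    if (regcomp(&re, pattern, REG_EXTENDED) != 0) return 0;
    int matched = regexec(&re, line, 5, m, 0) == 0;
    regfree(&re);
    if (!matched) return 0;

    if (!copy_group(line, m[1], out->account, sizeof out->account)) return 0;
    if (!copy_group(line, m[4], out->currency, sizeof out->currency)) return 0;
    /* The regex already validated the digits, so strtod stops at the space. */
    out->amount = strtod(line + m[2].rm_so, NULL);
    return 1;
}
```

This works while every line has the same flat shape. Once the format needs
nesting or context-dependent rules, a single pattern stops being enough and a
real parser, generated or hand-written, becomes the easier option.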

