

Ask HN: Which one is better for writing a SQL parser? - liuliu

It seems that there are two choices: Lemon Parser Generator and YACC.

PostgreSQL uses YACC and SQLite uses Lemon.

My use case is a multi-threaded, embedded parser for a much simpler SQL syntax (a SQL subset).

Which one is better for this?
======
jerf
Can you just snag one of those parsers? It depends on just how "sub" your
"subset" is, but for something like this, there's nothing like a snatch-and-
grab if you can get away with it. (I don't recall SQLite's licence, but
PostgreSQL's is very unlikely to be a problem for you.)

Even if it gives you something a bit too elaborate, it can be easier to filter
your way back down to what you are looking for than to rewrite something up
from the bottom. (One easy strategy is to just scream "SYNTAX ERROR!" when you
see something that the grammar knows, but you don't understand. Since you'll
be attaching your own object model/structs to the parser anyhow, this is
virtually trivial.)
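
A minimal sketch of that filtering strategy in C, assuming a borrowed grammar whose actions hand you a statement kind. The `stmt_kind` values and the `accept_stmt` helper are hypothetical names made up for illustration, not anything from PostgreSQL or SQLite:

```c
#include <stdio.h>

/* Hypothetical statement kinds a borrowed grammar might recognise. */
typedef enum {
    STMT_SELECT,
    STMT_INSERT,
    STMT_CREATE_TRIGGER,
    STMT_VACUUM
} stmt_kind;

/* Returns 1 if our subset handles this kind of statement, 0 otherwise. */
static int supported(stmt_kind k) {
    switch (k) {
    case STMT_SELECT:
    case STMT_INSERT:
        return 1;
    default:
        return 0;
    }
}

/* Meant to be called from the grammar action for each parsed statement.
 * Writes an empty string to err on success, or a message on rejection. */
int accept_stmt(stmt_kind k, char *err, size_t errlen) {
    if (supported(k)) {
        if (errlen > 0) err[0] = '\0';
        return 1;
    }
    snprintf(err, errlen, "SYNTAX ERROR: unsupported statement (kind %d)", (int)k);
    return 0;
}
```

The grammar still parses everything it knows; the subset is enforced only at the point where you would otherwise build your own structs.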

------
vbar
A few people (including me :-) ) wrote SQL parsers in ANTLR. I'm quite
satisfied with the result and have never heard anyone complain that ANTLR
wasn't suitable for the task...

~~~
jaskew
I would second ANTLR. Get the book, too, if only for the educational value.

[http://www.pragprog.com/titles/tpantlr/the-definitive-antlr-...](http://www.pragprog.com/titles/tpantlr/the-definitive-antlr-reference)

------
gcv
Last time I wrote a parser, I used yacc/bison. It wasn't particularly
difficult to use, although I never stress-tested the language I put together.

Anyway, the field of parser generators seems to have grown recently. I had
never heard of ANTLR before, nor Lemon. Thanks for the pointers in this
thread, I'll be sure to investigate them. There is also Parsec (for Haskell),
and ocamllex and ocamlyacc (for OCaml). If you use C++, look into
Boost.Spirit. I use tinyjson in production, which uses Boost.Spirit and it
works quite well.

------
bayareaguy
I've used all of those at different times and for simple stuff they will all
work. I'd recommend picking whichever of YACC, Lemon or ANTLR builds easily on
your platform and is the least displeasing to others you expect may need to
trace through the code.

Oh and if for some reason you end up using flex somewhere, don't forget to
specify the "-8" option to ensure it generates an 8-bit clean scanner.

~~~
liuliu
One thing that may stop me from using yacc: can yacc generate a function like
query_t parse(const char* str) and make the parsing thread-safe?

~~~
bayareaguy
Recent yacc implementations such as Bison support pure (reentrant) parsers.
Here's a reference: <http://dinosaur.compilertools.net/bison/bison_6.html#SEC56>

~~~
flashgordon
Make sure you check the licences... different licences apply to yacc-generated
and Bison-generated parsers...

------
flashgordon
Hang on, is your SQL LL? If so, writing a top-down parser is not too
difficult, especially if you use flex for your scanner...
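
To give a rough idea, here is a sketch of a top-down (recursive-descent) parser in C for a made-up subset, `SELECT col[, col...] FROM table`. It uses the `query_t parse(const char* str)` signature asked about elsewhere in the thread, but the fields of `query_t`, the size limits, and the grammar itself are assumptions for illustration. The scanner is hand-written rather than generated by flex, to keep the example self-contained. All parser state lives in a local struct, so concurrent calls share nothing:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_COLS 8
#define MAX_NAME 32

/* Hypothetical result type; only the name query_t comes from the thread. */
typedef struct {
    int ok;                          /* 1 on success, 0 on syntax error */
    int ncols;                       /* number of selected columns */
    char cols[MAX_COLS][MAX_NAME];
    char table[MAX_NAME];
} query_t;

/* Per-call parser state: no globals, so parse() is reentrant. */
typedef struct { const char *p; } parser_t;

static void skip_ws(parser_t *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Match an upper-case keyword case-insensitively, on a word boundary. */
static int keyword(parser_t *ps, const char *kw) {
    skip_ws(ps);
    size_t n = strlen(kw);
    for (size_t i = 0; i < n; i++)
        if (toupper((unsigned char)ps->p[i]) != kw[i]) return 0;
    if (isalnum((unsigned char)ps->p[n]) || ps->p[n] == '_') return 0;
    ps->p += n;
    return 1;
}

/* identifier := [A-Za-z_][A-Za-z0-9_]* (keywords are not reserved here) */
static int ident(parser_t *ps, char *out) {
    skip_ws(ps);
    if (!isalpha((unsigned char)*ps->p) && *ps->p != '_') return 0;
    size_t n = 0;
    while (isalnum((unsigned char)ps->p[n]) || ps->p[n] == '_') n++;
    if (n >= MAX_NAME) return 0;
    memcpy(out, ps->p, n);
    out[n] = '\0';
    ps->p += n;
    return 1;
}

/* query := SELECT ident (',' ident)* FROM ident */
query_t parse(const char *str) {
    query_t q = {0};
    parser_t ps = { str };

    if (!keyword(&ps, "SELECT")) return q;
    for (;;) {
        if (q.ncols == MAX_COLS || !ident(&ps, q.cols[q.ncols])) return q;
        q.ncols++;
        skip_ws(&ps);
        if (*ps.p != ',') break;
        ps.p++;                      /* consume the comma */
    }
    if (!keyword(&ps, "FROM") || !ident(&ps, q.table)) return q;
    skip_ws(&ps);
    q.ok = (*ps.p == '\0');          /* reject trailing input */
    return q;
}
```

Calling `parse("select id, name from users")` should yield `ok == 1`, two columns, and `table` set to `users`; anything outside the subset comes back with `ok == 0`. Each grammar rule maps to one function or loop, which is what makes the hand-written approach manageable when the grammar is LL.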

------
Hoff
The LLVM Kaleidoscope example might be interesting to you.

