
FWIW I wrote 6 articles about expression parsing / Pratt parsing here:

http://www.oilshell.org/blog/2017/03/31.html

(There is also an update about the Shunting Yard algorithm, which is common but not mentioned in the OP. Ritchie's C compiler used it.)

Also, the original article is nice in some ways, but it treats all the parsing techniques as if they were on equal footing in practice. It reads a little like a college intro to parsing course rather than something for experienced programmers.

In reality you will see these techniques overwhelmingly:

- Hand-written recursive descent parsers, possibly with Pratt parsing/precedence climbing for expressions (Clang, GCC, v8, Lua, etc.)

- Yacc style LR grammars, but often with loads of code in semantic actions (Ruby, R, bash, Awk, early JavaScript implementations, sqlite and I think most SQL dialects)

- Occasionally you will see LL parsing (ANTLR, Python).

The rest of the techniques you'll see pretty rarely... I saw parser combinators in ShellCheck, but not in any implementations of "production" programming languages. (I looked at at least 20, most of them in C or C++.)
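To make the first bullet concrete, here is a rough sketch of what "recursive descent with precedence climbing for expressions" tends to look like. It is a toy in Python, not code from any of the projects named above; the operator table and the names BINARY_OPS, tokenize, Parser and parse are all made up for illustration.

```python
import re

# Illustrative operator table (an assumption, not taken from any real compiler):
# operator -> (precedence, associativity).
BINARY_OPS = {
    "+": (10, "left"),
    "-": (10, "left"),
    "*": (20, "left"),
    "/": (20, "left"),
    "^": (30, "right"),
}

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/^()])")


def tokenize(text):
    """Split the input into integer literals, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_primary(self):
        # Ordinary recursive descent: a number or a parenthesized expression.
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.parse_expr(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def parse_expr(self, min_prec):
        # Precedence climbing: keep consuming binary operators whose
        # precedence is at least min_prec.
        left = self.parse_primary()
        while True:
            op = self.peek()
            if op not in BINARY_OPS:
                break
            prec, assoc = BINARY_OPS[op]
            if prec < min_prec:
                break
            self.advance()
            # Left-associative operators require a strictly higher precedence
            # on the right-hand side; right-associative ones allow the same.
            next_min = prec + 1 if assoc == "left" else prec
            right = self.parse_expr(next_min)
            left = (op, left, right)
        return left


def parse(text):
    """Parse an expression into nested (op, left, right) tuples."""
    parser = Parser(tokenize(text))
    tree = parser.parse_expr(0)
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this table, parse("1 + 2 * 3") gives ("+", 1, ("*", 2, 3)) and parse("2 ^ 3 ^ 2") gives ("^", 2, ("^", 3, 2)). The whole expression grammar lives in one loop plus a table, which is probably a big part of why hand-written parsers lean on this style instead of one function per precedence level.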




In my experience ANTLR is a good tool that covers most use cases, so I would not say it is used "occasionally". For me it is the first choice.


Yeah, ANTLR is used widely in the Java ecosystem, but I guess I am biased toward full programming languages rather than DSLs, and full language implementations are usually written in C or C++.

ANTLR can technically generate C or C++ code, but IME it's very bad, and I've never seen anybody use it "for real".

That said, JVM languages like Jython and JRuby don't use ANTLR; they use the same techniques as the C implementations of those languages (the custom pgen language and yacc, respectively).



