
ANTLR 4.6 Released - carbocation
http://www.antlr.org/download.html
======
chubot
I'd be interested to see pointers to ANTLR-based languages people are developing.
I found it to be a powerful tool, but also limiting.

As far as I know, ANTLR v4 is even more of a "framework" than v3, in that it
dictates your program structure and always outputs a full parse tree. It
appears to have lost the ability to verify the value of k for LL(k) grammars,
due to a more powerful underlying algorithm.

I think that simplifies it for most use cases but makes others hard or
impossible.

~~~
chrisseaton
I work on implementing languages and I often need new parsers. Either for full
languages or little embedded languages or expression syntaxes of various
kinds.

I often think 'right, this time I'm going to write a really nice little Antlr
grammar and do it properly'. But every single time I regret it because there's
always some complexity that means things don't fit into the Antlr model, and
making things work the way Antlr works means more complexity than just writing
a simple lexer and parser by hand.
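
For a sense of scale, here is a minimal sketch of the "simple lexer and parser by hand" approach for arithmetic expressions. It is written in Python purely for illustration; the grammar and the names (`tokenize`, `Parser`, `evaluate`) are hypothetical and not taken from any project mentioned in this thread.

```python
import re

# One regex is the whole lexer: runs of ASCII digits, or any single
# non-whitespace character treated as an operator/punctuation token.
TOKEN_RE = re.compile(r"(?P<num>[0-9]+)|(?P<op>\S)")


def tokenize(source):
    """Return a list of ('num', int) / ('op', str) tokens plus an end marker."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup == "num":
            tokens.append(("num", int(match.group())))
        else:
            tokens.append(("op", match.group()))
    tokens.append(("end", None))
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        kind, text = self.advance()
        if kind == "num":
            return text
        if (kind, text) == ("op", "("):
            value = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {text!r}")


def evaluate(source):
    """Lex, parse, and evaluate in one pass; reject trailing input."""
    parser = Parser(tokenize(source))
    value = parser.expr()
    if parser.peek() != ("end", None):
        raise SyntaxError("trailing input")
    return value
```

Because every rule is an ordinary method, special cases can be handled with ordinary code at the point where they arise, which is the flexibility the hand-written approach trades against a declarative grammar.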

My most recent example is that Ruby strings can't always be safely converted
into Java strings, but Antlr wants to parse Java strings only and not a
byte[]. So I had to copy my Ruby byte[] into a Java string a character at a
time and just use the characters as numbers, pretending that they had no
encoding.
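
The idea behind that workaround can be sketched as follows. This is an illustration in Python, not the Java code described above, and the function names are made up: each byte becomes the character whose code point has the same numeric value, so the string is only a carrier for numbers 0 to 255 and no real decoding takes place.

```python
def bytes_to_pseudo_string(raw: bytes) -> str:
    """Map each byte to the code point with the same numeric value (0-255)."""
    return "".join(chr(b) for b in raw)


def pseudo_string_to_bytes(text: str) -> bytes:
    """Invert the mapping; every code point must be below 256."""
    return bytes(ord(ch) for ch in text)


# Bytes that are not valid UTF-8 still survive the round trip unchanged.
raw = b"caf\xe9 \xff\xfe"
text = bytes_to_pseudo_string(raw)
```

The mapping is the same one that ISO-8859-1 (Latin-1) decoding performs, which is why it can carry arbitrary bytes through an API that only accepts strings.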

That's just one example. It's always some other new problem that means it's
more ceremony to use Antlr than the tool saves me later on.

Despite lexing and parsing being an entire sub-field of computer science, to
be honest I'm not sure that in practice lexing and parsing are really problems
that are so complex in the first place that a tool is ever needed.

~~~
chubot
I have this "meta-language" rant brewing for my blog[1], and I feel like you
took the words out of my mouth!

Are you saying that ANTLR makes some unwarranted assumptions about Unicode?
Does that just depend on the generated Java code, and would it be different
for generated parsers in C++? I don't know the JVM very well, but my
understanding was that the JVM makes some Unicode assumptions that aren't
always appropriate.

The last "rant" was here, about meta-languages only being suitable for toys:
[https://news.ycombinator.com/item?id=13040682](https://news.ycombinator.com/item?id=13040682)

vidarh also responds with his problems parsing Ruby (a "real" language):
[http://hokstad.com/compiler](http://hokstad.com/compiler)

The blog is based on my experience in writing a parser for bash. I ported the
POSIX shell grammar to ANTLR, but it fell ridiculously short of being a
production quality parser.

I believe it's essentially impossible to write a bash parser with ANTLR. But I
don't hear about any alternatives. All I hear about is yacc/bison for bottom-up
parsers, and ANTLR for top-down. Shell needs a top-down parser because it's an
interactive language (the PS2 prompt) and for completion. But it doesn't
appear there is any other "serious" meta-language for top-down parsers other
than ANTLR? It certainly is the best documented and longest-lived, but I am
surprised how far short it falls for many tasks.

Part of the rant will be a survey of "real" language parsers (i.e. take the
top 20 TIOBE language implementations) and see that they almost all use hand-
written parsers, or bespoke parser generators. For example, Python has its own
top-down parser generator in the tree, pgen.c.

[1]
[http://www.oilshell.org/blog/2016/11/20.html](http://www.oilshell.org/blog/2016/11/20.html)

~~~
raiph
> it doesn't appear there is any other "serious" meta-language for top-down
> parsers other than ANTLR?

Have you explored Perl 6 and its Rules sub-language and engine?[1] The Rakudo
Perl 6 compiler is written in Perl 6 using Rules.[2]

[1]
[https://en.wikipedia.org/wiki/Perl_6_rules](https://en.wikipedia.org/wiki/Perl_6_rules)

[2]
[https://github.com/rakudo/rakudo/blob/nom/src/Perl6/Grammar....](https://github.com/rakudo/rakudo/blob/nom/src/Perl6/Grammar...).

~~~
chubot
I have heard of this feature of Perl 6, but not used it. One obvious
difference is that ANTLR is language-agnostic in that you can generate parsers
in C++ or Python too, while I assume Perl 6 doesn't have that functionality.
So that limits its appeal.

(Although honestly ANTLR is pretty skewed towards Java; the generated code is
kind of like Java-in-Python or Java-in-C++.)

I'm curious what algorithm Perl uses to match grammars. ANTLR uses a few
algorithms, like the one to generate lookahead tables for LL(k), and then the
LL(*) algorithm, and apparently a new one, ALL(*), in ANTLR v4.

------
CalChris
Nice to see C++ and Go in 4.6 mainline. I've been stuck on ANTLR 3 in order to
use C++ and it's great to see the new target finally mainlined. Go is good and
I'd like to see a Rust target sometime soon.

~~~
nonsince
At the risk of sounding inflammatory, what would be the point of using a
parser generator over a powerful applicative-style combinator library like
[https://github.com/Marwes/combine](https://github.com/Marwes/combine) ? I've
used both in the past and personally massively prefer parser combinators, but
I'd be interested to know what the benefits are of parser generators.
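
For readers who have not seen the combinator style, here is a minimal sketch in Python. It does not use or imitate the combine API; every name in it (`char`, `many`, `seq`, `fmap`, `number_list`) is hypothetical. A parser is just a function from `(text, pos)` to `(value, new_pos)`, or `None` on failure, and bigger parsers are built by composing smaller ones.

```python
def char(predicate):
    """Parser that consumes one character satisfying `predicate`."""
    def parse(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many(parser, at_least=0):
    """Apply `parser` repeatedly; fail if it matched fewer than `at_least` times."""
    def parse(text, pos):
        values = []
        result = parser(text, pos)
        while result is not None:
            value, pos = result  # only advance on success
            values.append(value)
            result = parser(text, pos)
        return (values, pos) if len(values) >= at_least else None
    return parse


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def fmap(function, parser):
    """Transform the value a parser produces without touching the position."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return function(value), new_pos
    return parse


# Grammar: number_list := number (',' number)*
digit = char(lambda c: c in "0123456789")
comma = char(lambda c: c == ",")
number = fmap(lambda digits: int("".join(digits)), many(digit, at_least=1))
number_list = fmap(
    lambda parts: [parts[0]] + [pair[1] for pair in parts[1]],
    seq(number, many(seq(comma, number))),
)
```

Calling `number_list("1,22,333", 0)` returns the parsed list together with the position where parsing stopped. The grammar lives in the host language as ordinary values, which is the main contrast with a generator that reads a separate grammar file.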

~~~
Matthias247
I think one advantage of things like ANTLR is that the grammar is described in
an abstract fashion and that you can then generate parsers in multiple target
languages with it. With a parser combinator library you are only targeting a
single language. I also think that for project outsiders and beginners it's
easier to look at and understand a grammar instead of a parser in combinator
style.

I personally have worked with parser generators (Irony for .NET) as well as
parser combinators (FParsec) and enjoyed both a lot.

------
clumsysmurf
Just curious if anyone has used ANTLR on Android lately; I recall it was easy
to run out of stack because the runtime relies on very deep recursion.

Also I think there were some dependencies on AWT or Swing for debug console.

[https://github.com/antlr/antlr4/issues/744](https://github.com/antlr/antlr4/issues/744)

------
NicolaiS
More detailed changelog:
[https://github.com/antlr/antlr4/releases/tag/4.6](https://github.com/antlr/antlr4/releases/tag/4.6)

------
rombix
Nice to see Swift added to the list of targets. Any idea about the performance
of generated parsers in Swift vs other targets?

------
manishsharan
Compiler design was one subject that I struggled with when I was an undergrad.
I feel this has held me back from creating DSLs. I am looking for a gentle
introduction to ANTLR that can take me from newbie to mastery. Any books
anyone can recommend would be helpful.

~~~
breck
The Antlr-4 book is terrific:

[https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-re...](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference)

~~~
kriro
Indeed. As someone with no real language background it allowed me to get to a
state where I can create my own DSLs.

An alternative would be to use something like Xtext (which also allows you to
create language tools, not just the language itself). I played around with it
and it's very nice (granted you'll kind of need to embrace Eclipse) but
ultimately decided that ANTLR was "enough".

------
carbocation
(Mostly notable for my purposes because it includes golang as a code
generation target.)

------
butner
Ya ANTLR (+TParr).

