
A meta approach to implementing programming languages (2018) - rickardlindberg
http://rickardlindberg.me/writing/rlmeta/
======
corysama
The demonstration of this technique by the VPRI team was very impressive.
[http://www.vpri.org/pdf/tr2012001_steps.pdf](http://www.vpri.org/pdf/tr2012001_steps.pdf)
Shame that team seems to have gone quiet (not dead) since then.

Here’s a video that covers the concepts in a more entertaining format.
[https://youtube.com/watch?v=ubaX1Smg6pY](https://youtube.com/watch?v=ubaX1Smg6pY)

~~~
curuinor
They turned into Dynamicland

[https://dynamicland.org/](https://dynamicland.org/)

------
jimmy_ruska
Some links worth mentioning

[https://www.colm.net/open-source/ragel/](https://www.colm.net/open-source/ragel/)

[https://beautifulracket.com/](https://beautifulracket.com/)

[https://www.jetbrains.com/mps/](https://www.jetbrains.com/mps/)

[https://kotlinlang.org/docs/reference/type-safe-builders.html](https://kotlinlang.org/docs/reference/type-safe-builders.html)

[https://en.wikipedia.org/wiki/ANTLR](https://en.wikipedia.org/wiki/ANTLR)

[https://github.com/llaisdy/beam_languages](https://github.com/llaisdy/beam_languages)

------
Davidbrcz
If you are interested in uncommon approaches to implementing programming
languages, you should have a look at the K framework
([http://www.kframework.org/index.php/Main_Page](http://www.kframework.org/index.php/Main_Page)).

The idea is to write the spec (i.e. the grammar and semantics) only once and
get a compiler for free, as well as some formal tools (a model checker, ...).

C has been formalized with it, for instance.

~~~
tom_mellior
K is really cool, but does it really give you a compiler? That's the first
I've heard of that. Do you have (a link to) more information about how it does
compilation?

~~~
Davidbrcz
No sorry, I thought it was a compiler but it's "only" an interpreter.

------
BoiledCabbage
RLMeta/Meta II/OMeta are a fascinating branch of programming languages. And a
direction I could see us moving towards.

Concise methods of abstracting a problem domain.

Nice to see the author's implementation.

~~~
asqueella
The grammar definitions in the article looked very similar to PEG at first
glance. And the article seems to confirm: “The parsing algorithm that RLMeta
implements is based on parsing expression grammars (PEG), but is extended to
match arbitrary objects, not just characters.”

I didn’t quite get the “match arbitrary objects” bit... Is this basically the
same technique as PEG or what am I missing?

~~~
rickardlindberg
As far as I know, PEGs work only on streams of characters. RLMeta works on
streams of arbitrary Python objects. The syntax currently has support for
matching characters, strings, and lists. Does that clarify it?
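RLMeta's real matcher is of course more involved, but as a minimal illustrative sketch (the function names and API here are invented, not RLMeta's), a PEG-style matcher can be lifted from characters to arbitrary Python objects like this:

```python
# Illustrative sketch (not RLMeta's actual API): PEG-style primitive
# matchers generalized from characters to arbitrary Python objects.
# Each matcher takes (stream, pos) and returns (value, new_pos) or None.

def match_any(stream, pos):
    """Match any single item: a character in a string, or any object in a list."""
    if pos < len(stream):
        return stream[pos], pos + 1
    return None

def match_eq(expected):
    """Match one item equal to `expected` (a character, a string, etc.)."""
    def matcher(stream, pos):
        if pos < len(stream) and stream[pos] == expected:
            return stream[pos], pos + 1
        return None
    return matcher

def match_list(item_matcher):
    """Descend into a sub-list and run `item_matcher` inside it --
    the step that plain character-based PEGs cannot express."""
    def matcher(stream, pos):
        if pos < len(stream) and isinstance(stream[pos], list):
            if item_matcher(stream[pos], 0) is not None:
                return stream[pos], pos + 1
        return None
    return matcher

# Works on a stream of characters...
assert match_eq("a")("abc", 0) == ("a", 1)
# ...and on a stream of objects, e.g. an AST node like ["add", 1, 2]:
assert match_list(match_eq("add"))([["add", 1, 2]], 0) == (["add", 1, 2], 1)
```

With primitives like these, the same choice/sequence/repetition machinery can drive both a parser over characters and a code generator over AST nodes.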

~~~
asqueella
Thanks, but it still seems to me that the first (“parser”) part is very much
like a PEG.

It’s the “generator” part that matches on objects (the AST) generated by the
parser, but I’m not sure why that’s useful: in the toy example the evaluation
could happen right in the parser, and if you decide to have an AST, why use a
DSL that looks similar to the grammar definition to walk it?

~~~
rickardlindberg
Yes, you are right that the parser grammars are much like PEGs and it's the
code generator grammars that match other objects than characters.

The reason I chose to implement it that way in the article is that I modeled
RLMeta after OMeta. If I remember correctly, their reason for doing so was
that they thought compilers would be easier to write if all passes (parsing,
code generation, ...) could be written using the same language.

I think using the same DSL for code generators works quite well and they
become quite readable. Perhaps there is a better DSL suited for code
generation, and it would be interesting to experiment with that as well.

You are right that the toy examples could generate code directly in the
parser. There is no need for a separate code generator step. But the RLMeta
compiler would be much less readable without a separate code generator step I
think. The reason I introduced an AST in the toy example was to introduce the
concept before I showed it in the RLMeta implementation.

~~~
asqueella
OK, thanks for the clarification!

------
oscardz88
I think you should definitely take a look at the Meta Programming System (MPS)
from JetBrains:
[https://www.jetbrains.com/mps/](https://www.jetbrains.com/mps/)

------
DyslexicAtheist
could I get something like this with Racket[1]? why would I choose this over
Racket?

[1]
[https://news.ycombinator.com/item?id=19232068](https://news.ycombinator.com/item?id=19232068)

~~~
jimmy_ruska
From my understanding this library takes a grammar and generates a runtime
interpreter for that grammar. ANTLR is much more similar, as it takes in a
grammar and generates code you can hook into at any part of the grammar
parsing.

Racket on the other hand gives you the ability to hook into the new language,
convert it to Racket syntax and then compile it. As far as I know Racket
doesn't give you a simple parsing grammar, and you're targeting outputting
Racket code if you make a language. Though you could always take Racket as
your AST and target something like C or LLVM if you want it to live outside of
Racket.

Racket is not necessarily unique in that aspect; it just focuses on making it
easier with tooling. Scala, Kotlin, and Elixir are examples of some of the
most successful languages that take a similar approach:
[https://www.youtube.com/watch?v=du6qWa8lWZA](https://www.youtube.com/watch?v=du6qWa8lWZA)

~~~
jarcane
Racket has parser generators available. See for instance ragg, or its fork,
the brag parser used in the Beautiful Racket book:
[https://docs.racket-lang.org/brag/index.html](https://docs.racket-lang.org/brag/index.html)

The technique used in the book is to use brag to describe the grammar and
generate a parser/tokenizer to s-expressions; then you can either interpret or
compile them directly, or use the Racket macro system itself to expand them
into Racket.

And of course, Racket's macro system itself can do quite a lot without even a
parser generator. The standard library even includes an example ALGOL 60
implementation done as Racket macros, and examples of C and Java can be found
on the package server.

------
mruts
At this point, I think that it’s a bad idea to use things like yacc or lex for
parsing and tokenization. Parser combinators are available in many languages
and are much easier to use and much more expressive. Of course, if you need
insane performance, a hand-written recursive descent parser is probably the
best option.
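To make the claim concrete, here is a minimal parser-combinator sketch (in Python rather than a typed functional language, and with invented names): each parser is just a function from `(text, pos)` to `(value, new_pos)` or `None`, and combinators are higher-order functions that glue parsers together.

```python
# Minimal parser-combinator sketch. A parser is a function
# (text, pos) -> (value, new_pos) on success, or None on failure.

def lit(s):
    """Match the literal string `s` at the current position."""
    def p(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return p

def alt(*parsers):
    """Ordered choice: try each parser at the same position."""
    def p(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None
    return p

def seq(*parsers):
    """Run parsers one after another, collecting their results."""
    def p(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return p

digit = alt(*[lit(c) for c in "0123456789"])
number = seq(digit, digit)  # exactly two digits, kept short for the example
assert number("42", 0) == (["4", "2"], 2)
assert number("4x", 0) is None
```

The whole "library" is three small higher-order functions, which is the expressiveness argument in a nutshell; languages without first-class functions push you toward external generators instead.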

The only reason people used parser generators in the past is because the
languages they were using (mostly C I suppose) weren’t expressive enough to
support higher order functions and higher kinded types.

~~~
Darmani
I'm a Ph.D. student in programming languages. I've written at least 7
language frontends over the years, read many more, and have given a talk on
parsing algorithms. My primary programming language is Haskell.

I really don't like parser combinators.

I've found that, with parser generators, bugs in the parser are very rare.
When they exist, you can find them just by carefully inspecting your grammar.

With parser combinators, a misplaced "try" can give you a parser that looks
correct and usually works, but some odd placement of tokens makes it fail to
parse something it should.

I agree that yacc is not very good, but it's hardly state of the art. It's a
LALR generator, and hence very restricted.

Instead, you can use:

(1) ANTLR, which has a huge ecosystem, produces very fast parsers, and can
take in an almost-arbitrary context-free grammar. The ALL(*) algorithm makes
it far superior to Yacc.

(2) A GLR parser generator, which can take in an actually arbitrary grammar,
and is usually very fast. The one I have experience with is Semantic Designs'
DMS. Michael Mehlich of Semantic Designs has produced a C++ parser that can
accurately parse all the major C++ dialects (MS, GCC, Borland, etc.), all
using general machinery. The only other group that's come close is the Edison
Design Group, which has a huge hand-written C++ parser and has 5 people
working full-time on it.

~~~
anaphor
What about PEGs? You can get the benefits of writing down a grammar that you
can analyze yourself, and you can call methods on the production results, etc.

~~~
ufo
PEGs have the same downsides that parser combinators have. If you put things
in the wrong order you can have programs that work most of the time but do the
wrong thing in a corner case or on a parsing ambiguity.
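A tiny sketch of that pitfall (invented helper names, not any real PEG library): because PEG ordered choice commits to the first alternative that succeeds, listing `"a"` before `"ab"` silently shadows the longer alternative, and the bug only shows up on inputs that need it.

```python
# The classic PEG ordered-choice pitfall: "a" / "ab" never matches "ab".

def lit(s):
    """Match the literal string `s` at the current position."""
    def p(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return p

def choice(*parsers):
    """PEG ordered choice: first alternative that succeeds wins,
    and the others are never reconsidered."""
    def p(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None
    return p

def full(parser, text):
    """Succeed only if the parser consumes the whole input."""
    result = parser(text, 0)
    return result is not None and result[1] == len(text)

bad  = choice(lit("a"), lit("ab"))   # "a" shadows "ab"
good = choice(lit("ab"), lit("a"))   # longer alternative first

assert full(bad, "a")       # works most of the time...
assert not full(bad, "ab")  # ..."a" matches, "b" is left over, parse fails
assert full(good, "ab")     # reordering the alternatives fixes it
```

A CFG-based generator would either accept both orderings or report the ambiguity up front, which is the trade-off being discussed above.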

