
A programming language from scratch: 3 steps to an interpreter - FrancisStokes
https://francisstokes.wordpress.com/2017/08/16/programming-language-from-scratch/
======
justinpombrio
Shameless plug for the programming language course I'm TAing. It covers how to
write a similar interpreter, with more detail about evaluation and less about
parsing. The book and all of the assignments are online.

[http://cs.brown.edu/courses/cs173/2016/index.html](http://cs.brown.edu/courses/cs173/2016/index.html)

[http://cs.brown.edu/courses/cs173/2016/assignments/programmi...](http://cs.brown.edu/courses/cs173/2016/assignments/programming/interp.html)

[http://cs.brown.edu/courses/cs173/2016/assignments.html](http://cs.brown.edu/courses/cs173/2016/assignments.html)

The "interpreter" assignment has a similar end-goal as this tutorial.

Many of the other assignments (the ones prefixed "ML-") are "mystery
languages", where you try to tell the difference between two programming
languages that have the same syntax but different semantics. It's a fun way to
explore the consequences of language design, I think.

Next year we might do a MOOC.

~~~
Twisol
I can vouch for the mystery language assignments at least -- we worked through
several at this year's Racket summer school ([https://summer-school.racket-
lang.org/2017/](https://summer-school.racket-lang.org/2017/)), and they were
interesting and very engaging. (Hi Justin!)

~~~
totalperspectiv
This course looks awesome! Do you (or those scrolling by) know of any other
courses like this? By this I mean short but focused burst of learning done in
person.

~~~
Twisol
Yes, there seem to be a number of summer schools every year on a variety of
topics, although the ones I'm familiar with are based more in academia than in
industry. I think they tend to get announced on the Types mailing list
([https://lists.seas.upenn.edu/mailman/listinfo/types-
announce](https://lists.seas.upenn.edu/mailman/listinfo/types-announce)).

Here are a few that I'm aware of:

* Racket summer school: [https://summer-school.racket-lang.org/2017/](https://summer-school.racket-lang.org/2017/)

* Oregon Programming Languages Summer School: [https://www.cs.uoregon.edu/research/summerschool/summer17/](https://www.cs.uoregon.edu/research/summerschool/summer17/)

* DeepSpec summer school: [https://deepspec.org/event/dsss17/](https://deepspec.org/event/dsss17/)

------
userbinator
IMHO Lisp-like languages are good for showing off the "back-end" of a
compiler/interpreter, because tokenisation/parsing is completely trivial. On
the other hand, you miss out a lot of the fun of parsing a "real" language,
with precedences and a more nontrivial grammar.

One of the most enlightening points about interpreters/compilers is that, to a
first approximation, the only difference between an interpreter and compiler
is what it does when it walks the tree --- an interpreter executes the actions
immediately, while a compiler generates code which executes those actions.

An amazing demonstration of this is at
[https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)
and
[https://news.ycombinator.com/item?id=8746054](https://news.ycombinator.com/item?id=8746054)
. It doesn't actually build an AST, but implicitly creates one and walks it
since it does recursive-descent parsing.

(Aside: I happened to glance at the URL at the same time as I was reading the
Tokenisation section, and thought "Francis Tokes? How appropriate..." before
looking more closely at the page title.)

~~~
FrancisStokes
Hey - author here. You're completely right, but I hope the basic understanding
will whet peoples appetites to try some more exotic stuff! Really I was just
trying to lift the veil of what can be a quite intimidating subject if you
haven't grazed past it before. Thanks for checking it out!

~~~
jeffhuys
Hey Francis, I just wanted to take a moment to tell you that I really enjoyed
reading your article. Good stuff!

~~~
FrancisStokes
Thanks! I really appreciate you saying so, and I'm glad you enjoyed it

------
atemerev
If you have an interpreter, you can obtain a native executable by attaching
the interpreter code to the source file, and then removing all the unnecessary
parts (optimizing). This is called the first Futamura projection.

Discovering the nature of second and third Futamura projections is left as an
exercise to the reader ;)

~~~
Sohcahtoa82
> This is called the first Futamura projection.

I read that as "Futurama" at first and was confused trying to figure out how
that TV show was related to interpreters and compilers.

------
tzs
That looks like fun. Here's a series of HN comments that outline a similar
thing, but going FORTH-like instead of Lisp-like [1].

I suppose Lisp and FORTH are the most common for these kind of projects, at
least that I have seen, because of the simple syntax.

APL has a pretty simple syntax, but I have not seen anyone choose it for one
of these build your own language from scratch projects. I wonder why not?

[1]
[https://news.ycombinator.com/item?id=13082825](https://news.ycombinator.com/item?id=13082825)

~~~
agumonkey
They saw Whitney (J author) threaded interpreter in C and they ran away.

[http://code.jsoftware.com/wiki/Essays/Incunabulum](http://code.jsoftware.com/wiki/Essays/Incunabulum)

------
brudgers
A couple of other resources that come to my mind are,

The somewhat more in depth _Beautiful Racket_ ,
[http://beautifulracket.com/](http://beautifulracket.com/)

and the significantly more in depth _Structure and Interpretation of Computer
Programs_ , [https://mitpress.mit.edu/sicp/full-text/book/book-
Z-H-4.html...](https://mitpress.mit.edu/sicp/full-text/book/book-
Z-H-4.html#%_toc_start)

------
zerr
I wish such content existed for ACID compliant databases.

~~~
gedrap
Yes! I was looking for something similar to create a basic database in order
to really understand the underlying concepts, but couldn't really find
anything.

~~~
overcast
Sounds like a good project!

------
linopolus
Am I completely misunderstanding something here or should (* 3 3 3) be
actually 27 instead of 9?

Anyway, nice article, well illustrated, love to read stuff like this!

~~~
FrancisStokes
Oh my god what a glaring mistake! Thank's for pointing it out, I've corrected
the article.

------
c0achmcguirk
Using a Lisp-like language is great for creating a new language because of the
S-expressions that Lisp depends on.

S-expressions are very compact and are easy to parse. There was actually some
support for using S-expressions over XML and JSON as a serialization
language.[1]

Ron Rivest even proposed S-expressions as a standard.[2]

I still prefer JSON over them though. As the old joke goes, LISP stands for
"Lots of Irritating Silly Parentheses."

1 - [http://eli.thegreenplace.net/2012/03/04/some-thoughts-on-
jso...](http://eli.thegreenplace.net/2012/03/04/some-thoughts-on-json-vs-s-
expressions)

2 -
[http://people.csail.mit.edu/rivest/Sexp.txt](http://people.csail.mit.edu/rivest/Sexp.txt)

------
jwdunne
After grokking the tokenisation, parsing and eval of lisp like languages,
what's the next step up?

I usually have a bit of fun mocking out a language I'd like but end up getting
into details like type systems. I never really get to expressing as a CFG.

I'm wondering if the best course is to say right, keys now build a small,
procedural language or some kind of barely typed C lookalike as the best step
up?

~~~
mietek
Why not give Pierce’s “Types and programming languages” a spin?

~~~
cardiffspaceman
I agree, for sufficiently large values of step. This book could change your
life.

But I think the C4 self-hosted C compiler (discussed at [1]) might be another
next step in a different direction.

[1]
[https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)

~~~
jwdunne
This intuitively makes a lot of sense. The short variable names with the
legend at the top makes it a little harder to understand but you can see
what's going on from a macroscopic view. Perhaps translating this to another
language with clearer variable names would be a good exercise for me.

------
novakinblood
quote from article: "As a side note: if Lel ever becomes popular enough that
people argue about tabs vs spaces, or if function bodies belong on a newline
always, that will absolutely make my life."

That unexpectedly made me LOL.

