
Let's make a Teeny Tiny compiler - whack
http://web.eecs.utk.edu/~azh/blog/teenytinycompiler1.html
======
cloogshicer
If you want to dive deeper into this topic, I also recommend the wonderful
book Crafting Interpreters by Bob Nystrom. It can even be read for free
online: [http://craftinginterpreters.com/](http://craftinginterpreters.com/)

I'm not affiliated with it, but I devoured it recently and thought it was
amazing.

~~~
azhenley
Author of Teeny Tiny here! I really like Crafting Interpreters too (I link to
it at the bottom). Also highly recommend the book Writing an Interpreter in
Go.

~~~
kqr2
Link to _Writing an Interpreter in Go_ website:

[https://interpreterbook.com/](https://interpreterbook.com/)

------
melling
I’ve got a bunch of links about compilers in a GitHub repo:

[https://github.com/melling/ComputerLanguages/blob/master/com...](https://github.com/melling/ComputerLanguages/blob/master/compilers.md)

~~~
cellularmitosis
Nice! I have been collecting a list as well :)
[https://gist.github.com/cellularmitosis/1f55f9679f064bcff029...](https://gist.github.com/cellularmitosis/1f55f9679f064bcff02905acb44ca510#simple-
compiler-codebases-to-study)

------
boudewijnrempt
What I always admired about the ZX-Spectrum's Basic was that it skipped the
entire idea of a lexer, because, well, the keyboard let you input tokens
directly.

~~~
Lerc
I think it was closer to the other way around. The keyboard let you input
tokens because it didn't have a lexer.

I suspect this is more due to the memory cost of storing a literal line of
code more than the ROM or CPU cost of a lexer. Going back to the ZX-80 storing
even the current line being typed as characters was probably a burden.

~~~
tachyonbeam
AFAIK the keyboard was also just very cheap and a pain to type on.

~~~
webreac
I don't remember complaining about keyboard. The only issue was loosing memory
because the memory expansion (16ko) had bad contacts.

~~~
thom
Ha, my dad tells me we had a strict rule of no running in the house while he
was working on the Spectrum for this exact reason.

------
lixtra
I wonder if he left out the

    
    
        LET nums = nums - 1
    

on purpose.

------
kragen
Unfortunately this is not yet a compiler; it's a lexer and a parser.
Traditionally these are the first things you present in a compilers course,
but from my point of view, that's kind of dumb, for two reasons.

First, a parser isn't a compiler, so this means you get through the whole
first part of the course without having a compiler. By contrast, a code
generator _is_ a compiler, at least if there's _some_ way to invoke it.

Second, although the theory of formal languages and parsers is complex, well-
developed, and fascinating, it's not clear that it's _important_ to building a
compiler. People have written large systems in FORTH. MUMPS famously ran a
number of hospital systems without being able to break a statement across
lines. And a complete scannerless BNF grammar for S-expressions (without
readmacros) is something like this:

    
    
        <sexp> ::= <idchar>+ " "* | "(" " "* <sexp>* ")" " "*
    

And _many, many_ huge systems have been written in Lisps whose grammars amount
to little more than that.

My favorite grammars are PEGs, which accommodate scannerless parsing rather
better than LL(1) and LALR(1) parsers, because PEGs have infinite lookahead
with worst-case linear-time parsing (at the expense, it must be admitted, of
being a huge memory hog). PEGs are also composable in a way that LALR and LL
grammars aren't. I wrote a one-page PEG parser generator targeting JS at
[https://github.com/kragen/peg-
bootstrap/blob/master/peg.md](https://github.com/kragen/peg-
bootstrap/blob/master/peg.md) a few years back.

But I think an excessive focus on the _syntax_ of a programming language
really detracts from what's really revolutionary about programming languages
(and compilers), which is their _semantics_. We have much better theories of
semantics now than we had in the 1970s when the traditional compilers course
was being laid out. Books like _Essentials of Programming Languages_ can be
wonderful introductions to the fuzzier ones, and there's a lot of work in
things like Coq and Idris to come up with tractable formalizations. But you
don't need much of a theory of semantics to get a simple compiler up and
running!

But I haven't ever actually taught a compilers course, so maybe my ideas are
miscalibrated about what students would enjoy and find motivating (having a
working compiler for a simple language after doing the first problem set) and
what students will find difficult out of proportion to any rewards it might
bring (getting recursive-descent parsers for complex grammars working,
debugging precedence rules, refactoring grammars to eliminate left recursion,
encountering unexpected exponential-time parse failures, etc.).

~~~
abhayb
CMU's Compilers course (15-411)[1] does a lot of what you want. Students write
code generation code in the very first assignment. A particular delight of
this approach is that the first assignment is for compiling straight line code
while one of the later ones is for optimization. If you implement SSA form for
the optimization assignment, all of the test cases from the first assignment
compile down to effectively one precomputed result. Very clear marker for how
far you've progressed!

[1]
[https://www.cs.cmu.edu/~janh/courses/411/16/](https://www.cs.cmu.edu/~janh/courses/411/16/)

~~~
kragen
Thanks! This looks really cool! There are a number of courses coming out with
this kind of philosophy. I really like what I've seen of the nanopass
framework, for example.

------
signaru
Thankful that is a dialect of BASIC! The other great references mentioned here
parse C-like languages. Although I can try figuring out, I'm interested to see
a good tutorial that does parse BASIC syntax. Also interested to see how it is
done in Python (whereas many use C/C++ or Java which have many features that
make things a little easier). Looking forward to seeing this series finished.

------
bluedays
I really like this.

Python is the first language I learned. So it's like this is being written in
my native tongue.

I've read crafting interpreters, numerous times both before and after learning
some Java basics, but seeing it in Python made it click.

Also love the way you used classes. That helped clarify some confusion I had
with classes in Python.

~~~
yardshop
I'm working through Crafting Interpreters right now, converting the Java to
Python as I go. Most things are fairly straight forward to convert, but there
are definitely some topics that I've had to learn about in Java (not a
language I know very well) so I could figure out how to do it in Python.
Really great book and great learning project.

------
mrlonglong
REPEAT seems superfluous as only ENDWHILE is needed to end the loop. And, oh
yes who forgot to decrement nums?

Edit: s/N/nums

------
ngcc_hk
Python!? Anyway wait for part 2 and 3.

~~~
azhenley
I released Part 2 yesterday! (I had forgot to update the link, which is now
fixed.)

[http://web.eecs.utk.edu/~azh/blog/teenytinycompiler2.html](http://web.eecs.utk.edu/~azh/blog/teenytinycompiler2.html)

------
dntbnmpls
Not a fan of people sneaking referral links here. You could have easily
recommended the book without the referral link and if you were going to link
it, be open about it being a referral link.

~~~
azhenley
Didn’t mean to, sorry. I just copied and pasted it from the blog post (which
is labeled). I’ve removed it.

~~~
bogomipz
You don't need to apologize. Their comment is uncalled for and just seems
mean-spirited. I thoroughly enjoyed your article and am looking forward to
future posts. I hope you post them here when they're up! Cheers.

~~~
dntbnmpls
It wasn't mean spirited and it was justified. If people are using link
shorteners to hide the fact that their link is a referral, then it invites
everyone to sneak referral links in their comments here. Do you really want
that?

He removed that link from his comment and he apologized because it was the
right thing to do. Also, virtue signaling to encourage bad behavior isn't
right, no matter how good or superior you think it makes you feel. Especially
considering you weren't smart enough to realize that I wasn't criticizing the
referrals in his article ( valid ), I was criticizing the shortened link in
his comment which was a referral.

~~~
bogomipz
Yeah, no. In civil discourse you give people the benefit of the doubt that
they weren't doing something for nefarious purposes but that they maybe made
an honest mistake. In this case it was simply a copy paste error which they
pointed out. They apologized because they were being civil and polite not out
of some admission of guilt. Somehow you were the one that missed all of this.
I think you can see what others thought of your tone as well. Both of your
comments here come across as very angry.

~~~
dntbnmpls
> In this case it was simply a copy paste error which they pointed out.

Right and life went on before you decided to play "hero".

> They apologized because they were being civil and polite not out of some
> admission of guilt.

But you told him not to apologize. So that means you told him not to be civil?
Also, it's not polite to speak for others as it isn't part of civil discourse.

> I think you can see what others thought of your tone as well.

Others? Just you. And it's rather childish to "appeal to downvotes".

> Both of your comments here come across as very angry.

"Yeah, no."

Listen, the only one causing drama here is you. I made one innocuous and
helpful comment. The guy responded and everything was settled. The
conversation had nothing to do with you and yet here you are.

------
userbinator
There seems to have been an explosion of such articles within the past few
years, despite things like Crenshaw's tutorial existing for several decades; I
wonder why a large number of people have suddenly started thinking "I
understand this, I should write an article about it".

My favourite demonstration of a "teeny tiny compiler" is still C4:
[https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)
Perhaps it inspired all these compiler articles?

~~~
staycoolboy
<I deleted this because it wasn't nice>

~~~
azhenley
Actually, my students were having trouble following other tutorials so I wrote
this for them. (I also just love talking about compilers!)

I’d summarize my tutorial as a shorter version of Crenshaw’s.

I love seeing more and more compiler tutorials. They all have different goals
and perspectives.

~~~
staycoolboy
Sorry, my comment wasn't in the HN spirit.

