
Let’s Build a Compiler (1995) - _virtu
http://compilers.iecc.com/crenshaw/
======
sigil
If you're looking for a modern compilers class -- including the theory of why
this stuff works -- I highly recommend Matt Might's [0]. All of the notes,
slides, and code are online.

I audited Might's "Compilers" this spring. He live-coded a parser that parsed
with derivatives, returning all possible parse trees whenever there were
ambiguities in the grammar. [1] (Try getting that from yacc, or basically any
other tool in existence right now.)

All of his coding was done in Racket scheme. At the beginning he told us we
could use whatever for the projects, but doing certain things in C++ / Java /
Python / some imperative language was "like bringing a knife to a gun-fight."

The final project was a working Python -> C translator.

Really badass class.

[0]
[http://matt.might.net/teaching/compilers/spring-2013/](http://matt.might.net/teaching/compilers/spring-2013/)

[1] [http://matt.might.net/articles/parsing-with-
derivatives/](http://matt.might.net/articles/parsing-with-derivatives/)

~~~
haberman
> returning all possible parse trees whenever there were ambiguities in the
> grammar. [1] (Try getting that from yacc, or basically any other tool in
> existence right now.)

Any GLR-based tool can do that trivially, including recent versions of Bison
(derived from yacc): [http://www.gnu.org/software/bison/manual/html_node/GLR-
Parse...](http://www.gnu.org/software/bison/manual/html_node/GLR-Parsers.html)

Ultimately I don't think this is the best way to develop parsers, particularly
when you are designing the language itself (as opposed to writing a grammar
for an existing language), because it gives you no hint that ambiguity exists
until you actually encounter an ambiguous string (since the question of
whether a grammar is ambiguous is undecidable).

I wrote more about this on my blog: [http://blog.reverberate.org/2013/09/ll-
and-lr-in-context-why...](http://blog.reverberate.org/2013/09/ll-and-lr-in-
context-why-parsing-tools.html)

------
snoonan
I went through this on my own back in a HS programming class! Glad to see it
here.

While the teacher was walking students through how to do loops, I got
permission to hack away in the back of the room on this. I ended up building a
BASIC-like interpreter with a decent graphics API. By the end of the class, my
project was a multi-level breakout game I'd written in the interpreter I'd
written. TLDR; two years later, I talked to a girl.

~~~
simcop2387
You're lucky, I was doing the same things in middle school. Took me until
college for that last part... :)

That same game project I started then still has been going on since then,
multiple compilers and VMs written to do parts of the game. It's ended up an
incredibly complex monstrosity with procedurally generated worlds and randomly
altered scripts (some actions just won't be available on some play-throughs).

~~~
xerophtye
Is the game (or its code) online somewhere? :P

~~~
simcop2387
Not yet, there's been many incarnation, but none of them ever reached a
playable state. I've been on a kick to restart it lately after starting to
learn prolog. I've been playing with some pre-made engines to find out if any
of them can accelerate the whole thing a bit. Checking out Unity right now.

------
groovy2shoes
This series is one of the best introductions to compiler construction. It
doesn't cover everything and it's 25 years old now, but it is the only guide I
know of that will hold your hand as you build a working compiler from scratch.

If you have never built a compiler before, I _cannot_ think of a better place
to start.

Afterward, if you're curious about theory and advanced topics, I recommend
heading to _Compilers: Principles, Techniques, and Tools_ by Aho, Sethi, and
Ullman (which covers a lot of theory associated with front-ends) then
proceeding to _Modern Compiler Construction in ML_ by Appel (which covers some
more advanced topics and back-end stuff). Then you can continue reading about
more specific/advanced topics if you like.

~~~
frigg
There is a class about compilers on coursera [1] which should be pretty good
and more up to date.

[1]
[https://www.coursera.org/course/compilers](https://www.coursera.org/course/compilers)

~~~
dochtman
Seconded. My experience:

Taking this course was an great way to learn more about compilers and fill a
hole in my CS curriculum. Professor Alex Aiken is a great instructor and
covers a good amount of material. I learned a lot about compiler construction
despite having toyed with my own compiler before starting the course. The
programming assignments were particularly tough, giving me useful experience
in building compilers and a great sense of achievement.

(TL;DR from my full blog post:
[http://dirkjan.ochtman.nl/writing/2012/07/21/compilers-on-
co...](http://dirkjan.ochtman.nl/writing/2012/07/21/compilers-on-
coursera.html))

------
petercooper
I followed this in the early 90s and had a lovely time. It helped that Turbo
Pascal was my language of choice though and might not be _quite_ so helpful
now although Pascal is a pretty good pseudocode..

It's not the same but Vidar Hokstad has been writing a series for several
years now in Ruby:
[http://www.hokstad.com/compiler/](http://www.hokstad.com/compiler/) .. and
other resources aplenty:
[http://stackoverflow.com/a/1672/3951](http://stackoverflow.com/a/1672/3951)

~~~
derleth
> Pascal is a pretty good pseudocode..

Pascal is pretty much directly derived from early dialects of Algol, which is
where most modern programming language syntaxes derive; it's the original
block-structured syntax, as opposed to line-structured assembly and early
FORTRAN and the fully-bracketed Lisp syntax (which is also block-structured if
you indent it sanely).

~~~
vidarh
And the reason Pascal is pretty much directly derived from Algol is that Wirth
was on the Algol committee, and kept trying to get it simplified and tightened
up. He then eventually went on to do Euler and Pascal (and of course later
Modula and Oberon) to implement the ideas he'd had for Algol.

This is quite interesting: Pascal and its Successors - Niklaus Wirth
[http://www.swissdelphicenter.ch/en/niklauswirth.php](http://www.swissdelphicenter.ch/en/niklauswirth.php)

------
aryastark
This is definitely the classic online text. However, I'm surprised no one
mentioned _An Incremental Approach to Compiler Construction_
([http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf](http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf)).
It is the basis of the Ikarus Scheme compiler.

Lisp in Small Pieces is also a useful book, for those interested in
Lisp/Scheme. It covers much of the same stuff as in the PDF I mentioned.

~~~
pjscott
I was about to mention _An Incremental Approach to Compiler Construction_ ,
but you've just saved me the trouble of digging up the link. It's a mere 11
pages, but it covers a lot of ground, in small, easy-to-digest pieces, and
invites you to follow along and get your hands dirty. Highly recommended.

------
ericbb
If you like Pascal, then Compiler Construction, by Niklaus Wirth is certainly
also worth a look:

[http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf](http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf)

Easy to read, concise, and good for beginners.

------
fusiongyro
Email me if you'd like an HTML version of this. I have converted most of it
for my own personal use (through part 11, IIRC) but don't want to distribute
it publicly since I'm unclear on the copyright status.

~~~
scott_s
Isn't contacting the author the natural thing to do? Some googling found a
user profile for him:
[http://www.embedded.com/user/JackCrens](http://www.embedded.com/user/JackCrens)

~~~
fusiongyro
He's cool with it. Thanks for digging that up.

------
zamalek
Personally this is my favorite work on compiler theory. While dragon book is
definitely more comprehensive, Let's Build a Compiler is far more approachable
(so approachable that I worked through it when I was only 14).

I haven't gone to the bare metal level since (as well as using parser
generators), but it's a great piece of work that gives you a slight
understanding of what YACC+family do under the covers (even though they are
different types of parsers). I continually recommend it as a starting point
for anyone who wants to learn how to write parsers.

------
gozzoo
I don't know if this is the same thing, but it looks like it:

pdf -
[http://www.penguin.cz/~radek/book/lets_build_a_compiler.pdf](http://www.penguin.cz/~radek/book/lets_build_a_compiler.pdf)

html - [http://www.penguin.cz/~radek/book/lets-build-a-
compiler/](http://www.penguin.cz/~radek/book/lets-build-a-compiler/)

------
smoyer
I remember reading this when he first published it (as well as everything he
ever wrote for "Embedded Systems Programming"). Jack Crenshaw is my favorite
Rocket (Computer) Scientist!

------
AlexanderDhoore
Reading the text files, using vim, which were written in the 80s and talk
about Borland and other old stuff, makes me... a bit sentimental.

~~~
segmondy
Talking about sentimental, This made me giggle, "there will probably never
come a time when a good assembler-language programmer can't out-program a
compiler."

Reading that reminded me why I'll never make predictions on computing,
especially on what can't be done.

~~~
vidarh
A good assembler-language programmer these days starts with the output of a
compiler. And then proceeds to beat the pants off said compiler by optimizing
specific portions.

Not least because no language includes sufficient semantic information for the
compiler to be able to safely optimize all the parts that the programmer
_can_.

It's still mostly true - just not worth the effort for anything but smaller
fragments.

~~~
nocman
Yeah, I was going to say something similar. You'll never get a compiler to be
as intelligent as a human being. At best you'll get human intelligence rapidly
applied by a computer to a problem. The computer will be able to do this
"manual labor" (if you want to call it that) faster than a human can. However,
a human being will be able to find ways to optimize for the problem that a
computer will not be able to.

I realize that many people believe that computers will some day be able to
truly think on a human level. I just don't happen to be one of those people.

------
monkey_slap
Would there happen to be a PDF or eBook format of this?

~~~
Jtsummers
PDF:
[http://compilers.iecc.com/crenshaw/tutorfinal.pdf](http://compilers.iecc.com/crenshaw/tutorfinal.pdf)

At the bottom of the page.

~~~
monkey_slap
Thanks!

------
statictype
I read this in high school. At some point a light bulb went off and I finally
got recursive descent parsers. Great series.

