
Want to Write a Compiler? Read These Two Papers (2008) - rspivak
http://prog21.dadgum.com/30.html
======
journeeman
'Compiler Construction' by Niklaus Wirth[1] is a pretty good book too. It's
got the hands-on feel of Crenshaw's book with a quick, but not too
superficial, introduction to theory. The book is only a little more than 100
pages long.

The compiler is written in Oberon, however, which is also the source language
(or rather Oberon-0, a subset of it), but Oberon is super simple and can be
learned on the go.

[1]
[https://www.inf.ethz.ch/personal/wirth/CompilerConstruction/...](https://www.inf.ethz.ch/personal/wirth/CompilerConstruction/index.html)

~~~
gnuvince
I wish someone would take that content and give it proper typesetting; the
content is quite good and accessible, but the presentation makes reading quite
unpleasant.

------
d0m
I agree. I remember my compiler class, where we went through the whole Dragon
book. We came away with a lot of theoretical knowledge, but we were
disappointed because, in practice, we couldn't build a compiler. We went into
too much detail too fast, without gaining a good understanding or seeing
practical examples of real compilers.

~~~
santaclaus
> the whole Dragon book

I learned compilers from Al Aho, and believe me, the class was no better in
person: mostly waxing poetic about his time at Bell Labs...

~~~
dgacmu
Dinner with Al Aho was one of the highlights of when I visited Columbia a
while ago. He's simply an awesome person, and he's done some fantastic work:
Aho-Corasick is a great algorithm. Alas, the Dragon Book was one of the low
points of my undergraduate education, though I had a great teacher who pulled
off a terrifically fun class despite the book. (Thanks, +Wilson Hsieh.)

In fairness, though -- or more an admission of how wrong we can be about
things like this -- I subscribed to the "parsing is boring and solved" belief
until Bryan Ford's Packrat Parsing work made me realize that it had just
gotten stuck at a traffic light for a few decades.

~~~
mdcox
Veering a bit off-topic, but does anyone have any pointers to important
recent work on parsing? There are a lot of papers out there and I guess I
don't know how to sift through them. I've heard the "parsing is solved" line
before, but so much of my time is spent doing some kind of parsing that even
incremental improvements are extremely interesting.

~~~
sklogic
PEG (Packrat) is hot now.
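
The core trick is just memoised recursive descent. A minimal sketch (in
Python; the toy grammar and all the names here are mine, purely for
illustration):

    # Packrat = recursive descent + a memo table keyed by (rule, position),
    # which makes a backtracking PEG parser run in linear time.
    # Toy grammar: Total <- Num ('+' Num)* ; Num <- [0-9]+
    def parse(text):
        memo = {}

        def apply(name, fn, pos):
            key = (name, pos)
            if key not in memo:
                memo[key] = fn(pos)        # (value, next_pos), or None on failure
            return memo[key]

        def num(pos):
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            return (int(text[pos:end]), end) if end > pos else None

        def total(pos):
            r = apply('num', num, pos)
            if r is None:
                return None
            value, pos = r
            while pos < len(text) and text[pos] == '+':
                r = apply('num', num, pos + 1)
                if r is None:
                    return None
                n, pos = r
                value += n
            return (value, pos)

        return apply('total', total, 0)

    print(parse('1+20+3'))                 # (24, 6)

This toy grammar never actually backtracks, so the memo saves nothing here;
it starts to matter once ordered choice retries alternatives at the same
position.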

~~~
nly
Interesting paper on left recursion in recursive descent parsers such as PEG
and packrat parsers:

[http://web.cs.ucla.edu/~todd/research/pub.php?id=pepm08](http://web.cs.ucla.edu/~todd/research/pub.php?id=pepm08)
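
If I remember the paper right, the approach is "seed growing": plant a
failure in the memo table for (rule, pos), evaluate the rule body, and keep
re-evaluating while each round consumes more input. A rough sketch
(illustrative Python; the grammar is mine, and this handles direct but not
indirect left recursion):

    # Grammar: Expr <- Expr '-' Num / Num   (left-associative, as intended)
    def parse(text):
        memo = {}

        def num(pos):
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            return (int(text[pos:end]), end) if end > pos else None

        def expr_body(pos):
            r = expr(pos)                  # left-recursive call hits the memo
            if r is not None and r[1] < len(text) and text[r[1]] == '-':
                rhs = num(r[1] + 1)
                if rhs is not None:
                    return (r[0] - rhs[0], rhs[1])
            return num(pos)                # fall back to the second alternative

        def expr(pos):
            key = ('expr', pos)
            if key in memo:
                return memo[key]
            memo[key] = None               # the seed: a left-recursive call fails
            while True:
                result = expr_body(pos)
                if result is None or (memo[key] is not None
                                      and result[1] <= memo[key][1]):
                    return memo[key]       # stopped growing: done
                memo[key] = result

        return expr(0)

    print(parse('10-2-3'))                 # (5, 6): parsed as (10-2)-3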

~~~
ltratt
If you're interested in left recursion in PEGs then, at the risk of gross
immodesty, you may be interested in
[http://tratt.net/laurie/research/pubs/html/tratt__direct_lef...](http://tratt.net/laurie/research/pubs/html/tratt__direct_left_recursive_parsing_expression_grammars/)

With less risk of immodesty you may also find
[http://arxiv.org/pdf/1207.0443.pdf](http://arxiv.org/pdf/1207.0443.pdf)
interesting.

There's probably more recent work than these two papers, but I'm a little out
of date when it comes to the PEG world.

~~~
nly
Thanks, it looks like you've put a lot of work into it and I'll enjoy reading
it.

------
Animats
I was amused by _"The authors promote using dozens or hundreds of compiler
passes, each being as simple as possible."_

That's kind of a desperation move. There was a COBOL compiler for the IBM
1401 with about 79 passes.[1] They only had 4K or so of memory, but they had
fast tape drives, so each pass read the previous intermediate form, did some
processing on it, and wrote out the next intermediate form. Except for the
passes that did sorts.

[1] [http://bitsavers.informatik.uni-stuttgart.de/pdf/ibm/140x/C2...](http://bitsavers.informatik.uni-stuttgart.de/pdf/ibm/140x/C24-3146-3_1401_cobolOper.pdf)

~~~
sklogic
This is the only sane approach. Small passes are easy to write, easy to
understand and easy to reason about.

And if you worry about performance, your compiler can fuse passes together in
many cases.
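
By fusing I mean something like this sketch (illustrative Python; the two
toy passes are mine): express each pass as a node-local rewrite, then run all
of them in a single bottom-up traversal instead of one traversal per pass.

    # Each "pass" is a function from node to node; fusing them means the
    # tree is walked once, with every pass applied at each node.
    def fold_constants(node):
        if node[0] == '+' and node[1][0] == 'lit' and node[2][0] == 'lit':
            return ('lit', node[1][1] + node[2][1])
        return node

    def strength_reduce(node):
        if node[0] == '*' and node[2] == ('lit', 2):
            return ('+', node[1], node[1])     # x * 2  =>  x + x
        return node

    def run_fused(passes, node):
        if node[0] in ('+', '*'):              # rewrite children first
            node = (node[0],) + tuple(run_fused(passes, c) for c in node[1:])
        changed = True
        while changed:                         # apply passes here to a fixpoint
            changed = False
            for p in passes:
                new = p(node)
                if new != node:
                    node, changed = new, True
        return node

    ast = ('*', ('+', ('lit', 1), ('lit', 2)), ('lit', 2))
    print(run_fused([fold_constants, strength_reduce], ast))   # ('lit', 6)

(This only works when the passes are local enough; passes that need global
analysis can't be fused this naively.)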

~~~
vidarh
Have you tried this approach? If so I'm curious to hear about experiences.

My own experience, with far fewer stages, is that while it becomes easy to
understand what each stage does and how, it becomes hard to keep track of how
the stages interact, as each intermediate output in effect becomes a language
dialect of its own.

I'm still not decided on whether it's a net win or loss.

~~~
cottonseed
Yes. I co-founded a startup building optimizing compilers for high-performance
telecommunications processors in the early 2000s (acquired by Intel). In any
complex software system, building powerful, composable and modular
abstractions is critical for managing complexity. Having many composable
passes is exactly that. We were a small team (6 people), and a modular
architecture maximized our productivity and made it possible to compete with
teams many times our size.

We used a unified IR throughout the compiler so it had consistent semantics,
but the IR could support various canonical forms or constraints that were
managed explicitly by the compiler. Instruction formation involved three
steps: fragment selection by tree covering (think of a load with
post-increment or a SIMD instruction as multiple fragments), grouping of
fragments into instructions, and grouping of instructions into issue slots in
the scheduler (several of our targets were VLIW). In addition to constraints,
we had properties like SSA vs. multiple assignment, structured control flow
vs. basic blocks, certain kinds of cleanup, etc. A pass could declare its
requirements (SSA, structured control flow, pre-fragment) and the compiler
would check the properties, possibly running passes to create them (e.g. put
the IR in SSA form). Having separate cleanup passes meant transformations
didn't have to clean up after themselves, often making things vastly simpler.
Analysis passes (dataflow, alias analysis, various abstract interpretations)
were independent of the properties because the IR had unified semantics.

That said, it is true our compiler didn't have the fastest compile times
(multiple times slower than gcc on targets we both supported), but we were a
highly optimizing compiler and our generated code was significantly faster
than the competing alternatives (if there were any).
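
To make the "passes declare their requirements" part concrete, here is a
much-simplified sketch of the shape of that machinery (illustrative Python;
the names and the trivial IR are invented, not our actual code):

    class IR:
        def __init__(self):
            self.properties = set()            # e.g. {'ssa', 'structured-cf'}

    class ToSSA:
        requires = ()
        def run(self, ir):
            print('converting to SSA')         # real transformation elided
            ir.properties.add('ssa')
            return ir

    class DeadCodeElim:
        requires = ('ssa',)                    # declared requirement
        def run(self, ir):
            print('running DCE')               # real transformation elided
            return ir

    ESTABLISHERS = {'ssa': ToSSA()}            # property -> pass that creates it

    def run_pass(p, ir):
        for prop in p.requires:
            if prop not in ir.properties:      # establish missing properties first
                ir = run_pass(ESTABLISHERS[prop], ir)
        return p.run(ir)

    run_pass(DeadCodeElim(), IR())             # converts to SSA, then runs DCE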

~~~
vidarh
> In any complex software system, building powerful, composable and modular
> abstractions is critical for managing complexity.

Yes, but as Wirth showed as far back as the '70s, you don't even need an IR
in order to do this, much less separate passes.

For a highly optimizing compiler like yours the complexity might have been
unavoidable anyway, though. A lot of the simplicity of Wirth's compilers
comes from a long-held insistence that no optimization could be added to the
compiler unless it sped up the compilation of the compiler itself; in other
words, an optimization needed to be simple enough and cheap enough that
applying it to the compiler's own source code paid for itself. Needless to
say, this implicitly means that most Wirth compilers omit a lot of
optimizations that are usually included elsewhere, though some of his
students did implement some impressive "unofficial" optimizing variants of
his compilers that were also blazing fast.
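
For anyone who hasn't seen the Wirth style: the recursive-descent parser
emits code directly as it recognises each construct, so there is no tree to
build or traverse at all. A toy sketch (illustrative Python, targeting a
made-up stack machine):

    # Single-pass compilation: code generation interleaved with parsing.
    def compile_expr(text):
        out, pos = [], 0

        def factor():
            nonlocal pos
            out.append(('PUSH', int(text[pos])))   # emit on recognition
            pos += 1

        def expr():
            nonlocal pos
            factor()
            while pos < len(text) and text[pos] == '+':
                pos += 1
                factor()
                out.append(('ADD',))               # order falls out of the parse

        expr()
        return out

    print(compile_expr('1+2+3'))
    # [('PUSH', 1), ('PUSH', 2), ('ADD',), ('PUSH', 3), ('ADD',)]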

------
bonobo3000
If you're totally new to this, Peter Norvig's guide to a Scheme interpreter
in Python[1][2] is the best! It's short and sweet, and it gave me enough of
an idea of the basic parts to start piecing stuff together.

[1] [http://norvig.com/lispy.html](http://norvig.com/lispy.html) [2]
[http://norvig.com/lispy2.html](http://norvig.com/lispy2.html)

------
lmm
Post-2008 I'd really push
[http://www.hokstad.com/compiler](http://www.hokstad.com/compiler) . It
writes a compiler in the same way you'd write an ordinary program, and it was
the first explanation where I actually understood the rationale for the
choices being made.

~~~
vidarh
Thanks for the recommendation, though I'd give the caveat that I've been
rambling all over the place, especially in the later parts. I also have lots
of changes after the latest part that I need to write about...

I'd really like to get a chance to go over it retrospectively and tighten it
up. The trouble is that it's easily 10+ times as much effort to follow along
with a project this way and write about each change as it is to cover the
finished code (especially for thorny issues such as how to avoid wasting too
much time on bug fixes that may or may not have affected the understanding of
previous parts).

Personally, while I'm very happy that people find my series useful, I'd
recommend perhaps supplementing my series, or starting, with Niklaus Wirth's
books or Crenshaw's "Let's Build a Compiler" (one of the ones recommended in
the linked article) for something much more focused. I feel the angles are
sufficiently different that it's worth it. And Wirth is my big hero when it
comes to language and compiler construction.

While the other suggestion in the article is well worth a read, I'd give one
caveat: Lots of passes is _hard_ to keep mental track of. My compiler sort-of
does something similar in applying multiple transformation stages to the AST,
and I'm mulling collapsing it into fewer stages rather than more because I
feel it's getting more complex than necessary. You can work around some of the
debugging complexity by outputting each stage to a suitable format so you can
test each transformation individually, but it's still hard to keep track of
exactly what each stage expects and delivers.
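
The "suitable format" part can be as simple as serialising the tree after
every stage and diffing against checked-in expected output. A sketch of the
idea (illustrative Python; my compiler is Ruby, but the shape is the same):

    import json

    def run_stages(stages, ast, dumps=None):
        for stage in stages:
            ast = stage(ast)
            if dumps is not None:          # canonical, diffable form per stage
                dumps[stage.__name__] = json.dumps(ast, sort_keys=True)
        return ast

    # In tests: run a single stage on a checked-in input and compare its
    # dump against a checked-in expected output, isolated from the rest.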

~~~
lmm
> You can work around some of the debugging complexity by outputting each
> stage to a suitable format so you can test each transformation individually,
> but it's still hard to keep track of exactly what each stage expects and
> delivers.

Honestly this sounds like a good case for a type system - which of course
isn't available in ruby.

~~~
vidarh
Of course it's available in Ruby. You just don't get _ahead-of-time_
verification without lots of extra trouble (implementing it yourself). The
problem is not that you can't specify preconditions and use the type system
to enforce them.

The problem is that either you create specialized IRs for each stage, or a
lot of this is expressed through tree shape, and keeping mental track of that
is more painful than testing it.

------
toolslive
"Implementing functional languages: a tutorial" Simon Peyton Jones
[http://research.microsoft.com/en-
us/um/people/simonpj/Papers...](http://research.microsoft.com/en-
us/um/people/simonpj/Papers/pj-lester-book/)

Is very good and shows different strategies for the runtime.

------
emmanueloga_
On a related subject, two good resources for learning how to write
interpreters are [1] and [2]. The nice thing about approaching programming
language implementation with Scheme is that the whole subject of parsing
becomes unnecessary (the interpreters work by manipulating S-expressions
directly), or can at least be deferred until later.

1:
[http://cs.brown.edu/courses/cs173/2012/](http://cs.brown.edu/courses/cs173/2012/)
2: [http://www.eopl3.com/](http://www.eopl3.com/)
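
To see why parsing becomes a non-issue: the whole Scheme reader is a few
lines, and what it returns already _is_ the AST you interpret. A sketch
(illustrative Python, in the spirit of the lispy interpreter mentioned
upthread):

    def tokenize(src):
        return src.replace('(', ' ( ').replace(')', ' ) ').split()

    def read(tokens):
        tok = tokens.pop(0)
        if tok == '(':
            lst = []
            while tokens[0] != ')':
                lst.append(read(tokens))
            tokens.pop(0)                  # drop the closing ')'
            return lst
        return int(tok) if tok.lstrip('-').isdigit() else tok

    print(read(tokenize('(+ 1 (* 2 3))')))    # ['+', 1, ['*', 2, 3]]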

------
userbinator
There's an x86 version of Crenshaw's excellent tutorial here:
[http://www.pp4s.co.uk/main/tu-trans-comp-jc-intro.html](http://www.pp4s.co.uk/main/tu-trans-comp-jc-intro.html)

I'd definitely recommend it highly; the only way it could be better is if it
arrived at a compiler that could compile itself.

Another article which I think everyone attempting to write a compiler should
read is
[https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm](https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm)
- how to simply and efficiently parse expressions.

------
cbsmith
I haven't read the Nanopass Framework paper, but...

The idea of doing tons of compiler passes works well in a functional context,
where functional separation is the preferred method of abstraction and you
can use composition to effectively transform all your passes into a single
pass.

However, actually _doing_ that many passes seems like it is going to end
badly. Even if you don't touch disk, you're doing mean things to the memory
subsystem...

~~~
sklogic
It is much more memory-efficient than thrashing a single graph over and over
again. Rewriting and compacting trees is good for cache locality.

Also, for trivial rewrites (e.g., variable renaming) your compiler can safely
choose to replace a copying rewrite with a destructive update.
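
Concretely, the difference is something like this (illustrative Python):

    # Copying rewrite: build a fresh tree, leaving the input intact.
    def rename_copying(node, old, new):
        if node[0] == 'var':
            return ['var', new if node[1] == old else node[1]]
        return [node[0]] + [rename_copying(c, old, new) for c in node[1:]]

    # Destructive variant: when no other pass still holds a reference to
    # the tree, mutate in place and allocate nothing.
    def rename_in_place(node, old, new):
        if node[0] == 'var':
            if node[1] == old:
                node[1] = new
        else:
            for c in node[1:]:
                rename_in_place(c, old, new)

    tree = ['+', ['var', 'x'], ['var', 'y']]
    print(rename_copying(tree, 'x', 'tmp0'))   # new tree; `tree` unchanged
    rename_in_place(tree, 'x', 'tmp0')         # same rewrite, zero allocation
    print(tree)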

~~~
cbsmith
> It is much more memory-efficient than thrashing a single graph over and over
> again. Rewriting and compacting trees is good for cache locality.

To be clear: if you are thrashing a single graph over-and-over again, you are
doing many passes. Rewriting and compacting trees is just a way to make the
multiple passes a bit less painful.

~~~
sklogic
> if you are thrashing a single graph over-and-over again, you are doing many
> passes.

A single "pass" can be thrashing a graph many times too, think of the things
like instcombine (in LLVM parlance) or ADCE.

Essentially, compilation is rewriting of a source AST into the target machine
code. It is up to you how to structure this rewrite, but the amount of work to
be done remains the same.

For a human it is much easier to structure such a rewrite as a sequence of
very trivial changes. And a "sufficiently smart" compiler must realise that
it can be done in fewer steps.

~~~
vidarh
> Essentially, compilation is rewriting of a source AST into the target
> machine code.

While that is true of most modern compilers, do consider that there's the
whole Wirth school of compilers that don't use an AST at all but generate
machine code directly during parsing.

> For a human it is much easier to structure such a rewrite as a sequence of
> very trivial changes.

I'm not convinced, as I've written elsewhere, because while the changes may
be trivial, you need to keep track of what a program is expected to look like
between each stage, and that becomes harder the more stages you introduce.
E.g. my forever-in-progress Ruby compiler has a stage where it makes closures
explicit: it allocates an environment to store variables in and rewrites
variable accesses accordingly. For the following stages, the language I'm
working on is different: I can no longer use closures.
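
In sketch form, that stage does something like this (illustrative Python,
not my actual Ruby code): captured variables move into an explicit
environment, accesses become environment lookups, and every later stage has
to speak this new dialect.

    def free_vars(node):
        if node[0] == 'var':
            return {node[1]}
        if node[0] == 'lambda':
            return free_vars(node[2]) - set(node[1])
        return set().union(*[free_vars(c) for c in node[1:]])

    def convert(node, captured):
        if node[0] == 'var':
            if node[1] in captured:        # access becomes an env slot lookup
                return ('env-ref', captured.index(node[1]))
            return node
        if node[0] == 'lambda':
            params, body = node[1], node[2]
            free = sorted(free_vars(body) - set(params))
            # a closure pairs the lifted code with the values it captures
            return ('make-closure', ('lambda*', params, convert(body, free)),
                    tuple(convert(('var', v), captured) for v in free))
        return (node[0],) + tuple(convert(c, captured) for c in node[1:])

    prog = ('lambda', ('x',),
            ('lambda', ('y',), ('+', ('var', 'x'), ('var', 'y'))))
    print(convert(prog, []))
    # the inner lambda becomes ('make-closure', ..., (('var', 'x'),)) and
    # its body now reads x as ('env-ref', 0)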

The more stages you have, the more language dialects you have to mentally
keep separate when working on the stages. And if you decide you need to
reorder stages, you often have painful work adapting the rewrites to their
new dialects.

The end result looks very simple. But try to make changes and it's not
necessarily nearly as simple.

~~~
cbsmith
You can solve this by having one "language" that can represent all
intermediate stages. This has a nice side effect of allowing you to easily
change the ordering of passes if that turns out to produce better results.

~~~
sklogic
In my experience such languages are very dangerous and it is better to have a
chain of more restrictive languages instead. Most passes only make sense in a
fixed sequence anyway. LLVM is infamously broken in this regard, btw; there
are too many implicit pass dependencies.

E.g., there is no point in keeping a one-armed 'if' in a language after a
pass which lowers it into a two-armed 'if'. There is no point in keeping a
'lambda' node in your AST after lambda lifting is done.
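
The payoff is that illegal forms become unrepresentable, or at least
mechanically checkable. A sketch (illustrative Python, invented node names):

    LOWERED = {'if2', 'seq', 'lit', 'var', 'nop'}  # the next stage's whole language

    def lower_if(node):
        if node[0] == 'if1':               # (if1 c t)  ->  (if2 c t nop)
            return ('if2', lower_if(node[1]), lower_if(node[2]), ('nop',))
        if node[0] in ('if2', 'seq'):
            return (node[0],) + tuple(lower_if(c) for c in node[1:])
        return node

    def check(node):
        assert node[0] in LOWERED, f'illegal node after lowering: {node[0]}'
        if node[0] in ('if2', 'seq'):
            for c in node[1:]:
                check(c)

    lowered = lower_if(('if1', ('var', 'c'), ('seq', ('lit', 1))))
    check(lowered)    # passes; an unlowered tree would fail fast here
    print(lowered)    # ('if2', ('var', 'c'), ('seq', ('lit', 1)), ('nop',))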

------
PaulHoule
I like the nanopass concept; in particular, I am applying it to
domain-specific languages. It is not necessarily so inefficient if it is
optimized. For instance, data indexing and methods such as RETE networks can
make the cost of running an irrelevant pass close to nothing.
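
The indexing point, as a sketch (illustrative Python; a real system would
maintain the index incrementally across passes, which is roughly what
RETE-style networks buy you):

    # Index nodes by kind; a pass that declares the kinds it rewrites can
    # be skipped outright when none occur, so irrelevant passes cost ~zero.
    def build_index(node, index=None):
        index = {} if index is None else index
        index.setdefault(node[0], []).append(node)
        for child in node[1:]:
            if isinstance(child, tuple):
                build_index(child, index)
        return index

    def maybe_run(pass_fn, handled_kinds, ast, index):
        if not (handled_kinds & index.keys()):
            return ast                     # no matching nodes anywhere: skip
        return pass_fn(ast)

    ast = ('seq', ('assign', 'x', ('lit', 1)))
    idx = build_index(ast)
    ast = maybe_run(lambda a: a, {'lambda'}, ast, idx)   # skipped: no lambdas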

------
fuzzieozzie
Want to be paid to write a compiler? Want to learn some techniques not yet
covered in the literature?

Take a look at
[http://www.compilerworks.com/dev.html](http://www.compilerworks.com/dev.html)

------
kybernetyk
> Now imagine that it's more than just a poor choice, but that all the books
> on programming are written at that level.

Hmm, I personally found the "Dragon Book" very accessible. And I'm someone
without a formal CS education.

