
Compiler Design in C - adamnemecek
http://www.holub.com/software/compiler.design.in.c.html
======
ltta
If you are interested in a printed version of a great beginner's compiler
book, I can highly recommend Andrew W. Appel's "Modern Compiler Implementation in C". There
are also versions of the book written for SML or Java. I have read and
partially implemented the C version and I really enjoyed the book, basic
enough to follow yet full-featured enough to be useful.

The book starts with parsing (to be honest, I prefer PEG or Pratt parsers for
their simplicity and tool independence, and skipped some of that chapter), but
then goes into semantic analysis, type checking, code generation, and
optimization passes, and even mentions basic type inference.
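
For anyone who hasn't seen a Pratt parser, here is a minimal sketch of the idea. It is not taken from Appel's book; the binding-power table, the token set, and the nested-tuple AST are all assumptions chosen for illustration. Each operator gets a binding power, and the loop keeps absorbing operators that bind more tightly than the caller's minimum.

```python
import re

# Hypothetical binding powers for a toy expression language (higher binds tighter).
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
PREFIX_POWER = 30  # assumed power for unary minus


def tokenize(text):
    # Assumption: input contains only integers, + - * / and parentheses;
    # any other characters are silently skipped by this toy lexer.
    return re.findall(r"\d+|[-+*/()]", text)


def parse(tokens, min_bp=0):
    """Consume tokens from the front of the list and return a nested-tuple AST."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif token == "-":
        left = ("neg", parse(tokens, PREFIX_POWER))
    elif token.isdigit():
        left = int(token)
    else:
        raise SyntaxError(f"unexpected token {token!r}")

    # Keep extending the left operand while the next operator binds
    # more tightly than whatever our caller is waiting on.
    while tokens and tokens[0] in BINDING_POWER and BINDING_POWER[tokens[0]] > min_bp:
        op = tokens.pop(0)
        right = parse(tokens, BINDING_POWER[op])
        left = (op, left, right)
    return left
```

Calling `parse(tokenize("1+2*3"))` nests the multiplication under the addition, and because the comparison is strict, operators of equal power associate to the left.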

There is some code online at
[http://www.cs.princeton.edu/~appel/modern/c/](http://www.cs.princeton.edu/~appel/modern/c/).

~~~
makeset
Great book, though I recommend the SML version over C or Java for a clearer
illustration of the concepts in code. Viewed side by side, the SML code looks
concise and obvious vs. the long and wordy C/Java code. It just seems to lend
itself better to the task, in that some of the pattern matching burden is
already handled by SML.

~~~
ltta
Totally agree, ML-style languages are great for writing compilers.

------
userbinator
This book appears to be more of a "compiler-compiler design in C"; it goes
through how to write a lexer and parser generator, _then_ writes a compiler
using them, and I think the resulting compiler is a bit of a letdown: it does
little more than translate C into a linearised subset of C, so what are IMHO
the more "interesting" and important parts of the back-end, like register
allocation and instruction selection, are completely absent.

It's a good if somewhat outdated book if you're interested mostly in parsing
and lexing, but for all the claims it makes in the preface about being
practical instead of theoretical and all the source code presented throughout,
I found the lack of actual Asm code generation (or any mentions of this
compiler being able to compile itself) disappointing.

Parser/lexer generators also seem to have fallen out of favour for the
creation of actual compilers, both big and small, at least for C-like
languages; techniques based on recursive-descent (RD) are quite popular now.
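
To illustrate what a handwritten RD parser looks like, here is a minimal sketch for a toy arithmetic grammar. The grammar, the class, and the function names are assumptions made up for this example and do not come from any of the compilers mentioned here. The point is the shape: one function per grammar rule, each calling the functions for the rules it refers to.

```python
import re

# Assumed toy grammar:
#   expr   := term (('+' | '-') term)*
#   term   := factor ('*' factor)*
#   factor := NUMBER | '(' expr ')'


class Parser:
    def __init__(self, text):
        # Toy lexer: integers, + - * and parentheses; other characters are skipped.
        self.tokens = re.findall(r"\d+|[-+*()]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)


def evaluate(text):
    """Parse and evaluate text in one pass; reject trailing tokens."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

This sketch evaluates as it parses to stay short; a real compiler front-end would build an AST in each rule function instead of computing a value.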

On the "big, production-quality" side, gcc used a generated parser but moved
to a handwritten RD-based one, and Clang always used RD. EDG's front-end, used
in Intel's and other commercial compilers, is also handwritten RD. On the
small "toy compiler"/hobbyist/experimentation side, there's TCC/OTCC, CC500
([https://news.ycombinator.com/item?id=8576068](https://news.ycombinator.com/item?id=8576068)),
C4
([https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)),
SubC ([http://www.t3x.org/subc/](http://www.t3x.org/subc/)), and many
others, all based on RD parsers.

In fact I can't think of any C compilers at the moment that are using a
generated parser/lexer...

~~~
thorn
Care to recommend a good book explaining a more modern approach to compilers
from a practical point of view (including RD)?

~~~
userbinator
Practical Compiler Construction:
[http://www.t3x.org/reload/index.html](http://www.t3x.org/reload/index.html)

This explains the SubC compiler I mentioned in the original post.

------
pjmlp
Great book! I have spent hours reading it multiple times, soaking up all the
information I could.

It was one of the first books about compiler design that I got hold of, back
when Internet access was only available at the university.

------
jdswain
I had this book at University. It made a good companion text to our main
textbook, the dragon book. The dragon book covers a lot of ground, and is very
interesting, but Holub's book is much more practical, with everything
illustrated with real code examples (if my memory is correct). This was a big
help when learning and made compilers a lot more fun. It's well worth reading.

It's interesting that most of the content is still just as relevant today.
Coding styles change over time, and definitely the popularity of languages has
changed, but the theory is still just as useful.

Off topic, but we had the 'International' edition of this book. Many
publishers at the time did this, I'm not sure why. The explanation was that it
was to make it more affordable for us, although the prices were still high.
They were soft cover editions with plain covers (this book had a red cover
with only the title and author, in white). The linked page is the first time
I've seen the real cover. This is probably an early example of regional
pricing, much like DVD region codes. They could charge more in the USA and
less in other territories, I guess.

~~~
jindor
Mine was also the international edition, with dark blue cover. Not having been
a CS major I bought a copy out of curiosity on how a compiler is made, since
it seemed a lot more approachable than the famous dragon book. I admit I was a
bit overwhelmed at the amount of code it took to build a compiler. Glad to see
the book again in its entirety.

------
salimmadjd
His book The C Companion [1] really helped me get a deep understanding of C.
Sadly, I haven't touched C for such a long time that I've forgotten most of
it. I highly recommend it for anyone who has a basic knowledge of C and wants
to go deeper.

[1] [http://www.amazon.com/C-Companion-Allen-I-Holub/dp/013109786...](http://www.amazon.com/C-Companion-Allen-I-Holub/dp/0131097865)

------
amelius
Can somebody recommend a book that covers recent techniques implemented in
LLVM?

~~~
nickik
A good intro to compilers is Engineering a Compiler. It focuses more on
optimization than most intro books, and it's all SSA.

If you want to go all in on SSA optimization, there is a book called Static
Single Assignment Book, and it's written by a whole list of compiler writers.
It's not finished, but there is still a lot of information.

You can find it here:
[http://ssabook.gforge.inria.fr/latest/book.pdf](http://ssabook.gforge.inria.fr/latest/book.pdf)

Or you can go with the classic, Advanced Compiler Design & Implementation. See
here: [http://www.amazon.com/Advanced-Compiler-Design-Implementatio...](http://www.amazon.com/Advanced-Compiler-Design-Implementation-Muchnick/dp/1558603204)

All of them will teach you a lot about LLVM.

~~~
ltta
Speaking of SSA and optimizations, I just remembered that Andy Wingo (core
contributor to Guile scheme) has a treasure trove of great articles, e.g. a
fun intro to SSA [1] or CPS as used in Guile [2]. His blog is totally worth
just browsing around.

[1] [http://wingolog.org/archives/2011/07/12/static-single-assign...](http://wingolog.org/archives/2011/07/12/static-single-assignment-for-functional-programmers)

[2] [http://wingolog.org/archives/2014/01/12/a-continuation-passi...](http://wingolog.org/archives/2014/01/12/a-continuation-passing-style-intermediate-language-for-guile)

~~~
nickik
Yes. His posts are great. He is working on a VM and so am I, so I follow his
work closely.

------
zerr
How I love these 90s style book covers, color palettes :)

~~~
userbinator
I agree, there were more interesting/amusing cover illustrations than the
abstract, generic ones on the bulk of compiler and computing books today.
That's how the Dragon Book got its name too.

Related article: [http://www.globalnerdy.com/2007/09/14/reimagining-programmin...](http://www.globalnerdy.com/2007/09/14/reimagining-programming-book-covers/)

~~~
zerr
Speaking about covers, Forth Programmer's Handbook comes to mind :)

[http://www.amazon.com/Forth-Programmers-Handbook-3rd-Edition...](http://www.amazon.com/Forth-Programmers-Handbook-3rd-Edition/dp/1419675494)

~~~
petercooper
Check out this one: [http://ecx.images-amazon.com/images/I/61wk1jiBkRL._SX258_BO1...](http://ecx.images-amazon.com/images/I/61wk1jiBkRL._SX258_BO1,204,203,200_.jpg)

------
zem
I think that these days, if you're writing a compiler, it would be better done
in one of:

* C++, to leverage LLVM (afaik C++ is the best way to do that) and lots of
existing code

* some sort of ML, because ML is really good for lots of the things compilers
need to do

* self-hosted, because the compiler may be the best large-program workout your
fledgeling language is going to get

------
norswap
In the same vein, see "lcc, A Retargetable Compiler for ANSI C":
[https://sites.google.com/site/lccretargetablecompiler/](https://sites.google.com/site/lccretargetablecompiler/)

------
slacka
What real world experience does Allen Holub have with compiler design? As far
as I can tell, he has not contributed to gcc or LLVM, or written any toy
compilers like TCC.

------
Ih8SF
Why use C? It's a horrific, logically incongruent lang

~~~
flatestcat
I know what you mean, man. Lots of n00bz, ricers, and fanboiz will claim C is
the fastest, most efficient lang because you can get "close to the metal" and
"tweak". In reality, any code beyond trivial complexity will benefit much more
greatly from algebraic rectification, which can only be done with certain
languages that are amenable to formal analysis.

~~~
userbinator
Until I see an award-winning 4k/64k demo written in one of these ultra-high-
level languages, I stand by my opinion that C is more efficient.

~~~
p0nce
Check your facts. A large majority of award-winning 4k/64k demos are written in C++.

~~~
10098
his point is still valid though

