
Resources for Amateur Compiler Writers - rspivak
http://c9x.me/compile/bib/
======
senko
IMHO, compiler construction as an advanced exercise for amateurs is a topic
that has been beaten to death (as OP suggests, there's tons of available
materials and projects ranging from high quality not-so-amateur to quick or
fun hacks - I'm guilty of one myself).

On the other hand, I would love to see "HTML5 and CSS parsing and rendering
for amateurs". Given the state of modern HTML5 and CSS standards, and ignoring
compatibility and real-world usage (just like for toy compilers), Let's Build
A Browser Engine sounds more tempting than Let's Build a Compiler.

(To preempt "contribute to existing actual real-world engine" suggestions --
while that's worthwhile, it's like saying "contribute to LLVM" to someone
looking to write a toy compiler, i.e. it completely misses the point).

~~~
stijlist
There's this great series by mbrubeck which sounds like it might be right up
your alley:
[https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.h...](https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html)

~~~
senko
This looks like a great resource, thanks!

~~~
creatio
This also might be interesting:
[http://www.html5rocks.com/en/tutorials/internals/howbrowsers...](http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/)

------
zzzcpan
Frankly, I don't see anything interesting in that list, especially for
amateurs.

As an amateur compiler writer you would probably want to make something useful
in a few weeks, not waste a year playing around. And it's a very different
story. It's essentially about making a meta DSL, that compiles into another
language and plays well with existing libraries, tooling, the whole ecosystem,
but also does something special and useful for you. So, you should learn
parsing, possibly recursive descent for the code and something else for
expressions, a bit about working with ASTs and that's pretty much it.
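
As a rough illustration of that split, here is a minimal Python sketch: plain recursive descent for statements, and precedence climbing as the "something else" for expressions. The tiny `let NAME = expression` language, the tuple-shaped AST, and the names `tokenize`, `Parser`, `emit` and `compile_line` are all assumptions made up for this example, and the emitted target is just fully parenthesized source text.

```python
import re

# Hypothetical token pattern: integers, identifiers, single-character operators.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

# Binary operator precedences (higher binds tighter); all left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():  # skip stray whitespace matched as an "operator"
            tokens.append(("op", op))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", None)

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_statement(self):
        # Recursive descent: statement := 'let' NAME '=' expression | expression
        if self.peek() == ("name", "let"):
            self.next()
            kind, name = self.next()
            if kind != "name" or self.next() != ("op", "="):
                raise SyntaxError("expected 'let NAME = expression'")
            return ("let", name, self.parse_expression(1))
        return self.parse_expression(1)

    def parse_atom(self):
        kind, value = self.next()
        if kind in ("num", "name"):
            return (kind, value)
        if (kind, value) == ("op", "("):
            node = self.parse_expression(1)
            if self.next() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def parse_expression(self, min_prec):
        # Precedence climbing: absorb operators that bind at least as
        # tightly as min_prec; tighter ones are handled by the recursion.
        left = self.parse_atom()
        while True:
            kind, op = self.peek()
            if kind != "op" or PRECEDENCE.get(op, 0) < min_prec:
                return left
            self.next()
            right = self.parse_expression(PRECEDENCE[op] + 1)
            left = ("binop", op, left, right)


def emit(node):
    """Turn an AST node into fully parenthesized target source text."""
    kind = node[0]
    if kind in ("num", "name"):
        return str(node[1])
    if kind == "binop":
        _, op, left, right = node
        return f"({emit(left)} {op} {emit(right)})"
    if kind == "let":
        return f"{node[1]} = {emit(node[2])}"
    raise ValueError(f"unknown node kind {kind!r}")


def compile_line(text):
    parser = Parser(tokenize(text))
    node = parser.parse_statement()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return emit(node)


print(compile_line("let x = 1 + 2 * 3"))  # x = (1 + (2 * 3))
print(compile_line("8 - 3 - 2"))          # ((8 - 3) - 2)
```

The AST here is nothing more than nested tuples, which is usually enough for a small DSL; a real project would likely want source positions and better error messages.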

------
PaulHoule
Is amateur the right word?

I am in it for the money which I guess makes me a pro but I don't have a
computer science background and frankly in 2016 I am afraid the average
undergrad compiler course is part of the problem as much as the solution.

Another big issue is nontraditional compilers of many kinds such as js
accelerators and things that compile to JavaScript, domain specific languages,
data flow systems, etc. Frankly I want to generate Java source or JVM byte
code and couldn't care less about real machine code.

~~~
reymus
"frankly in 2016 I am afraid the average undergrad compiler course is part of
the problem as much as the solution."

What do you mean by that?

~~~
tikhonj
I'm not the OP, but I sympathize. The specific details covered in a
"classical" compilers course are heavyweight and not super-relevant right
now. These days you don't have to understand LR parsing or touch a parser-
generator, you don't have to worry about register coloring... etc. Courses
still use the Dragon Book which is _older than I am_ and covers a bunch of
stuff only relevant to writing compilers for C on resource-constrained
systems.

Instead, I figure a course should cover basics of DSL design, types and type
inference, working with ASTs, some static analysis and a few other things.
That has _some_ overlap with a traditional compilers course, but a pretty
different focus.

~~~
Drup
So, TAPL? :)

[https://www.cis.upenn.edu/~bcpierce/tapl/](https://www.cis.upenn.edu/~bcpierce/tapl/)

~~~
catnaroek
Not really. TAPL is a very useful book, but it won't teach you how to write a
compiler, unless the only part of a compiler you actually care about is the
type checker. The interpreters it describes (in the chapters titled “An ML
implementation of <whatever>”) are ridiculously inefficient.

~~~
nickpsecurity
You have a link to a good guide for beginners on designing and efficiently
implementing type checkers?

~~~
catnaroek
This is a nice introductory tutorial on how to implement Hindley-Milner type
inference: [https://github.com/jozefg/hm](https://github.com/jozefg/hm)
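
For a rough idea of what sits at the core of Hindley-Milner inference, here is a minimal Python sketch of first-order unification with an occurs check. It is not taken from the linked tutorial and is not a full inference algorithm (no let-generalization, no environments). The representation is an assumption for this example: a type variable is a plain string, a constructed type is a tuple `(name, *args)`, and `resolve`, `occurs`, `unify`, `apply` and `UnificationError` are hypothetical names.

```python
class UnificationError(Exception):
    pass


def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a constructor."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True if type variable `var` appears anywhere inside type `t`."""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t == var
    return any(occurs(var, arg, subst) for arg in t[1:])


def unify(a, b, subst):
    """Return a substitution extending `subst` that makes `a` and `b` equal."""
    a, b = resolve(a, subst), resolve(b, subst)
    if isinstance(a, str):
        if a == b:
            return subst
        if occurs(a, b, subst):
            raise UnificationError(f"infinite type: {a} occurs in {b}")
        return {**subst, a: b}
    if isinstance(b, str):
        return unify(b, a, subst)
    if a[0] != b[0] or len(a) != len(b):
        raise UnificationError(f"cannot unify {a} with {b}")
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
    return subst


def apply(t, subst):
    """Replace every bound variable in `t` by what it is bound to."""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(arg, subst) for arg in t[1:])


INT = ("int",)


def fn(arg, result):
    return ("->", arg, result)


# Unify (a -> b) with (int -> a): both a and b end up as int.
s = unify(fn("a", "b"), fn(INT, "a"), {})
print(apply(fn("a", "b"), s))  # ('->', ('int',), ('int',))
```

Copying the substitution dictionary on every binding is simple but slow; that kind of cost is exactly what the more efficiency-minded approaches try to avoid.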

This is a more advanced tutorial that illustrates a nice but tricky
optimization that OCaml's type checker internally uses:
[http://okmij.org/ftp/ML/generalization.html](http://okmij.org/ftp/ML/generalization.html)

Finally, TAPL's type checkers are pretty good. They aren't designed for
efficiency, though. They're designed to closely follow the book's contents:
[http://www.cis.upenn.edu/~bcpierce/tapl/checkers/](http://www.cis.upenn.edu/~bcpierce/tapl/checkers/)

~~~
nickpsecurity
Thanks for the links!

------
johan_larson
Compiler construction is a big field, so it's easy to get lost in the details.

If you are mostly interested in principles rather than the most recent
tooling, there's a course by Wirth that makes it tractable.

More here:
[http://short-sharp.blogspot.ca/2014/08/building-compiler-bri...](http://short-sharp.blogspot.ca/2014/08/building-compiler-briefly.html)

------
qwertyuiop924
If you're interested in getting started with interpreters, which are easier,
you might want to look into Daniel Holden's excellent _Build Your Own Lisp
(And Learn C)_. Although it has been criticized for many reasons, it's a great
book, and if you find interpreters and compilers totally magic, it's a good
place to start.

Also, after reading _What every compiler writer should know about programmers_,
I finally understand why people hate C. Because this just shows definitively
that C compiler writers have been in their own little world for the past few
decades.

Man, now I want a C compiler that wasn't written by a bunch of mindless jerks
that will be the first up against the wall when the revolution comes...

------
_RPM
I wrote a VM, I still can't get recursion to work. It's hard.

~~~
chrisseaton
I never understood why recursion causes anyone any problems, because recursion
is the absence of a special case limitation.

If I tell you that a function may call any function, then you already know
everything you need to know for recursion. If we didn't have recursion, only
then would I need to qualify what I just told you with the restriction that a
function can only be active once.
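
In VM terms, "active more than once" usually comes down to giving every call its own frame. Below is a minimal Python sketch of a stack VM where each call pushes a fresh frame with its own locals, operand stack and program counter. The instruction set (`push`, `load`, `sub`, `mul`, `jump_if_zero`, `call`, `ret`) and the names `run` and `PROGRAM` are invented for this example, not taken from any particular VM.

```python
def run(program, entry, args):
    """Run `program`, a dict mapping function names to instruction lists."""
    # One frame per active call. Because locals live in the frame rather
    # than in the function, the same function can be active many times.
    frames = [{"code": program[entry], "pc": 0, "locals": list(args), "stack": []}]
    while True:
        frame = frames[-1]
        op, *operands = frame["code"][frame["pc"]]
        frame["pc"] += 1
        stack = frame["stack"]
        if op == "push":
            stack.append(operands[0])
        elif op == "load":
            stack.append(frame["locals"][operands[0]])
        elif op == "sub":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                frame["pc"] = operands[0]
        elif op == "call":
            name, argc = operands
            call_args = [stack.pop() for _ in range(argc)][::-1]
            frames.append({"code": program[name], "pc": 0,
                           "locals": call_args, "stack": []})
        elif op == "ret":
            result = stack.pop()
            frames.pop()
            if not frames:
                return result
            frames[-1]["stack"].append(result)  # hand result to the caller
        else:
            raise ValueError(f"unknown instruction {op!r}")


PROGRAM = {
    "fact": [
        ("load", 0),           # 0: n
        ("jump_if_zero", 9),   # 1: if n == 0 goto 9
        ("load", 0),           # 2: n
        ("load", 0),           # 3: n
        ("push", 1),           # 4: 1
        ("sub",),              # 5: n - 1
        ("call", "fact", 1),   # 6: fact(n - 1)
        ("mul",),              # 7: n * fact(n - 1)
        ("ret",),              # 8
        ("push", 1),           # 9: base case
        ("ret",),              # 10
    ],
}

print(run(PROGRAM, "fact", [5]))  # 120
```

If locals were stored in a single per-function slot instead of per frame, the inner call would overwrite the outer call's `n`, which is one common way recursion breaks in a hand-written VM.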

When I show students recursion I can't understand their confusion. I think to
myself 'but I already showed you functions can call any other function, why do
you see this case differently?'

(Obviously I try to be more patient, understanding and anticipatory in
person.)

~~~
DigitalJack
I'll share why I struggled with recursion (as best as I remember):

Most programming I learned was imperative. As I wrote the code, I imagined the
execution in my head. This led to a problem where when I was halfway through a
function, and it referred to itself, my brain would segfault. How could
something I was not yet done writing refer to itself? I hadn't finished yet,
so my brain could not comprehend what such a reference would mean.

It was also difficult because I wanted to think of functions as nice neat
pieces of code that would take some data, do its thing, and return a value. I
could mentally inline a function call without much effort.

But when recursion is introduced, the floor drops out of my mental inlining.
Suddenly my mental effort for such things becomes huge. For me anyway. My
brain doesn't like to float around in abstraction land for long, it needs to
periodically be anchored in the concrete. Otherwise I quickly lose my sense of
direction and orientation... I lose my context.

Declarative languages actually make these easier, for me, because I'm not
mentally executing a recipe as I write. I am giving a piece-wise description
of _what_ something is. So there is no mental tracing.

I expect the a-ha moment for recursion is different for everyone. But just
showing recursion in many different forms would probably help. For example,
show fib sequence generation where instead of the function calling itself,
each function calls a uniquely named function... such that you concretely
demonstrate building the first 5 or so numbers in the sequence...

Then show the similarity of the functions, show what the computer has to keep
track of with the nested function calls, and step by step work your way to
straight recursion.
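
A minimal Python sketch of that progression, with function names invented for the example: first the uniquely named functions, then the single recursive function they collapse into.

```python
# Step 1: no function calls itself; each one calls two other,
# uniquely named functions.
def fib0():
    return 0


def fib1():
    return 1


def fib2():
    return fib1() + fib0()


def fib3():
    return fib2() + fib1()


def fib4():
    return fib3() + fib2()


# Step 2: fib2, fib3 and fib4 all have the same shape, "previous plus the
# one before it", so the number can become a parameter instead of part of
# the name.
def fib(n):
    if n < 2:          # fib0 and fib1 are the base cases
        return n
    return fib(n - 1) + fib(n - 2)


print(fib4(), fib(4))                # 3 3
print([fib(n) for n in range(5)])    # [0, 1, 1, 2, 3]
```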

Show it in BASIC with GOTO statements.

Finding as many different ways to concretely demonstrate an abstract concept
will help reach more people.

~~~
akkartik
As someone who enjoys teaching programming[1], your comment was my favorite of
the day[2]. My learning experience was similar to yours, starting with
imperative languages, so you got me to think about how I think about recursion
today, given our shared baggage:

1. I do still inline recursive functions, but I've learned to selectively
inline just the base case when I first implement recursion. Paradoxically,
experience with lisp helped me with recursion in C, particularly Common Lisp's
_trace_ facility which taught me to visualize stacks of multiple function
calls (whether recursive or not) rather than a single one at a time.

2. I've learned to think declaratively even when I program in C. When writing
a C function I might start out with a crisp definition in my head ("this
function saves the reverse of the list seen so far in its second argument") so
that I can rely on that definition even when the implementation isn't yet
complete.
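
To make the "crisp definition" idea concrete, here is a small sketch in Python rather than C. The function name `reverse_onto` and its signature are assumptions for this example. The docstring states the definition up front, and the recursive call is written by trusting that definition rather than by tracing execution.

```python
def reverse_onto(items, seen_reversed=()):
    """Return the reverse of `items`, followed by `seen_reversed`.

    Definition relied on throughout: `seen_reversed` always holds the
    reverse of the elements consumed so far.
    """
    if not items:
        # Nothing left to consume, so the accumulator is the whole answer.
        return tuple(seen_reversed)
    head, *rest = items
    # Putting `head` in front keeps the definition true for the next call.
    return reverse_onto(rest, (head,) + tuple(seen_reversed))


print(reverse_onto([1, 2, 3]))  # (3, 2, 1)
```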

[1] [http://akkartik.name/post/mu](http://akkartik.name/post/mu)

[2]
[https://news.ycombinator.com/favorites?id=akkartik&comments=...](https://news.ycombinator.com/favorites?id=akkartik&comments=t)

------
jfoutz
Dybvig's dissertation is great. [1] People might disagree that it's a
compiler, since it targets a fairly high-level VM rather than a native machine.
But it's got everything you need. Really, you can fire up DrRacket, type it in
and have a great framework in an afternoon.

Anyway, it's very readable.

[1] [http://agl.cs.unm.edu/~williams/cs491/three-imp.pdf](http://agl.cs.unm.edu/~williams/cs491/three-imp.pdf)

------
barrkel
If you're more interested in the front end than the back end, then Crenshaw's
Let's Build a Compiler is still worthwhile.

------
cocoflunchy
I recommend [http://createyourproglang.com/](http://createyourproglang.com/)
too if you want something very simple and you don't know where to start.

~~~
Silhouette
Unfortunately that page seems to make a lot of big claims (and unnecessarily
insult a lot of established work in the field) but seems to include literally
no useful information about the book at all: no table of contents, no
indication of who the target audience is or what prior experience is assumed,
not even a summary of the topics it covers.

------
poseid
Nice collection. I keep some notes myself here, and was able to generate my
own parser with Jison:
[https://github.com/mulderp/mulderp.github.com/issues/13](https://github.com/mulderp/mulderp.github.com/issues/13)

Once the parser returns the AST, things get more complicated: how to
decorate an AST, add actions, etc. I'm still looking to learn more about
compiler backends.

------
Ind007
Is there any similar kind of collection for static analysis?

~~~
ericbb
Matt Might's site has some pages on this topic:

[http://matt.might.net/articles/books-papers-materials-for-gr...](http://matt.might.net/articles/books-papers-materials-for-graduate-students/#analysis)

[http://matt.might.net/articles/intro-static-analysis/](http://matt.might.net/articles/intro-static-analysis/)

[http://matt.might.net/articles/partial-orders/](http://matt.might.net/articles/partial-orders/)

