
Ask HN: Suggestions for writing a new compiler? - chm
I am a chemistry grad student and will be attending a graduate course in CS at my university in the coming weeks. The course is called "Programming languages and compilers" and its goal is to teach students about compilers by making them write one.

As I have never ventured into the world of compilers, I'm a bit lost as to where to start. I'm familiar with C, Python, HTML, Mathematica, and FORTRAN 77, all of which I've used at different times and for different purposes, but I haven't mastered any of them.

We (teams of 2) have the choice of either adding functionality to an existing compiler (suggestions include Gambit-C, Pascal-S, Tiny C, Small C) or writing our own for any language or a subset thereof. If we choose to write our own compiler, it has to be written in Scheme, unless it can compile itself, in which case we can write it in any language. Coincidentally, I began reading Practical Common Lisp [0] last month and am enjoying it.

The suggested textbook is "A. Appel, Modern compiler implementation in Java/ML/C". Other suggested readings are by Paul Graham.

Now my questions are:

1) Should I write my own compiler or extend an existing one?

2) Should I write a self-compiling compiler?

3) What language should I try to compile?

4) What books/resources will be helpful?

I'm asking these questions because I want to get the most out of this class. I think it's a great opportunity, but I could easily get lost. I had heard of or read about almost everything in the course plan, so I am not _completely_ out of the game.

Thanks in advance.

[0] http://gigamonkeys.com/book/
======
Turing_Machine
If you decide to go with something in the Lisp/Scheme spectrum, you might find
the book Lisp in Small Pieces to be helpful.

If you decide to go with a non-Lisp language: are you allowed to use tools
like bison/yacc and lex/flex (or analogs for non-C languages)? Those can cut
down the amount of work by _a lot_. Making it self-hosting over the course of a
semester is still going to be challenging, I think, especially if you have no
previous background in compilers and/or low-level code (there are a lot of
other issues there, such as the need to write or otherwise obtain an I/O
library).
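
To give a rough idea of the work that lex/flex-style tools automate, here is a minimal hand-written tokenizer sketch in Python. The token set (`NUMBER`, `IDENT`, `OP`) and the `tokenize` function are made up for illustration and are not taken from any particular tool or course; a generator would build the equivalent from a declarative spec.

```python
import re

# Hypothetical token set for a toy expression language (an assumption for this sketch).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
# One alternation with a named group per token kind.
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x = 3 + 42")` yields a flat list of tokens that a parser (hand-written or generated by bison/yacc) would then consume.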

If it were me, starting from ground zero, I'd either go with extending an
existing compiler or writing something in Scheme.

~~~
soegaard
"Lisp in Small Pieces" is _very_ well-written. It will teach you a lot about
how to compile (subsets of) Scheme.

If you decide to compile a Pascal-like language, I can't recommend "Brinch
Hansen on Pascal Compilers" enough. If I remember correctly, the Pascal
compiler described in the book can compile itself.

------
inetsee
I don't know whether this would qualify for your class, but the Racket
documentation includes an implementation of Algol-60:
[http://docs.racket-lang.org/algol60/](http://docs.racket-lang.org/algol60/). You might be able
to use this as the starting point of an implementation of another language,
maybe a subset of Algol-68, or Simula.

Algol-60 was the first programming language I learned, and I've always been fascinated
by Algol and the languages derived from it.

------
AnimalMuppet
It seems to me that "can compile itself" is going to be extra work. That is:
You specify a language. You write a compiler for that language in that
language. But you can't compile the compiler, since you don't have the
compiler yet. So you have to write the compiler in some _other_ language that
already has a compiler.

Note that this does not apply if you are writing something like a C compiler,
because there are already C compilers out there.

------
marktangotango
That's really interesting. Given your chemistry focus, what motivated you to
undertake this course? The requirement to write it in Scheme seems a bit
onerous for someone who's never used Scheme. Even so, it would probably
still be easier than extending an existing compiler. I think you'd spend A LOT
of time learning someone else's design and coding practices.

I always point people at this article. It's a nice short synopsis, similar to
Crenshaw's "Let's Build a Compiler" series, only much shorter. Plus
it's Python, so it may give you some ideas for Scheme:

[http://www.jroller.com/languages/entry/python_writing_a_comp...](http://www.jroller.com/languages/entry/python_writing_a_compiler_and)

~~~
chm
I (try my best to) do research in molecular electronics. I need to write
software to perform calculations. The reason I chose this particular course is
simple: I have no other choice. Either I already have taken the other
available classes or they aren't given in winter. My department doesn't offer
many graduate computational chemistry courses. Thanks!

------
X4
This can be of help:

1)
[http://en.wikipedia.org/wiki/History_of_compiler_constructio...](http://en.wikipedia.org/wiki/History_of_compiler_construction)

2)
[http://en.wikipedia.org/wiki/Category:Compiler_construction](http://en.wikipedia.org/wiki/Category:Compiler_construction)

3) CC500: a tiny self-hosting C compiler:

[http://homepage.ntlworld.com/edmund.grimley-evans/cc500/](http://homepage.ntlworld.com/edmund.grimley-evans/cc500/)

------
porlw
Regarding 3, I would consider compiling a simple Lisp: if you're coding in
Scheme, its built-in reader will take care of the parsing, so you can
concentrate on the code generation side.
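
As a rough illustration of what "concentrating on code generation" can look like once parsing is out of the way, here is a small Python sketch. Nested Python lists stand in for s-expressions that a Lisp reader would already have produced. The stack-machine instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the functions `compile_expr` and `run` are assumptions made up for this example, not part of any real compiler.

```python
import operator

# Hypothetical instruction set for a toy stack machine (an assumption for this sketch).
BINARY_OPS = {
    "+": ("ADD", operator.add),
    "-": ("SUB", operator.sub),
    "*": ("MUL", operator.mul),
}


def compile_expr(expr):
    """Compile a nested-list expression into a flat list of stack instructions."""
    if isinstance(expr, int):
        return [("PUSH", expr)]
    op, left, right = expr
    if op not in BINARY_OPS:
        raise ValueError(f"unknown operator {op!r}")
    # Emit code for both operands first, then the operation (post-order walk).
    return compile_expr(left) + compile_expr(right) + [(BINARY_OPS[op][0], None)]


def run(code):
    """Execute the instructions on a stack, to check the generated code."""
    handlers = {name: fn for name, fn in BINARY_OPS.values()}
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(handlers[opcode](left, right))
    return stack.pop()
```

For example, `compile_expr(["+", 1, ["*", 2, 3]])` corresponds to the s-expression `(+ 1 (* 2 3))`, and passing the result to `run` evaluates it. A real course project would target an actual machine or VM instead of this toy one, but the recursive shape of the code generator stays much the same.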

