

Cucu: a compiler you can understand - LiveTheDream
http://zserge.com/blog/cucu-part2.html

======
p4bl0
It is a very good idea to write a small compiler for teaching purposes. However
I think it would have been better to have clean code where the reader can
understand what happens at each line. Here the massive usage of global
variables makes it harder and unintuitive. Also there are a lot of functions
that return, for instance, 0 or 1 to signal success or failure, but then
at the call sites their return values are always ignored. Also, the way the
`*_expr` functions work in part 2 is really, really weird and unintuitive
(but this is due to the all-in-global-variables design). I could go on with
other issues, but all this is to point out the big caveat: one should not take
the code of the article and try to grow the compiler to extend it because
doing so _will_ lead to horrible spaghetti code. And that really is too bad,
because it would have been an even better exercise for beginners to extend
such a small compiler.
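
To make the point concrete, here is a minimal sketch (in Python, not the article's code) of an expression parser that keeps all of its state in one object and raises on failure instead of returning a 0/1 status that callers can ignore. The names `Parser`, `ParseError`, `prim_expr`, `expr` and `compile_expr`, and the tiny stack-machine output, are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


class ParseError(Exception):
    """Raised on bad input, so a failure cannot be silently ignored."""


@dataclass
class Parser:
    tokens: list                              # all parser state lives here, not in globals
    pos: int = 0
    out: list = field(default_factory=list)   # emitted stack-machine instructions

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise ParseError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def prim_expr(self):
        tok = self.peek()
        if tok is not None and tok.isdigit():
            self.pos += 1
            self.out.append(("push", int(tok)))
        elif tok == "(":
            self.expect("(")
            self.expr()
            self.expect(")")
        else:
            raise ParseError(f"unexpected token {tok!r}")

    def expr(self):
        self.prim_expr()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            self.prim_expr()
            self.out.append(("add",) if op == "+" else ("sub",))


def compile_expr(tokens):
    """Parse a whole token list and return the emitted instructions."""
    parser = Parser(tokens)
    parser.expr()
    if parser.peek() is not None:
        raise ParseError(f"trailing token {parser.peek()!r}")
    return parser.out
```

Because every function only touches `self`, adding a new construct means adding a method rather than threading another global through the whole file.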

------
scorpioxy
Very cool. I think more articles like this, about how a compiler is not magic,
should be written.

It still amazes me how many programmers I meet think that a compiler is
something that they could never write. I say a production compiler might be
too much work for just one person, but in its basic building blocks it's
actually relatively easy to write.

I say this after trying to write one just to overcome the mystery, even though
I probably won't need to ever write my own professionally.

You can find my attempt at: <http://www.codedemigod.com/jack-compiler/>
<https://github.com/alaasalman/jackcompiler>

------
paupino_masano
Interesting concept, though I'm not sure quite what they're trying to achieve.
Are they trying to teach lexical analysis? Their `typename` method seems
awfully verbose for my liking - in my opinion, definitely harder than the
LALR(1) syntax that CUP promotes.

Personally, I learnt how to write compilers using a mix of FLEX (the Fast
Lexical Analyzer - not by Adobe!) and CUP. From that you create a LALR(1)
grammar which then compiles down to a tree (if you choose) which you traverse
depth first generating code/assembly/whatever. I would say that this is
actually EASIER to write and to understand than trying to do it from scratch -
you concentrate purely on the grammar and the tree generation (which leads to
code generation or interpretation - e.g. BASIC, code analysis etc).
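
As a rough sketch of the "traverse the tree depth first, generating code" step (in Python rather than the Java that CUP targets): the tree below is built by hand and stands in for whatever a generated parser would hand over. The node classes `Num` and `BinOp`, the function `gen`, and the `PUSH`/`ADD`/`MUL` instruction names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def gen(node, out):
    """Post-order (depth-first) walk: emit both children, then the operator."""
    if isinstance(node, Num):
        out.append(f"PUSH {node.value}")
    elif isinstance(node, BinOp):
        gen(node.left, out)
        gen(node.right, out)
        out.append({"+": "ADD", "*": "MUL"}[node.op])
    else:
        raise TypeError(f"unknown node {node!r}")
    return out


# Tree for 2 + 3 * 4; precedence is already encoded in the tree shape.
tree = BinOp("+", Num(2), BinOp("*", Num(3), Num(4)))
code = gen(tree, [])
```

Swapping `gen` for a function that evaluates the nodes instead of emitting strings gives an interpreter over the same tree.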

This MAY seem complex as there is a bit of "black box" going on behind the
scenes (how does it lexically turn characters into symbols? How does it turn
those symbols into a grammar which eventually builds a working product?)
however once you understand the grammar language (lexical analysis is easy)
you find that compilers aren't that difficult. It's a matter of turning code
into symbols which you then apply a grammar to. From there (tree/code
generation/interpretation) is easy. I would in fact put lexical analysis and
grammar generation into a separate toolkit which a compiler developer _uses_ -
though it may be from my experience: I welcome other compiler developers to
interject.

Learning how to write a compiler definitely made me a better programmer -
especially in terms of OO languages and also understanding how languages are
built (i.e. learning new languages isn't an issue when you know how the
internals are likely to work). I highly recommend it for anyone looking to
improve their skill set. Unfortunately the course I took at university (which
WAS available free on the internet) teaching this concept has now died. It's
quite sad that it has disappeared, as that paper was hands down the most useful
paper I ever took. I've now written compilers for companies converting legacy
code to modern code (and native etc) from these skills - it's much more
versatile than just generating assembly/machine code!

Anyway, I digress: I admire what they are trying to do; however, I would
recommend that others learn FLEX (or JFLEX, CSFLEX etc) and CUP (CSCUP
etc) instead of trying to do all the heavy lifting themselves. If they want to
write a lexical analyzer or a grammar parser then that is a different
journey...

------
hvs
Doing something like this _before_ tackling the Dragon book [1] is a great
idea. Formal theory will make a lot more sense once you've run into some of
the practical problems yourself.

[1]
[http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniqu...](http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools)

------
dlo
Great work!

Might I offer a suggestion? I would argue that most of the work that goes into
a compiler is not the front-end functionality but rather the "middle-" and
back-end functionality, i.e. making the generated code fast. Going over the
basic optimizations, such as global data-flow analysis (reaching definitions,
live variables, partial redundancy elimination, constant propagation, etc.),
loop unrolling, and so on, would give those outside the compilers community
real insight into what it's actually like to work on one.
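
For a taste of what that looks like, here is a minimal sketch of constant propagation with folding over a single straight-line block of three-address code. It is only the local case: the global analyses need a control-flow graph and a fixed-point iteration, which this does not attempt. The `(dest, op, a, b)` instruction format and the name `propagate_constants` are assumptions for illustration.

```python
import operator

# Only two operators, to keep the sketch short.
OPS = {"+": operator.add, "*": operator.mul}


def propagate_constants(instrs):
    """Fold operations whose operands are known constants.

    instrs is a list of (dest, op, a, b) tuples. For op == "const",
    a is the integer value and b is None; otherwise a and b are
    variable names and op is a key of OPS.
    """
    known = {}   # variable name -> constant value currently held
    out = []
    for dest, op, a, b in instrs:
        if op == "const":
            known[dest] = a
            out.append((dest, op, a, b))
            continue
        a_val = known.get(a)
        b_val = known.get(b)
        if a_val is not None and b_val is not None:
            value = OPS[op](a_val, b_val)
            known[dest] = value
            out.append((dest, "const", value, None))
        else:
            known.pop(dest, None)   # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out


block = [
    ("x", "const", 2, None),
    ("y", "const", 3, None),
    ("z", "+", "x", "y"),    # both operands known: folds to a constant
    ("w", "*", "z", "n"),    # n is unknown, so this stays as-is
]
optimized = propagate_constants(block)
```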

~~~
paupino_masano
I'd tend to disagree: I think fundamental language design is very important.
Optimizations after the language tree has been built are an entirely different
concept. Important nonetheless, but different to perhaps what the author is
trying to achieve.

Personally, I think he is going about teaching it in a non-trivial manner,
making such things as you mentioned above more complex than they perhaps need
to be. But, perhaps that's just how I've been taught how to write compilers...
fundamentals often are the hardest to "un-learn".

~~~
dlo
It seems to me that there are a lot of efforts around making the front-end
easier to understand. I think there needs to be at least _one_ project that
touches upon the back-end in a way that doesn't trivialize it.

There is not even _one_ similar project for the back-end that I am aware of.
And it's a shame, because I think the back-end is way cooler than the front-
end.

------
ecoffey
This is a really great article and really similar to the compilers class I got
to take in college.

Removing left-recursion from a grammar by hand! Woo!
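
For anyone who hasn't done it: a rule like `expr -> expr '-' NUMBER | NUMBER` sends a naive recursive-descent parser into infinite recursion, so you rewrite it as `expr -> NUMBER ('-' NUMBER)*` and the recursion becomes a loop. A minimal sketch in Python (not the article's code; `parse_expr` and the token-list input are assumptions for illustration):

```python
def parse_expr(tokens):
    """Evaluate expr -> NUMBER ('-' NUMBER)* over a list of string tokens."""
    if not tokens:
        raise ValueError("empty input")
    pos = 0
    value = int(tokens[pos])
    pos += 1
    # The loop replaces the left-recursive call and keeps '-' left-associative.
    while pos < len(tokens) and tokens[pos] == "-":
        if pos + 1 >= len(tokens):
            raise ValueError("expected a number after '-'")
        value -= int(tokens[pos + 1])
        pos += 2
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return value
```

Accumulating into `value` inside the loop is what preserves the grouping `(10 - 3) - 2`; a right-recursive rewrite would instead compute `10 - (3 - 2)`.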

------
jiyinyiyong
I like it, though I can only understand part of it. It's really nice to see
tutorials about compilers.

------
borplk
The site seems to be down.

~~~
robert_nsu
The site was down for me also.

On a side note - The first thing I thought about when I saw the name was
copper.

