https://compilers.iecc.com/crenshaw/ circa 1995
And a new C version
Details are described in the book Write Your Own Compiler, http://t3x.org/t3x/book.html
The real difference between the compiler in the example and mine is that I handle floating-point operations and also explicitly use printf to show the output. (Otherwise your return value is limited to an 8-bit number.)
Too late to edit now!
Only on Linux; Windows lets you use the full 32 bits.
(returning ‘comment’ and ‘whitespace’ tokens would fix this, and would make it possible to reuse the lexer for pretty-printing/syntax coloring)
I’ll look into whether there’s a better way to do it than this based on your suggestion, thanks!
> This limited both the language features that were possible and the quality of the produced code.
What are the limitations?
Maybe more exotically, here is a legal Haskell program:
    compute = f x y
    x = 3
    y = 4
    f = (+)
Interestingly, this restriction does not apply to goto labels, which you can use first and define later: the compiler can just emit the label as written into the assembly code and let the assembler worry about patching up the jump target.
Pascal also has that restriction
Although Delphi does not have it anymore. But FreePascal still has it, even in Delphi-compatible mode. The FreePascal developers have also said they will keep that restriction to improve readability: the code could not be read anymore if variables were placed willy-nilly everywhere.
A one-pass compiler can manage this by maintaining a mapping of undefined labels to a list of goto statements that refer to them. Once the definition is located, unwind the list and fill in the jump values. Types are trickier because the size is unknown.
>"An optimizing compiler would constant-fold our expression into a single number ahead of time."
Could someone say what is meant by constant-folding here?
3 + 4
* If there was a "push int"
* And another "push int"
* And a maths operation.
* Then replace.
So in my case I'd first generate the naive version, then have a separate "optimise" pass which would rewrite the program.
There were a couple of bugs found along the way, but the actual implementation of this optimization pass was pretty simple:
The biggest issue I had to face was that "1 + 3 => 4" is trivial, but my virtual machine only supports loading integers. So I couldn't collapse "2 * 3.3" into the constant "6.6".
Later I do walk back over the bytecode and remove the nops. But I have to update all JMP/CALL instructions to cope with the changed destination offsets. Not a hard job in my case, as there are only a couple of instructions which refer to byte-offsets.
(If I allowed "LD A,[BC]", or similar permutations, I'd have a lot more work to do.)
Ex: 4 + 4 would be condensed down to just 8 at compile time rather than writing the code to get 4 and 4 both into registers and add them at execution time.
i = 320 * 200 * 32;
The F# compiler, say, is close to single-pass, I believe: it produces CIL, which goes to a JIT/AOT compiler later.