
Small C Compilers - blacksqr
https://bootstrapping.miraheze.org/wiki/Main_Page#Small_C_Compilers
======
codezero
I love c4. The author writes very readable and simple code. I recommend trying
out swieros as well “a tiny Unix like kernel”:
[https://github.com/rswier/swieros](https://github.com/rswier/swieros)

~~~
snagglegaggle
[https://github.com/rswier/c4/blob/master/c4.c](https://github.com/rswier/c4/blob/master/c4.c)

Holy what

~~~
dTal
Instead of just gazing at the sea of terse variable names in awe, try actually
reading it! There are only about a dozen variables and they're all documented
at the top - the code itself is amazingly clear for what it does. There's not
a lot of gratuitous cleverness.

For example, just picking a random segment, you don't have to squint very hard
to see that this is a number literal parser:

    
    
    else if (tk >= '0' && tk <= '9') {
      if (ival = tk - '0') { while (*p >= '0' && *p <= '9') ival = ival * 10 + *p++ - '0'; }
      else if (*p == 'x' || *p == 'X') {
        [...goes on to handle the hexadecimal case...]
    

(aside - I love the conversion from string to decimal by subtracting the
string value of '0', as this will work for any text encoding where the decimal
digits are monotonic and contiguous - so ASCII and EBCDIC at least...)
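
A minimal standalone sketch of the same digit-subtraction trick, for readers who want to try it outside c4. The helper name `parse_decimal` is hypothetical and not taken from c4, and unlike c4 (which has already consumed the first digit into `tk`) it reads every digit through the pointer. It does no overflow checking.

```c
/* Hypothetical helper, not c4's code: parse an unsigned decimal literal
   starting at *p and advance *p past the digits that were consumed. */
static long parse_decimal(const char **p)
{
    long value = 0;

    /* Subtracting '0' maps a digit character to its numeric value; this
       relies only on the digits being contiguous in the character set. */
    while (**p >= '0' && **p <= '9') {
        value = value * 10 + (**p - '0');
        (*p)++;
    }
    return value;
}
```

Called on a pointer to the text `1234;`, it would return 1234 and leave the pointer on the `;`.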

~~~
aasasd
> _I love the conversion from string to decimal by subtracting the string
> value of '0'_

That's how everyone did character arithmetic since forever, though, especially
with the letters. Wouldn't be surprised if it's in K&R. And it probably became
a subtle source of errors when environments changed, as mixing semantics and
implementation tends to do.
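
To make the "especially with the letters" point concrete: the C standard guarantees that '0' through '9' are contiguous, but it makes no such promise for letters, and in EBCDIC the alphabet has gaps, so `c - 'a'` is not portable the way `c - '0'` is. One way to sidestep the assumption is a table lookup. The sketch below is only an illustration; the name `hex_digit_value` is made up for this example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: return the value of a hexadecimal digit, or -1 if
   c is not one. It looks the character up in a table instead of assuming
   that 'a'..'f' are contiguous in the execution character set. */
static int hex_digit_value(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *hit;

    if (c == '\0')
        return -1;  /* strchr would otherwise match the terminator */

    hit = strchr(digits, tolower((unsigned char)c));
    return hit ? (int)(hit - digits) : -1;
}
```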

------
xelxebar
In this vein, and related to the Trusting Trust Attack section, I recently
came across an x86 project that works to bootstrap from nothing, relying on
zero external software, even for compilation.

To that end it is organized into "phases" where the lower levels bootstrap the
higher ones, with level zero essentially coding directly in x86 opcodes.

I really want to dig into it more, but for the life of me cannot find the
github page again. Any ideas?

~~~
xelxebar
Aaaaand, serendipitously, I just now stumbled upon it in an HN post a bit
below this one:

[https://news.ycombinator.com/item?id=21201413](https://news.ycombinator.com/item?id=21201413)

That article mentions the project I was thinking of:

[https://github.com/oriansj/stage0](https://github.com/oriansj/stage0)

Amazingly, it seems they are even aiming to bootstrap some minimal hardware as
well! Super cool.

~~~
senorsmile
For some reason, I missed your reply to your own comment completely!

------
msclrhd
What about the Tiny C Compiler (TCC) that Fabrice Bellard wrote?
[https://bellard.org/tcc/](https://bellard.org/tcc/)

~~~
vkazanov
It's not tiny anymore :-)

~~~
ainar-g
Could I ask what you mean by that?

~~~
tom_mellior
Not sure what the parent means by it, but here are lines-of-code counts
generated using David A. Wheeler's 'SLOCCount'.

    
    
        SLOC    Directory       SLOC-by-Language (Sorted)
        36270   top_dir         ansic=35504,sh=460,perl=306
        28825   win32           ansic=28716,asm=109
        9692    tests           ansic=8806,asm=858,sh=28
        2395    lib             ansic=2252,asm=143
        158     include         ansic=158
        140     examples        ansic=140
    
    
        Totals grouped by language (dominant language first):
        ansic:        75576 (97.54%)
        asm:           1110 (1.43%)
        sh:             488 (0.63%)
        perl:           306 (0.39%)
    

As far as I can tell, the top-level directory is the actual compiler itself.
35k lines is pretty small for a real C compiler that can bootstrap GCC (the
versions that were still written in pure C).

~~~
vkazanov
What I mean is what I said :-) 70k is not tiny. And one cannot just throw away
a backend and say that it's not part of the compiler.

Compare it with a few others:

8cc/9cc - 10K LOC

cproc - 7K LOC

lcc - 30K LOC

~~~
tom_mellior
> And one cannot just throw away a backend and say that it's not part of the
> compiler.

OK. Let me rephrase: TCC doesn't have 35k lines of code, but with five
minutes' work it could be turned into a compiler with 35k lines of code
capable of bootstrapping GCC _on Linux_. That should be enough to compare it
to the ones you list.

~~~
vkazanov
That's an interesting way to put it. :-D I believe we could do that to a lot
of other compilers, say, lcc (30k) with its multiple backends and get
something like 10-20k.

Anyways, what I was trying to say is that "tiny" in "tinycc" has lost its
meaning already.

------
paulriddle
There is also [https://git.sr.ht/~mcf/cproc](https://git.sr.ht/~mcf/cproc)
inspired by several other small C compilers including 8cc, c, lacc, and scc. I
did not take a deep look at it yet, but it looks interesting.

------
guidedlight
I wonder why LCC isn’t mentioned.
[https://en.wikipedia.org/wiki/LCC_(compiler)](https://en.wikipedia.org/wiki/LCC_\(compiler\))

~~~
MrXOR
and [https://bellard.org/tcc](https://bellard.org/tcc)

------
andrewchambers
This one is fantastic:
[https://github.com/michaelforney/cproc](https://github.com/michaelforney/cproc)

------
jokoon
I've tried to use TCC to see if it's possible to use it as a scripting
language in a C++ project, and it seems to work pretty well.

(I've heard about chaiscript, but it seems way too big)

~~~
makapuf
Genuine curiosity: it seems smaller than tcc? Too big by which measure: LOC,
runtime memory used, or executable size?

------
enriquto
There's also this amazing thing:
[http://www.simple-cc.org/](http://www.simple-cc.org/)

~~~
vector_spaces
Anyone know what they're using for the repo web UI?
[http://git.simple-cc.org/scc/](http://git.simple-cc.org/scc/)

~~~
naters
Pretty sure it's this:
[https://git.codemadness.org/stagit/](https://git.codemadness.org/stagit/)

------
Crinus
These look neat but they all seem to be targeting a subset of C (e.g. no
struct support) instead of the full thing (even if we include stdlib). Only
exception seems to be SmallerC (which as expected is larger than the other
compilers/interpreters that only support a subset).

~~~
rswier
I made a branch of c4 that includes struct support. I should probably add it
to the master branch so it gets more visibility:
[https://github.com/rswier/c4/blob/switch-and-structs/c4.c](https://github.com/rswier/c4/blob/switch-and-structs/c4.c)

------
pepijndevos
Say I wrote my own CPU (which I did), which of those tiny C compilers would be
easiest to retarget?

My CPU is _very_ small (~300 LUT4 on an FPGA using Yosys) and has a very
minimal ISA. It's mostly an accumulator machine with a stack pointer.
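
For a sense of what an accumulator-plus-stack target looks like from the compiler's side, here is a rough sketch in the style of c4's built-in VM, which keeps one accumulator `a` and a stack pointer `sp`. The opcode names mirror c4's, but this is a simplified illustration rather than c4's actual code: `run` and `demo` are hypothetical names, only four operations are modelled, and `EXIT` here just returns the accumulator.

```c
/* Simplified opcode set; the names follow c4's VM, the behaviour is a sketch. */
enum { IMM, PSH, ADD, MUL, EXIT };

/* Interpret a program for a one-accumulator machine with a downward stack. */
static long run(const long *pc)
{
    long stack[64];
    long *sp = stack + 64;  /* stack grows downward; no bounds checking */
    long a = 0;             /* the accumulator */

    for (;;) {
        switch (*pc++) {
        case IMM:  a = *pc++;      break;  /* load immediate into a */
        case PSH:  *--sp = a;      break;  /* push a onto the stack */
        case ADD:  a = *sp++ + a;  break;  /* pop, add to a */
        case MUL:  a = *sp++ * a;  break;  /* pop, multiply with a */
        case EXIT: return a;
        default:   return -1;              /* unknown opcode */
        }
    }
}

/* Evaluate 2 + 3 * 4 the way a simple expression compiler might emit it. */
static long demo(void)
{
    static const long prog[] = {
        IMM, 2, PSH, IMM, 3, PSH, IMM, 4, MUL, ADD, EXIT
    };
    return run(prog);
}
```

If the real ISA has equivalents of load-immediate, push, and pop-and-operate, each opcode here would map to a short instruction sequence; whether that holds for this particular CPU is an assumption.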

~~~
Rietty
I'm interested in what you mean by "wrote my own CPU". Could you elaborate
please?

~~~
pepijndevos
[https://github.com/pepijndevos/seqpu/blob/master/cpu.vhd](https://github.com/pepijndevos/seqpu/blob/master/cpu.vhd)

------
userbinator
Unfortunate that "CUCU", "C Interpreter", and "Small C for I386 (IA-32)" are
already dead links, although the page history shows that it was created only a
little over 2 years ago.

~~~
zserge
Sorry, blog layout has changed since then.
[https://zserge.com/posts/cucu-part1/](https://zserge.com/posts/cucu-part1/)

~~~
kragen
You might want to fix the links so the old URLs work again.

