
c4x86 - JIT compiler for x86 in 86 lines - dmytrish
https://github.com/EarlGray/c4
======
userbinator
This is amazingly elegant and concise. I've heard a lot of people, including
the compiler academics, claim that even the simplest non-optimising compiler
for x86 would be extremely complex; this is a great counterexample to that,
alongside (O)TCC. It looks like it's using mainly two registers only for code
generation, eax and ecx, and the stack.

Jack Crenshaw's "Let's build a compiler" tutorial series uses the same code
generation method but with eax/edx ([http://www.pp4s.co.uk/main/tu-trans-comp-
jc-intro.html](http://www.pp4s.co.uk/main/tu-trans-comp-jc-intro.html) ); not
sure if he was the first one to come up with it, however. Incidentally, I
think his tutorial series is one of the best ones I've read for understanding
how compilers work, even better than all the "traditional" literature like the
Dragon Book.

The other article that'll probably be very helpful for those understanding how
the rest of this compiler works is
[http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#climbing](http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#climbing)
which describes the "precedence climbing" way of parsing expressions.

Compilers are often thought of as "magic" that only very few people can
understand, but I don't think that has to be true in general. While
"production-quality" compilers like gcc are certainly very complex, the
_essence_ of a compiler is really quite simple. However, this wouldn't be the
impression one gets by reading a lot of the compiler books out there --- the
heavy emphasis on theory in the majority of them has a somewhat obfuscating
effect. They also tend to focus a lot more on compiler-compilers (Lex, Yacc)
and more general parsing techniques like LR, while spending very little on the
simple methods of syntax-directed translation and recursive descent. AFAIK
precedence-climbing is not even described in the Dragon Book.

~~~
abecedarius
I've never seen precedence climbing in _any_ compiler textbook, unless you
count the lcc book. (I learned of it instead from an unconventional algorithms
class by Dave Gillespie, who wrote Gnu Calc.) You can see how it gets squeezed
out of the curriculum since you can write a conventional parser just fine
without it, but it's a loss -- when you see it how can you not go "Oh yes,
that's sweet, that's just right"?

Crenshaw's series turned me off when the first parser he showed would
incorrectly accept incorrect inputs, and it could've been done rightly just as
easily -- or so it looked to me. I did not test it or read further, and since
it's so often recommended I seem to've missed something.

P.S. precedence climbing in Python:
[https://github.com/darius/sketchbook/blob/master/parsing/pre...](https://github.com/darius/sketchbook/blob/master/parsing/precedence_climbing.py)
. Pratt's top-down operator precedence parsing is essentially an object-
oriented way to express the same algorithm:
[https://github.com/darius/sketchbook/blob/master/parsing/pra...](https://github.com/darius/sketchbook/blob/master/parsing/pratt.py)

------
afandian
Previous discussion on C4:
[https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)

Looks like this is a fork of that.

~~~
tptacek
That discussion makes me a little sad for HN. Although 'userbinator has a
fantastic comment in there.

~~~
sitkack
link to userbinator's comment
[https://news.ycombinator.com/item?id=8560974](https://news.ycombinator.com/item?id=8560974)

------
chubot
What prevents c4x86.c from compiling with c4.c? Is it the <sys/mman.h> header?
(Note the "not self-hosted" comment in the README.)

~~~
dmytrish
It's `mmap` call and syntax of function pointer at the end of the file (needed
to call the compiled code).

P.S. Otherwise, it is ready to run itself, work in progress.

~~~
userbinator
Allowing it to call arbitrary functions using dlopen()/dlsym() would be the
idea I have for fixing the first problem - it'd also increase its power
substantially since it can then make use of any other library's functions
instead of the currently hardcoded list. Supporting function pointers would be
a bit harder and involve changing the parser/code generator.

~~~
hyc_symas
How would you tell it which shared library to dlopen?

Was looking into changing the order of arguments to go the right way 'round.
Also changing address refs to be PC-relative, to avoid the relocation phase.
Seems pretty straightforward.

~~~
dmytrish
I thought about `dlopen`, but did not know how to fetch it libc platform-
independently (`dlopen(argv[0])` is fine on OS X, but not on Linux).

Now I see that the first argument can be NULL, that approach is feasible.

------
joebo
This greatly helped my understanding of how compilers work.

How different would the implementation be for x64? Would it primarily be a
matter of translating opcodes and calling conventions? For example, pop ecx is
59 [1], and on x64 it looks like it would be pop rcx, which is also 59 [2][3].

[1] -
[http://sparksandflames.com/file/x86InstructionChart.html](http://sparksandflames.com/file/x86InstructionChart.html)

[2] -
[https://defuse.ca/online-x86-assembler.htm#disassembly](https://defuse.ca/online-x86-assembler.htm#disassembly)

[3] - [http://msdn.microsoft.com/en-
us/library/windows/hardware/ff5...](http://msdn.microsoft.com/en-
us/library/windows/hardware/ff561499%28v=vs.85%29.aspx)

~~~
userbinator
For x64 under both *nix and Windows sizeof(int) remains at 4 while the sizes
of pointers increase to 8 bytes (64 bits), so anything in the compiler that
depends on those sizes would also have to change. As for the opcodes, with
x86/x64 it's a little weird since 64-bit mode actually defaults to 32-bit data
sizes and uses mostly the same opcodes as 32-bit mode, with a prefix byte
needed to override this to 64 bits. Addresses default to 64 bits, however. The
32-bit environment is "nice" in that pointers and integers are the same size;
not true for most 64-bit in general
([http://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_mo...](http://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models)
)

But still, trying to fit the x64 version in 64 lines is probably going to be
the harder problem. ;-)

------
joebo
This also works on win32 under mingw32. I forked and added support for it:
[https://github.com/joebo/c4/blob/master/c4x86.c](https://github.com/joebo/c4/blob/master/c4x86.c)

------
jrockway
By 86 lines, do they mean 542 lines?

~~~
dmytrish
I meant the code generator size: it starts at line 448 and ends at 535, so its
size is about 86 lines. The rest of the code is taken from original c4 and is
not changed.

~~~
tomcam
..and SNAP! Great work, dmytrish. This is a godsend to people trying to fight
their way through traditional compiler texts. Thanks for a truly impressive
contribution to the language dev world.

~~~
dmytrish
Thank you. rswier's c4 was a great inspiration for me. I am glad If my humble
two cents are helpful for somebody.

------
haberman
If you enjoy this sort of thing you might enjoy my Brainfuck micro-JIT:
[http://blog.reverberate.org/2012/12/hello-jit-world-joy-
of-s...](http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-
jits.html)

------
sitkack
Here is another example of a microjit, this time in Python
[https://gist.github.com/leegao/1073233](https://gist.github.com/leegao/1073233)

------
cogburnd02
Is the upshot of this sort-of like 'tcc -run'?

~~~
dmytrish
At a quick glance, yes, `tcc -run` seems to output machine code into memory
for running it, so yes.

~~~
cogburnd02
Cool. Is that all that's required to use it instead of this?

    
    
      #!/usr/bin/tcc -run
    

Or does doing that require weird stdin-related Magic?

------
bithush
That is some beautiful code.

