

Clue 0.6 - C to Lua, Javascript, Java and Perl compiler - akavel
http://thread.gmane.org/gmane.comp.lang.lua.general/98036

======
david-given
Hello, author here.

Clue is an experiment and probably not useful for real work --- go look at the
memory model and you'll see why. (C89 allows for some _really_ weird but
standards-compliant architectures.) ints are about 56 bits wide and their
value is undefined on rollover, for example. Plus it's not finished; varargs
and switch are the big missing features.

Regarding the node version: because I forgot to update the version number on
the website. It was actually 0.6.19. Updated. JS performance is heavily
penalised due to the aforesaid goto issue, though. (Procedural languages
without goto are _toys_ , dammit.)

Regarding C optimisation: yes, precisely. However this does put LuaJIT at an
unfair advantage since it can unroll loops to its heart's content, while gcc
can't. It's probably worth rerunning at -O3 just to see what's different.

Incidentally, I have 2/3 of a Common Lisp backend (someone contributed a
backend but not the run-time library). Anyone want to complete it?

~~~
sedachv
Great work David, glad to see an update after almost 5 years!

I looked into Clue for compiling C to Common Lisp before writing
<https://github.com/vsedach/Vacietis> because I wanted something that would
interop better with CL types and be able to run self-contained.

------
killahpriest
One example of when goto is a good idea.

 _Why is Lua 5.2 so much faster than Lua 5.1? Lua 5.2 supports a new goto
keyword. This is incredibly useful when doing this kind of compilation as it
allows me to pass execution directly from basic block to basic block. Lua 5.1
doesn't have this, which means I have to fake goto using what boils down to a
switch statement. This is much less efficient._
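
A minimal sketch of the "fake goto" the quote describes, written in Python
only for illustration. It is not Clue's actual output; the function name and
block labels are hypothetical. Each basic block ends by naming the next block,
and a loop re-dispatches on that name, so every jump pays for a trip back
through the dispatcher.

```python
def collatz_steps_dispatch(n):
    """Count Collatz steps, using a state variable in place of goto."""
    steps = 0
    block = "test"  # label of the next basic block to run (hypothetical names)
    while True:
        # The dispatcher: every "jump" comes back here and is re-tested.
        if block == "test":
            block = "done" if n == 1 else "body"
        elif block == "body":
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
            block = "test"  # stands in for "goto test"
        elif block == "done":
            return steps
```

With a real goto, the end of "body" would jump straight to "test" with no
intermediate comparison against the block label.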

~~~
Dylan16807
Looking at the source "boils down to a switch statement" appears to be a chain
of ifs. It makes me wonder how fast it would have been to use tail calls.
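
A rough sketch of the tail-call idea, again in Python and with hypothetical
names. Each basic block becomes a function that hands control to the next
block. Lua guarantees proper tail calls, so there a block could simply
`return next_block(...)`; Python does not eliminate tail calls, so this sketch
assumes a small trampoline loop in its place.

```python
def block_test(n, steps):
    # In Lua this could be a true tail call: return block_body(n, steps)
    if n == 1:
        return None, (n, steps)  # no next block: we are done
    return block_body, (n, steps)


def block_body(n, steps):
    n = n // 2 if n % 2 == 0 else 3 * n + 1
    return block_test, (n, steps + 1)


def collatz_steps_tail(n):
    """Count Collatz steps by chaining block functions."""
    block, state = block_test, (n, 0)
    while block is not None:  # trampoline standing in for tail-call elimination
        block, state = block(*state)
    return state[1]
```

Here the jump target is a function reference rather than a label that has to
be compared against each candidate block in turn.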

Also possibly worth noting is that Lua 5.1 bytecode has gotos, if you're
willing to step down a level.

~~~
david-given
Clue 0.3 did precisely that. Unfortunately it required a patched Lua to
produce the special bytecode. I decided that that was cheating --- plus it
meant the Lua backend wouldn't generate code for LuaJIT.

Changing it to generate tail calls would be an excellent idea, but to do it
right would require a major rewrite of the backend (most of the code generator
is common code).

------
StavrosK
Am I understanding correctly? They compiled C to Lua and got comparable
performance with LuaJIT? That's amazing.

~~~
snogglethorpe
LuaJIT is amazing... :]

Keep in mind that the benchmark in question (whetstone) is at best a
microbenchmark, and doesn't necessarily reflect the performance of real
apps... Still, it demonstrates how well LuaJIT can nail this sort of tight
inner-loop code.

------
tezza
Looks good. I happen to need this soon-ish. I write a couple of cross platform
desktop apps in C.

There should eventually be a way to combine this with SDL to target Javascript
and canvas.

------
mscdex
I'm curious as to why they are still benchmarking against node 0.2.6, which is
quite old.

~~~
marios
I'm curious as to why they are benchmarking against code compiled with GCC -Os
instead of GCC -O3

~~~
hosay123
Almost everywhere (Linux kernel included IIRC) uses -Os now, as the effects of
the memory hierarchy are much more important than raw instructions per second.
A saved stall is worth hundreds of instructions.

~~~
mich41
Linux defaults to -O2; -Os can be switched on with CONFIG_CC_OPTIMIZE_FOR_SIZE.
Arch Linux doesn't enable it, dunno about others.

-Os isn't a silver bullet, since it enables use of some high-level instructions which may be implemented less efficiently on modern CPUs, and it reduces code alignment, possibly causing some short functions or loops to span multiple cache lines.

------
martinced
GOTO and weird optimization _may_ be very useful in some circumstances but
there's a case where they'll _never_ be desirable: security.

I'm 100% sure that the future, security-wise, is stuff like seL4 (the L4
microkernel, but formally verified to be free from a _lot_ of common mistakes
that typically lead to security exploits).

So before criticizing things as "toys" because they don't have goto etc.,
you have to realize that there are ends (security) that do justify quite some
means.

And don't get me wrong: I've done my fair share of 680x0 and 80x86 assembly
coding and just loved to be "in control" of everything. It gave me a sensation
of power.

But now I much prefer to look far ahead and dream about the days when we'll
be able to use provers not just on micro-kernels that are 7000 lines long
(where, even on such a trivial number of lines, they already found _hundreds_
of potential security exploits, which have all been fixed) but also on much
bigger programs.

So saying: _"I want to be able to modify a lookup table by accessing two bytes
as if they were a 16-bit word so that on the next pass I'll automagically JMP
to this place"_ (I'm just making that up) is, IMHO, a bit shortsighted when
considering the real problems we face today.

Most people have more than enough power and totally underused computers (often
with many cores idling). The problem is hardly CPU performance.

I'm not trading security for performance.

~~~
gngeal
You don't have to verify the code with gotos, you have to verify the
translator. After all, if you were serious about this, you'd have to stop
using most computers since they do actually run programs with JMPs in them.

