
Architecture for a JavaScript to C compiler - timruffles
https://timr.co/architecture-for-a-js-to-c-compiler
======
snek
I don't understand why people keep making JS-to-native compilers. The result
always performs worse than what modern JS engines can do with the original
source code, because all those runtime behaviours the author mentions build up
really fast. JS engines actually run the code and figure out the types and
such to create highly optimized machine code.

~~~
ridiculous_fish
Modern JS engines rely on tracing JITs to achieve good performance. These
optimizations only kick in once code has executed multiple times.

But a lot of JS code executes only once, such as layout code running at app
launch. Heck, lots of JS code executes _zero_ times. This code still imposes a
cost (parsing, etc).

Consider the complaints about app launch time of Electron apps. A static
compiler can be more effective at the runs-once or runs-zero cases.

~~~
pizlonator
Not tracing. Speculation.

The difference is that you compile user control flow as-is unless you have
overwhelming evidence that you should do otherwise.

And yes you are right. This is promising for run-once code, but all of the
cost will be in dynamic things like the envGet. An interpreter can actually do
better here because most environment resolution can be done as part of
bytecode generation. So it’s possible that this experiment leads to something
that is slower than JSC’s interpreter.

~~~
ridiculous_fish
Agreed that envGet() will be brutal, but optimizations can eliminate a lot of
that, e.g. by promoting locals to stack variables instead of heap variables.

It is possible to statically decide which environment contains each upvar. So
I don't understand your conclusion that this experiment may be slower than
JSC's bytecode interpreter. What information can JSC exploit at this stage
that is statically unavailable?
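
To make the contrast in this exchange concrete, here is a minimal sketch of the two lookup strategies. The `Env` layout and both function names are hypothetical and are not taken from the project or from JSC; values are plain doubles for brevity. The first function searches the scope chain by name at run time, the second assumes the compiler has already worked out how many scopes to hop and which slot to read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical environment record: one per scope. */
typedef struct Env {
    struct Env *parent;   /* enclosing scope, NULL at the top level */
    const char **names;   /* variable names, needed only by the dynamic path */
    double *slots;        /* variable values (numbers only, for brevity) */
    size_t count;         /* number of variables in this scope */
} Env;

/* Dynamic lookup: walk the scope chain comparing names at run time. */
static double *env_get_by_name(Env *env, const char *name) {
    for (; env != NULL; env = env->parent) {
        for (size_t i = 0; i < env->count; i++) {
            if (strcmp(env->names[i], name) == 0) {
                return &env->slots[i];
            }
        }
    }
    return NULL; /* a real runtime would raise a ReferenceError here */
}

/* Static lookup: hop count and slot index were fixed at compile time. */
static double *env_get_resolved(Env *env, size_t hops, size_t slot) {
    while (hops-- > 0) {
        assert(env->parent != NULL);
        env = env->parent;
    }
    assert(slot < env->count);
    return &env->slots[slot];
}
```

With an inner scope holding `c` and an outer scope holding `a` and `b`, a reference to `b` from the inner scope would compile to `env_get_resolved(env, 1, 1)` and involve no string comparisons. This only works when the scope chain is known statically, which constructs such as `with` and direct `eval` can prevent.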

~~~
arcticbull
Yeah, and escape analysis could be used to avoid putting variables that can't
be easily placed onto the stack in the scope of a garbage collector. Further
optimization work could be performed to compute variables with constant
results or to partially compute expressions. It's interesting, got me thinking
about building one.
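
As an illustration of "partially computing expressions", here is a small constant-folding sketch over a toy expression tree. The `Expr` type and `fold` function are hypothetical, cover only number literals, variables, `+` and `*`, and are not drawn from any of the projects discussed. Folding is limited to subtrees made entirely of number literals, since `+` on other JS types is not plain addition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EXPR_NUM, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    double value;            /* valid when kind == EXPR_NUM */
    struct Expr *lhs, *rhs;  /* valid for EXPR_ADD and EXPR_MUL */
} Expr;

/* Folds constant subtrees in place; returns true if `e` is now a literal. */
static bool fold(Expr *e) {
    switch (e->kind) {
    case EXPR_NUM:
        return true;
    case EXPR_VAR:
        return false;  /* value unknown until run time */
    case EXPR_ADD:
    case EXPR_MUL: {
        assert(e->lhs != NULL && e->rhs != NULL);
        /* Fold both children even if one is not constant (partial folding). */
        bool left_const = fold(e->lhs);
        bool right_const = fold(e->rhs);
        if (!(left_const && right_const)) {
            return false;
        }
        e->value = (e->kind == EXPR_ADD) ? e->lhs->value + e->rhs->value
                                         : e->lhs->value * e->rhs->value;
        e->kind = EXPR_NUM;
        e->lhs = NULL;
        e->rhs = NULL;
        return true;
    }
    }
    return false;
}
```

For `(2 * 3) + x`, the multiplication collapses to the literal `6` while the addition is left for run time because `x` is unknown.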

~~~
arcticbull
I got to thinking... with the right types, isn't it possible to take each JS
function and emit a Rust generic function over every JS type (value/object)
and have the Rust compiler emit optimized code for each specialization? Each
invocation can then be optimized by the Rust compiler at compile time, and so
long as there's no eval or dynamic dispatch in the translation unit, each call
should be maximally efficient and all unused paths can be trimmed statically.

------
truth_seeker
I wonder how much optimization it will bring compared to existing JS
runtimes such as V8.

Thanks to competition among web browsers, JS runtimes not only efficiently
parse to optimized native code but also provide really good JIT compilation
benefits.

Speculative optimization for V8 -
[https://ponyfoo.com/articles/an-introduction-to-speculative-...](https://ponyfoo.com/articles/an-introduction-to-speculative-optimization-in-v8)

Parallel and Concurrent GC -
[https://v8.dev/blog/trash-talk](https://v8.dev/blog/trash-talk)

Good summary on 10 years of V8 -
[https://v8.dev/blog/10-years](https://v8.dev/blog/10-years)

~~~
ridiculous_fish
v8-style engines do not parse to optimized native code.

As described in the links, v8 parses to an AST, which then is compiled to
bytecode. A bytecode VM then executes the JS, collecting runtime type
information, which is input (along with the bytecode itself) into the next
compilation tier; only at that point is machine code generated.

The key idea is that v8 expects to execute the JS code before it can generate
native code. It won't generate native code from parsing alone.

------
rkeene2
A lot of research and hard work has gone into TclQuadCode [0], which compiles
Tcl (which is even more dynamic than JavaScript) into machine code via LLVM.

The authors indicated at one point it took around 5 PhDs to get it going.

[0]
[https://core.tcl.tk/tclquadcode/dir?ci=trunk](https://core.tcl.tk/tclquadcode/dir?ci=trunk)

~~~
ridiculous_fish
This is bizarre and fascinating. I had no idea there were Tcl codebases of a
size that could benefit from this sort of perf work. How much Tcl is out
there?

------
ndesaulniers
I actually think this is possible, and started prototyping it (because esprima
is awesome, and not too many other languages have an equivalent that's so easy
to use).

Some thoughts: I think it's easier to target C++ than C, since C++ can help
you write more type generic code. I think it's easy to generate tagged unions,
then for optimizations try to prove monomorphism. Finally, it may be simpler
to start off with support for TypeScript, and fail to compile if there are any
`any` types. I do think it's possible though. JS/TS -> C++ -> WASM (yes, I was
out of my mind when I thought of this)

~~~
johnhenry
Isn't this kind of what V8 and other modern JavaScript engines do on the fly
already?

~~~
mikece
Yes: one of the Google engineers working on V8 talked about it here:
[https://softwareengineeringdaily.com/2018/10/03/javascript-a...](https://softwareengineeringdaily.com/2018/10/03/javascript-and-the-inner-workings-of-your-browser/)

It was this conversation that made me wonder if at some point in the future V8
might have experimental native support for TypeScript, but compiling
WebAssembly to a native binary probably makes more sense. Who knows? It's an
awesome time to be a programmer!

------
maxxxxx
With all the dynamic stuff JavaScript has, it seems really difficult to create
performant C code.

There is a PHP to .NET compiler which probably has similar problems. On second
thought, that one is probably easier because .NET has a dynamic runtime.

~~~
nicoburns
That, and in practice modern PHP is often relatively statically typed (classes
declare their fields, etc), and many codebases even include type annotations
(which are supported in PHP and usually enforced at runtime).

------
tannhaeuser
Congrats on completing this project. What's the status, and what are the
further plans for it? I didn't find a license.

------
ridiculous_fish
How are exceptions handled with this design? For example the `n < 3` may throw
(it invokes the valueOf method).

~~~
timruffles
I went for longjmp(), non-local goto, for precisely the reason you
highlighted: I realised pretty well everything in JS can throw, so needed
something easy to trigger from anywhere

See
[https://github.com/timruffles/js-to-c/blob/1befbf4220753576e...](https://github.com/timruffles/js-to-c/blob/1befbf4220753576e2d08c3607ab445c3ddad9ea/runtime/exceptions.c)
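
For readers unfamiliar with the technique, here is a minimal sketch of try/catch built on `setjmp`/`longjmp`. It is not the code from the linked `exceptions.c`; every name below (`current_handler`, `js_throw`, `guarded_compare`) is hypothetical, and the `value_of_throws` flag merely stands in for a `valueOf` method that throws during `n < 3`.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical runtime state: innermost active handler and pending error. */
static jmp_buf *current_handler = NULL;
static const char *pending_error = NULL;

/* Throw from anywhere: record the error and jump to the nearest handler. */
static void js_throw(const char *message) {
    assert(current_handler != NULL); /* sketch: uncaught errors not handled */
    pending_error = message;
    longjmp(*current_handler, 1);
}

/* Stand-in for `n < 3`, where the implicit valueOf call may throw. */
static int less_than_three(double n, int value_of_throws) {
    if (value_of_throws) {
        js_throw("valueOf failed");
    }
    return n < 3.0;
}

/* Rough equivalent of: try { return n < 3 ? 1 : 0 } catch (e) { return -1 } */
static int guarded_compare(double n, int value_of_throws) {
    jmp_buf handler;
    jmp_buf *outer = current_handler; /* saved so handlers can nest */
    current_handler = &handler;
    if (setjmp(handler) != 0) {
        /* Reached via longjmp: this is the catch branch. */
        current_handler = outer;
        return -1;
    }
    int result = less_than_three(n, value_of_throws);
    current_handler = outer; /* normal exit from the try block */
    return result;
}
```

Saving and restoring the previous handler is what lets nested try blocks work. Note that `longjmp` skips any cleanup in the frames it unwinds, so a real runtime also needs a strategy for resources held by those frames.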

------
maxgraey
I wonder why people still try to write transpilers from JS to C/C++ or LLVM
(which at least makes sense). This approach is not performant and usually
produces much more overhead than a JIT VM that uses speculative optimizations.

Some projects:

1\. [https://github.com/fabiosantoscode/js2cpp](https://github.com/fabiosantoscode/js2cpp)
2\. [https://github.com/raphamorim/js2c](https://github.com/raphamorim/js2c)
3\. [https://github.com/ammer/js2c](https://github.com/ammer/js2c)
4\. [https://github.com/NectarJS/nectarjs](https://github.com/NectarJS/nectarjs)
5\. [https://github.com/ovr/StaticScript](https://github.com/ovr/StaticScript)

