

How JavaScript compilers work - wx196
http://creativejs.com/2013/06/the-race-for-speed-part-2-how-javascript-compilers-work/

======
rayiner
The terminology in the article is somewhat idiomatic: "After the Directed Flow
Graph (DFG) or syntax tree has been generated the compiler can use this
knowledge to perform further optimisations prior to the generation of machine
code. Mozilla’s IonMonkey and Google’s Crankshaft are examples of these DFG
compiler."

"DFG" usually refers to some sort of dependence graph. What IonMonkey and
Crankshaft have is a traditional CFG (control flow graph) in SSA form. That
means the code is represented as a set of straight-line code paths (basic
blocks) connected by edges in a graph representing possible control-flow
transitions. The instructions are in SSA form, which means that each variable
is assigned only once and pseudo-instructions called "phi instructions" are
used to merge results when a basic block is reachable from multiple
predecessor blocks, more than one of which might assign the same variable. A
"syntax tree" is different yet. It's a higher-level representation that
preserves the syntactic structure of the code. For example, it represents
loops and "if" statements as nested elements in the tree. Translation to a CFG
throws away most of that information (e.g. reducing loops to simple
control-flow edges in the CFG).
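
To make the CFG/SSA description concrete, here is a minimal sketch of a toy CFG
in SSA form, written as plain JavaScript data plus a tiny interpreter. The
block names, opcodes, and field names are all invented for illustration; this
is not how IonMonkey or Crankshaft actually lay out their IR.

```javascript
// Toy CFG in SSA form for the (hypothetical) source:
//   var x; if (c) { x = 1; } else { x = 2; } return x;
// Each property is a basic block: a straight-line list of instructions
// that ends in a control-flow instruction (branch, jump, or return).
const cfg = {
  entry: [{ op: "branch", cond: "c", ifTrue: "left", ifFalse: "right" }],
  left: [{ op: "const", dest: "x1", value: 1 }, { op: "jump", target: "join" }],
  right: [{ op: "const", dest: "x2", value: 2 }, { op: "jump", target: "join" }],
  join: [
    // "join" has two predecessors, so a phi selects x1 or x2
    // depending on which predecessor control arrived from.
    { op: "phi", dest: "x3", from: { left: "x1", right: "x2" } },
    { op: "return", value: "x3" },
  ],
};

// Walk the graph one basic block at a time, remembering the predecessor
// so that phi instructions know which incoming value to pick.
function run(graph, inputs) {
  const values = { ...inputs };
  let block = "entry";
  let pred = null;
  for (;;) {
    let next = null;
    for (const ins of graph[block]) {
      if (ins.op === "const") values[ins.dest] = ins.value;
      else if (ins.op === "phi") values[ins.dest] = values[ins.from[pred]];
      else if (ins.op === "branch") next = values[ins.cond] ? ins.ifTrue : ins.ifFalse;
      else if (ins.op === "jump") next = ins.target;
      else if (ins.op === "return") return values[ins.value];
    }
    if (next === null) throw new Error("block '" + block + "' has no terminator");
    pred = block;
    block = next;
  }
}

// SSA property: every name is the destination of exactly one instruction.
function isSSA(graph) {
  const seen = new Set();
  for (const instrs of Object.values(graph)) {
    for (const ins of instrs) {
      if (ins.dest === undefined) continue;
      if (seen.has(ins.dest)) return false;
      seen.add(ins.dest);
    }
  }
  return true;
}
```

Note that the "if"/"else" nesting from the source is gone: all that remains is
four blocks and the edges between them, which is the information loss
described above.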

For a (relatively) approachable set of materials on the subject, see:
[http://www.cs.rice.edu/~keith/512/2011/Lectures](http://www.cs.rice.edu/~keith/512/2011/Lectures).
I also highly recommend Cooper & Torczon's "Engineering a Compiler" (2nd ed.).
In a world of CS researchers who can't write an English sentence to save
their lives, Keith Cooper and Linda Torczon's work is a paragon of clear
prose. (Their lab at Rice is where Cliff Click got his PhD, and if you've read
his work on the JVM, you know that he too has a talent for explaining
complicated concepts in plain English.)

~~~
joshuacc
"The terminology in the article is somewhat idiomatic"

Do you mean idiosyncratic? Not trying to nitpick. I was genuinely confused
until I concluded that it must be a typo.

~~~
rayiner
Sorry, yes, idiosyncratic. "DFG" is the term used in the WebKit JS engine.

------
deweerdt
The 4 parts:

- The JavaScript family tree:
[http://creativejs.com/2013/06/the-race-for-speed-part-1-the-...](http://creativejs.com/2013/06/the-race-for-speed-part-1-the-javascript-engine-family-tree/)

- How compilers work:
[http://creativejs.com/2013/06/the-race-for-speed-part-2-how-...](http://creativejs.com/2013/06/the-race-for-speed-part-2-how-javascript-compilers-work/)

- JavaScript compiler strategies:
[http://creativejs.com/2013/06/the-race-for-speed-part-3-java...](http://creativejs.com/2013/06/the-race-for-speed-part-3-javascript-compiler-strategies/)

- The future for JavaScript:
[http://creativejs.com/2013/06/the-race-for-speed-part-4-the-...](http://creativejs.com/2013/06/the-race-for-speed-part-4-the-future-for-javascript/)

~~~
imissmyjuno
Looks like the server is under fire, and Google didn't cache any of those
links, either. Tears of desire.

------
WalterBright
Source for a JavaScript compiler implemented in the D programming language:

[https://github.com/DigitalMars/DMDScript](https://github.com/DigitalMars/DMDScript)

~~~
thejsjunky
Nice. Interested readers may also want to check out Higgs:
[https://github.com/maximecb/Higgs](https://github.com/maximecb/Higgs)

------
leetrout
So I have a question I'm sure someone here is smart enough to answer. Simply
put, does ASI (automatic semicolon insertion) affect compilation speed?

In my head I would imagine that a compiler would expect to find semicolons as
statement terminators but would have to reprocess a block if validation failed
and try to figure out where the semicolons should be, and therefore ASI would
slow down compilation (vs a file with all semicolons in place).

~~~
chc
Why would you expect that to be true in JavaScript any more than other
semicolon-optional languages like Ruby and Python?

You just write your parser to accept both line endings (that meet certain
conditions) and semicolons as statement terminators. Going through and
literally inserting semicolons shouldn't be necessary.

~~~
timothya
My understanding (though I'm not an expert) is that the way the JavaScript
spec is written, you first try to parse a line as is, and if you encounter a
parsing error in trying to do that and the line doesn't have a semicolon, then
you insert a semicolon and you try again. The end result is that it's a bit
different than semicolons being optional; they are instead inserted
automatically to try to recover from parse errors.
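
A small sketch of what that looks like in practice. It uses the Function
constructor so the line breaks inside each source string are exactly what the
parser sees; the helper and variable names are made up for the example, and it
only shows observable behaviour, not how any particular engine implements it.

```javascript
// Returns true if the source parses as a function body, false on SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// 1. The token after "1" cannot continue the statement and sits on a new
//    line, so a semicolon is inserted before it and the code runs.
const sum = new Function("var a = 1\nvar b = 2\nreturn a + b");

// 2. The same tokens with no line break: nothing is inserted, parse error.
const sameLineParses = parses("var a = 1 var b = 2");

// 3. No insertion when the next line is a valid continuation:
//    "f" followed by "(3)" on the next line is parsed as the call f(3).
const called = new Function(
  "var f = function (n) { return n * 2 }\nvar y = f\n(3)\nreturn y"
);

// 4. "return" is a special case: a line break directly after it ends the
//    statement, even though "return 42" would have parsed without error.
const lost = new Function("return\n42");

console.log(sum());          // 3
console.log(sameLineParses); // false
console.log(called());       // 6
console.log(lost());         // undefined
```

Cases 3 and 4 are the ones that tend to surprise people: in 3 there is no
parse error, so nothing is inserted, and in 4 a semicolon is inserted even
though there would have been no parse error without it.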

~~~
mattdawson
FWIW, this is also the way O'Reilly's JS Pocket Reference explains it.

~~~
leetrout
Thanks, that's what I had read.

I also found this set of tests:
[http://jsperf.com/asi-versus-msi](http://jsperf.com/asi-versus-msi)

My results on Opera Next (v15)
[http://imgur.com/hkAopfX](http://imgur.com/hkAopfX) show MSI fastest, FWIW.

------
acqq
A very nice text with nice code examples is:

[http://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inli...](http://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inline-caches.html)

~~~
thejsjunky
If anyone wants to delve more into the workings of a JS VM, I've found the
code of Higgs
([https://github.com/maximecb/Higgs](https://github.com/maximecb/Higgs)) to be
very straightforward and readable. The interpreter and JIT are written in D
and much of the run-time is written in JS. It's a successor to the Tachyon
project mentioned in that post.

------
frozenport
How is the JavaScript JIT different from the Java JIT?

