"DFG" usually refers to some sort of dependence graph. What IonMonkey and Crankshaft have is a traditional CFG (control flow graph) in SSA form. That means the code is represented as a set of straight line code paths (basic blocks) connected edges in a graph representing possible control-flow transitions. The instructions are in SSA form, which means that each variable is assigned only once and pseudo-instructions called "phi instructions" are used to merge results when a basic block is accessible from multiple predecessors blocks, both of which might assign the same variable. A "syntax tree" is different yet. It's a higher level representation that preserves the syntactic structure of the code. For example, it represents loops and "if" statements as nested elements in the tree. Translation to a CFG throws away most of that information (e.g. reducing loops to simply control flow edges in the CFG).
For a (relatively) approachable set of materials on the subject, see: http://www.cs.rice.edu/~keith/512/2011/Lectures. I also highly recommend Cooper & Torczon's "Engineering a Compiler" (2nd ed.). In a world of CS researchers who can't write an English sentence to save their lives, Keith Cooper and Linda Torczon's work is a paragon of clear prose. (Their lab at Rice is where Cliff Click got his PhD, and if you've read his work on the JVM, you know that he too has a talent for explaining complicated concepts in plain English.)
Do you mean idiosyncratic? Not trying to nitpick. I was genuinely confused until I concluded that it must be a typo.
- How compilers work: http://creativejs.com/2013/06/the-race-for-speed-part-2-how-...;
In my head I would imagine that a compiler would expect to find semicolons as statement terminators but would have to reprocess a block if validation failed and try to figure out where the semicolons should be, and therefore ASI would slow down compilation (vs a file with all semicolons in place).
Furthermore, parsing source text initially into a representation that can be worked with is like 0.000001% of the total work to be done. And it is cached too.
Another example is PHP, where you can open and close <?php ?> tags as many times as you like and it will perform exactly the same as equivalent code with a single open and close tag. Especially if you use PHP opcode caching, where the source text is not even used for subsequent requests.
Having said that, I'm sure it's such a small part of the compilation time as to be pretty insignificant.
You just write your parser to accept both line endings (that meet certain conditions) and semicolons as statement terminators. Going through and literally inserting semicolons shouldn't be necessary.
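A minimal sketch of that idea, for a toy language rather than real JavaScript: the splitter below treats `;` as a terminator unconditionally, and a newline as a terminator only when the statement so far does not end in a dangling operator. The names `tokenize`, `splitStatements` and `BINARY_OPS` are hypothetical, and the newline condition is a deliberate simplification, not the actual ECMAScript ASI rules.

```javascript
// Toy tokenizer: newlines, identifiers, numbers, and single-character symbols.
// Other whitespace is skipped because no alternative matches it.
function tokenize(src) {
  return src.match(/\n|[A-Za-z_]\w*|\d+|[^\s]/g) || [];
}

// Assumed set of tokens after which a statement cannot end (toy language only).
const BINARY_OPS = new Set(["+", "-", "*", "/", "=", ","]);

function splitStatements(src) {
  const statements = [];
  let current = [];
  const flush = () => {
    if (current.length > 0) statements.push(current.join(" "));
    current = [];
  };
  for (const tok of tokenize(src)) {
    if (tok === ";") {
      // An explicit semicolon always ends the statement.
      flush();
    } else if (tok === "\n") {
      // A newline ends the statement unless an operator is left dangling.
      const last = current[current.length - 1];
      if (!BINARY_OPS.has(last)) flush();
    } else {
      current.push(tok);
    }
  }
  flush();
  return statements;
}
```

Calling `splitStatements("a = 1\nb = 2;c = a +\n b")` yields three statements, the last being `c = a + b`. Nothing is ever inserted into the source text; the grammar just has two ways to end a statement.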
Python does make some exceptions, in that newlines within balanced parentheses and brackets are not treated as statement separators. But outside of that, this is what they mean.
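The bracket rule is easy to sketch: track nesting depth and only treat a newline as a separator at depth zero. This is an illustration only; `splitLogicalLines` is a made-up helper, and it ignores strings, comments and backslash continuations, which a real lexer would have to handle.

```javascript
// Split source text into logical lines, treating a newline as a separator
// only when it appears outside any (), [] or {} pair.
function splitLogicalLines(src) {
  const OPEN = "([{";
  const CLOSE = ")]}";
  const lines = [];
  let depth = 0;
  let current = "";
  for (const ch of src) {
    if (OPEN.includes(ch)) depth++;
    else if (CLOSE.includes(ch)) depth = Math.max(0, depth - 1);

    if (ch === "\n" && depth === 0) {
      // Top-level newline: end of a logical line.
      if (current.trim() !== "") lines.push(current.trim());
      current = "";
    } else {
      // Inside brackets a newline is just whitespace.
      current += ch === "\n" ? " " : ch;
    }
  }
  if (current.trim() !== "") lines.push(current.trim());
  return lines;
}
```

For example, `splitLogicalLines("x = f(1,\n2)\ny = 3")` gives two logical lines, `x = f(1, 2)` and `y = 3`.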
As an example of the difference, consider the following JS:
Another example. This is legal JS:
I also found this set of tests http://jsperf.com/asi-versus-msi
My results on Opera Next (v15) http://imgur.com/hkAopfX show MSI fastest, FWIW.