Esp. section 5.3 onwards, starting on physical PDF page 56.
Will they be giving up some speed to claw back some memory?
"Hot" or optimizable code will still be optimized and run just as fast.
But basically V8 has heuristics built in that decide when the cost of compiling is worth the likely benefit.
Yeah, I get it: we have a lot of memory these days. Gigabytes. Not always enough for images and videos, but for code? For client-side browser code that is downloaded and run on the fly? Makes you wonder how we got there.
The architecture sounds a little unusual:
"The interpreter itself consists of a set of bytecode handler code snippets, each of which handles a specific bytecode and dispatches to the handler for the next bytecode. These bytecode handlers are written in a high level, machine architecture agnostic form of assembly code, as implemented by the RawMachineAssembler class and compiled by Turbofan"
It also seems as if all calls are mediated by code generated by the compiler, which has the advantage of avoiding the awkwardness of different calling conventions between native and bytecode functions (possibly at some cost to performance?).
Fascinating reading. Thanks V8 people for allowing such documents to be public!
When written in C, this kind of threaded dispatch is typically done using GCC's computed goto and the && label address-of operator extension.
Writing the handlers in machine-agnostic assembly is interesting; I'm guessing they want more control over the generated code than writing the handlers in C would give them, or they can't rely on something like GCC's computed gotos.
Worth noting that IIRC, for a while LuaJIT in interpreted mode was able to beat V8 in optimized mode not all that infrequently (although it depended on the use case, they are different languages, and I do not know if this is still the case).
V8 had no optimizing compiler when Mike Pall sent his (in)famous mail about "LuaJIT interpreter beating V8 compiler".
Also, the usual disclaimers about cross-language benchmarks apply (e.g. nobody looked at how those benchmarks differ between the JS and Lua implementations).
Very suboptimal might be a slight overstatement. Given a known register calling convention, I can see a way to write an interpreter as tail calls and post-process the machine code to turn each CALL into a JMP. Guaranteed tail calls would save you a bunch of effort, and a fixed register calling convention would give you some guarantees about consistent register allocation.
It actually seems a bit of a shame that V8 and Nashorn are competing despite heading towards very similar architectures.
The rest is familiar enough, true.
While it's great that this is part of the culture, discoverability and versioning were still problems, i.e. they had to be handled yourself.
In Lisp we have compiler options: (declare (optimize ...)).
SpiderMonkey is three-tier and JSC is four.
It's strange that there's only one mention of Crankshaft in the entire document, but TurboFan is all over the place. Are they also planning to get rid of the former?
Actually it's not the colon; there are two protocols: