Chromium Blog: A New Crankshaft for V8 (chromium.org)
242 points by twapi on Dec 7, 2010 | 90 comments



The Chrome javascript engine team is simply a beast.


I'm curious to see if they broke any code, like the IE engine did recently. For example, with loop-invariant code motion, what is legal in a language like C may not be in JavaScript (for the same reason the IE DCE optimization was invalid).

I'd find it hard to believe that Goog would make the same mistake after all the hullaballoo, but I'd love to see it validated.


I'd like to have a guide for writing code that is easily optimized by V8 and similar engines. Having local variables is good, as far as I can tell, but it would be nice to have a full overview of the dos and don'ts.


"Having local variables is good"

Reading sentences like this scares me, because it reminds me that some people don't know what the 'var' keyword means, or think it's acceptable to shove everything into window or global.
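
A minimal sketch of the difference (names are arbitrary):

    function f() {
      x = 1;      // no var: creates (or overwrites) a global, window.x in browsers
      var y = 2;  // with var: local to f
    }
    f();
    console.log(typeof x); // "number": x leaked into the global scope
    console.log(typeof y); // "undefined": y stayed local to f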


Last I read, JavaScript has 2 scopes: function scope, and global scope. Unlike many languages, there is no block scope (e.g. a counter in a for loop).



It's definitely not available in the current JavaScript interpreters in the various browsers.


On the other hand, JavaScript allows you to nest functions, unlike many languages.
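
For example, a minimal sketch of a nested function closing over its enclosing function's variables:

    function makeCounter() {
      var count = 0;           // local to makeCounter
      return function () {     // nested function closing over count
        count++;
        return count;
      };
    }

    var next = makeCounter();
    next(); // 1
    next(); // 2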


I feel dumb now because I fear I don't understand what you mean when you say block scope...

    for(var i=0; i<3; i++) { console.log(i); }
appears to be valid... (or just in Chrome?)


Valid, just doesn't do what you think.

    function foo() {
      var x = 1;
      if (true) {
        var x = 2;
        var y = 3;
      }
      console.log(x);
      console.log(y);
    }

    foo(); // 2, 3 (with block scope it would print 1 and then throw a ReferenceError for y)


Thanks for the explanation!


JS performs "variable hoisting", so your example is interpreted as:

    function(){ var i; … for(i=0; …
It has surprising side effects:

    alert(i);
    var i;
works and outputs "undefined", which is the value of the declared-but-not-yet-assigned variable i.

    alert(k);
    var i;
This is an error, and will fail with a ReferenceError, because k is never declared.


The IE team didn’t break anything in shipping software, right? This is just an IE9 beta problem, presumably fixable before they ship a stable IE9?


Absolutely. It wasn't even a beta, it was just a tech preview. If they didn't have bugs of that sort, they wouldn't be pushing things hard enough.


Loop-invariant code motion is still legal in JS and most dynamic languages, as is DCE. You just need to take into account the weird semantics of JS.

What the IE team got wrong was the assumptions they made about the semantics of the code that was optimized. If they had checked for the presence of a valueOf property, they could still have done the optimization.
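
A sketch of why that check matters (illustrative only, not the actual Chakra or V8 logic): an expression that looks loop-invariant can carry an observable side effect through valueOf, so hoisting it changes behavior.

    var calls = 0;
    var obj = {
      valueOf: function () { calls++; return 42; }  // side effect on every coercion
    };

    function loop() {
      var sum = 0;
      for (var i = 0; i < 3; i++) {
        sum += obj + 0;  // looks invariant, but coerces obj (calling valueOf) each time
      }
      return sum;
    }

    loop();
    // calls === 3; hoisting "obj + 0" out of the loop would leave calls === 1,
    // an observable difference, so the motion is only legal once the compiler
    // has proven valueOf to be side-effect free.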


Exactly. I'm not saying those opts aren't legal, but cases that look legal in C may not be legal in JS. And note, you can still do it in the presence of a custom valueOf method, as long as that method has no side effects (and is thus itself loop-invariant).

In essence I'm asking whether Google does in fact do this check, and does the analysis to ensure that the valueOf method doesn't get redefined dynamically within the loop itself.


"...performance of JavaScript property accesses, arithmetic operations, tight loops..."

Does this mean Crankshaft includes a tracing JIT like Firefox's? The layman-speak confuses me.


Yes, look at the list of the four main components.


I don't think that says it uses a tracing compiler (naturally the terms are vague in this field, so I'm not certain). Their architecture looks much more like HotSpot than TraceMonkey.


Especially considering that HotSpot and V8 were designed by the same person.


I don't think this is true. Do you have a reference for that?


This may not be 100% literally true, but it is definitely true in spirit. (The previous poster is referring to Lars Bak, but both Hotspot and V8 are/were team efforts.) The family tree here is:

  self->hotspot->V8
But, yes, Lars is the man.


    self->hotspot->Resilient Smalltalk Embedded Platform->V8
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.84.7...


OK, you got me on that one. I've been a professional Smalltalk programmer for almost 20 years and I've never heard of the Resilient Smalltalk Embedded Platform.


a.k.a. OOVM. It took a novel and interesting approach: use Eclipse as the IDE to edit text files (there was a syntax for class definitions) and sync bytecode with an image on a remote device over TCP/IP.



It will be interesting to see how this optimization affects V8's memory profile (and how that in turn affects the currently slim memory profile of Node.js).


Looks like this Node patch includes Crankshaft: https://github.com/ry/node/commit/c30f1137121315b0d3641af6dc...


It will also be interesting to see if this improves performance of long running Node.js scripts.


When Chrome was released two years ago, I noticed a significant difference in speed. Nowadays I think the announcements of JavaScript performance improvements are a bit overenthusiastic. The only real-world benchmark mentioned in the article is that Gmail loads 12% faster. What JavaScript apps are constrained by performance, and what are Crankshaft's effects on them?


I think the point of these optimizations is that they allow javascript-intensive applications to be developed that would otherwise have just been too slow.

In other words, they're paving the way for the future more so than trying to squeeze every last ounce of speed from current applications (which just happens to be a great side effect of their work).


Yeah, you won't see a lot of applications that benefit from these optimizations immediately, because if there were any, that would mean they had been written before there was any device capable of running them.


I'm happy that the various javascript teams are developing towards where the web is going rather than where it is.

"A good hockey player plays where the puck is. A great hockey player plays where the puck is going to be." — Wayne Gretzky


I don't think the GP would disagree with you. I believe he was asking what specific apps benefit today (besides GMail which was mentioned in the blog).


The coming WebGL games desperately need all the performance they can get.


I'm guessing you're right. I was responding to the tone of "Nowadays...overenthusiastic" — in support of enthusiasm — and to the notion of naming specific current apps — which I'm imagining as the puck's current location.

We may not even be equipped to answer that question. There are a lot of web-based but private & internal corporate applications that we'll never know about, and can't hope to name.


Much web-client JS is constrained by DOM speed, so this will only have an incremental effect. However, for Node apps this will be pretty significant.


> What JavaScript apps are constrained by performance and what are Crankshaft's effects on them?

Canvas and WebGL games.


Unreal. How much theoretical headroom is left to optimize JS compiler performance?

I had assumed we were reaching some theoretical upper bound, because all the major engines were on par in terms of performance.


To a first approximation, my answer would be something like http://shootout.alioth.debian.org/u32/benchmark.php?test=all... .

That's not necessarily the whole answer and I imagine JS can't ever quite go that fast. But still....


> and I imagine JS can't ever quite go that fast

I'm not the most knowledgeable person on the subject, but from what I understand, there is no theoretical reason that JS couldn't go that fast. The two languages are more similar than they are different, even if JavaScript is quite a bit more complex.

I remember Mike Pall saying something similar in an LTU thread some time ago.


> I remember Mike Pall saying something similar in an LTU thread some time ago.

He did, and the Mozilla guys pointed out how this wasn't the case. The languages are very similar, but JS has some weird semantics due to how things are scoped. (ref. the Chakra optimization brouhaha a month ago)


Going the other way: what small changes in languages can we make to make them much more optimize-able while at the same time keeping (most of) the expressiveness?


It's a tradeoff between compatibility with existing JS versus perf/capability enhancements. The ECMAScript committee goes back and forth on this topic. The biggest stride in that direction was strict mode, which is intended to catch most of the low-hanging fruit. I know Brendan Eich has mentioned a number of things that could change to make the language faster. I don't, however, know if I read it all in one place or if I'm combining multiple one-off examples.

I'm not confident enough in my memory to write out a probably incorrect list of things I remember. Here's a list of the various things I've read from Brendan in case you're interested in tracking it down:

http://lambda-the-ultimate.org/node/3851#comment-57671

This is the LtU thread my previous comment referred to. It's a large (and fantastic!) thread, but I believe that comment and children are the most direct comments back and forth between Mike and Brendan. Andreas Gal is also on the thread and from Mozilla.

http://www.aminutewithbrendan.com/

Brendan's weekly JS podcast. The episodes are relatively accessible and generally cover a lot of ground. A good way to get into the language zeitgeist.

http://brendaneich.com/

Brendan's blog, mostly focuses on ES Harmony stuff and Mozilla specific topics.


Would this imply that Lua has reached some pinnacle of speed and can't go any faster? That seems to be a side effect of your statement.

I'm not familiar with Lua, beyond reading an article or two about it, but does its simplicity imply some sort of maximal efficiency? Are the developers behind Lua simply the best programmers in the world and already have everything figured out with regard to optimizing a JIT? I'm not arguing with you...it does seem like that's a reasonable goal for JavaScript JITs to strive for in the near future. But, it doesn't really answer the question of how much better performance can get (in JavaScript or Lua or any other language). Past performance is not necessarily indicative of future performance when so many people are working on the problem from so many angles.


Browsing the alternatives in the shootout dataset, LuaJIT appears to be the fastest of the dynamic languages and feature-wise matches Javascript well enough to be a fair benchmark.

You can gain another factor of 2 or so in speed by going to a static language like C or Ada, but that isn't really a fair comparison and you can see the price paid in code size.

The good news for the web is that there may be another factor of 2 to 3 available for Javascript speedup.


You can also go to a static language like OCaml, which doesn't blow up your code size but is still fast.


LuaJIT is a major outlier, probably the most surprising result in the entire shootout. In point of fact there isn't much more LuaJIT can do without flat-out exceeding C, which isn't going to happen on the shootout to any significant degree any time soon.

(The conventional "JIT can be faster than compiled code" argument doesn't apply because the problems are accurately known by the author of the shootout benchmark code in advance, so, for instance, if there's a speed advantage to sticking with 'char' where you might have been tempted to write 'int', the C shootout code already does that.)


That argument is often made for JITs, but I have never seen a real-world example where the extra runtime knowledge a JIT has achieves something a static compiler couldn't do better, except in cases where runtime code loading is used.


Alias analysis is a good example. A JIT compiler may speculatively add dynamic disambiguation guards (p1 != p2 ==> p1[i] cannot alias p2[i]). If the assumption turns out to be wrong, the JIT compiler dynamically attaches a side branch to the guard using the new assumption (p1 == p2 ==> p1[i] == p2[i], which is an even more powerful result).

Doing this in a static compiler is hard, because it would have to compile both paths for every such disambiguation possibility. This quickly leads to a code explosion. You'd need very, very smart static PGO to cover this case: there are no branch probabilities to measure, since the compiler doesn't know that inserting such a branch might be beneficial. It may only derive this by running PGO on code which has these branches, which leads to the code explosion again.
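
A source-level sketch of the effect (the function and both branches are hypothetical, written out in JS just to make the idea concrete; the real transformation happens on the compiler's IR):

    function addScaled(dst, src, k, n) {
      if (dst !== src) {
        // Guard passed: dst and src are distinct arrays, so dst[i] and
        // src[i] cannot alias and loads of src[i] can be cached freely.
        for (var i = 0; i < n; i++) dst[i] = dst[i] + k * src[i];
      } else {
        // Side branch attached when the guard fails: dst === src implies
        // dst[i] === src[i], an even stronger fact to optimize with.
        for (var i = 0; i < n; i++) dst[i] = (1 + k) * dst[i];
      }
    }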

Auto-vectorization is another example: a static compiler may have to cover all possible alignments for N input vectors and M output vectors. This can get very expensive, so most static compilers simply don't do it and generate slower, generic code. A JIT compiler can specialize to the runtime alignment and even compile secondary branches in case the alignment changes later on (e.g. a filter fed with different kernel lengths at runtime).


I agree in general, although I will point out that virtually no C developers use PGO while it's on by default in HotSpot and now V8. (Of course, it looks like Java needs PGO just to try to catch up with gcc -O3.)


Not strictly the same but take a look at the CPU world:

VLIW (compilers try to optimize processing based on static knowledge, e.g. Intel's Itanium) vs. the current Intel CPUs (based on the P3/P4 architecture), which dynamically allocate resources depending on runtime knowledge.

Runtime information can help compilers. Just look at profile guided optimizations in current static compilers.

The real trouble in JIT compilers is usually that the target language's semantics are very high-level. For example, an integer in C is machine-sized and is not expanded in size to fit its value, unlike in some dynamic languages.


http://weblogs.java.net/blog/2008/03/30/deep-dive-assembly-c...

This link doesn't quite give what you are after (it's mostly about static compilation in the Java HotSpot compiler), but I believe the lock elision features (http://www.ibm.com/developerworks/java/library/j-jtp10185/in...) have to be done at runtime in the JVM (because of late binding).

Obviously this doesn't totally invalidate your argument ("except in the cases where runtime code loading is used"), but it is worth noting that in many languages late binding is normal, and so this is the general case.

Also, HP's research Dynamo project "inadvertently" became practical. Programs "interpreted" by Dynamo are often faster than if they were run natively. Sometimes by 20% or more. http://arstechnica.com/reviews/1q00/dynamo/dynamo-1.html


This is a common misinterpretation of the Dynamo paper: they compiled their C code at the _lowest_ optimization level and then ran the (suboptimal) machine code through Dynamo. So there was actually something left to optimize.

Think about it this way: a 20% difference isn't unrealistic if you compare -O1 vs. -O3.

But it's completely unrealistic to expect a 20% improvement if you'd try this with the machine code generated by a modern C compiler at the highest optimization level.


I think http://shootout.alioth.debian.org/u32/benchmark.php?test=all... is probably a better ceiling. LuaJIT is fantastic and already generates better code than GCC in some cases, but in many others it does not.

There's no theoretical limit to how close a compiler can come to a programmer when it comes to generating machine code to do a particular well-defined task.


Javascript execution is very rarely the bottleneck on webpages. Perhaps 90% of the time, the bottleneck is the render speed of DOM updates.


So - how long does it take for features to make their way into production? Am I reading the release calendar correctly in that it'll take 12 weeks from start of development to beta, and another 12 weeks from beta to stable?

It looks like Chrome 8 went stable on 12/2. So we'll see Chrome 10 in 4 months?


Chrome stable releases are every 6 weeks.

The releases are overlapped though, so we are testing v n+1 in beta while v n is in stable, and we are starting new feature development for v n+2 while v n is in stable.

Chrome 8 just went stable, and we've just started testing Chrome 9.

So Crankshaft will ship either 6 weeks from today (if it's in 9) or 12 weeks from now (if it's in 10).

I don't think it's been announced which it's targeted for.

HTH


Ah, ok. I guess I misread the release calendar. Thanks for the clarification :).

> I don't think it's been announced which it's targeted for.

According to the perf comparison chart on the blog, it's in Chrome 10. Also, it's mentioned that Crankshaft is available in the canary build, which is currently at 10 too.

12 weeks then. I sometimes revert to FF for the plugins, but I always come back to Chrome for the perf :).


Gmail really does seem to load in about half the time now in the Canary build.


OTOH this does not bode well regarding the potential for future bloat in Gmail.


I've benchmarked Google Chrome 9.0.597.10 and 10.0.603.3 (with Crankshaft) and the latter is 30% faster. See the detailed results: http://dromaeo.com/?id=124912,124913


Has anyone got any experience using JavaScript/V8 as a scripting language for a C++ app? We currently use Python with boost::python bindings, but are finding we have to limit the amount of Python code, as it is too slow.


Have you looked into Lua? It's a good embedded scripting language.

http://www.lua.org/


Especially for x86 and x64, where you can use LuaJIT.

LuaJIT2 is still in beta, but it is already very stable. Performance-wise, it is comparable to Haskell and Java.

http://luajit.org/


Are there any plans to port LuaJIT to ARM or LLVM? I see a couple of posts mentioning slow FP performance on ARM, but that could be solved with a technique like that used by LNUM.


The LLVM IR is too low level. It loses some context necessary to get the performance of the tailored JIT written by Mike Pall.

There is a separate effort to write a JIT compiler for Lua on top of LLVM, but the performance is not as good as LuaJIT's, and reaching such a level will be very complex (assuming it's possible).


There is sponsorship for a PPC LuaJIT port (targeting embedded systems, I believe) and Mike Pall has expressed interest in an ARM port in the past, but I don't know what the status of that is.


PPC LuaJIT sounds like it would be useful in console games (Why did the big three consoles switch to PPC just as Macs switched to Intel, anyway?).


Also, the language itself is pleasant and straightforward. If you know other modern programming languages, Lua shouldn't be too surprising.


Ever looked into Cython? http://www.cython.org/

Sage (an open source replacement for Matlab) uses it quite successfully for speeding up critical paths.


That's essentially what Node.js is, although with a heavy focus on asynchronous operation.

Check out: https://www.cloudkick.com/blog/2010/aug/23/writing-nodejs-na...


Syntensity moved from Python to JavaScript as its embedded scripting language,

http://www.syntensity.com/toplevel/intensityengine/

and it worked out very well there.

Of course the real examples are... web browsers, which are C++ apps scripted by a JavaScript engine. Seems to work well there too ;)


At Anybots, we built all the real-time robot code in Python with performance-critical bits in C++ wrapped using boost::python. Server code is in pure Javascript on Node.JS. I haven't tried to integrate C++ into Node, but it looks easy.


Next up, remove the 1.9GB max memory limit of V8 processes: http://code.google.com/p/v8/issues/detail?id=847


Will this have any effect on NodeJS?


I doubt it.

From what I understand, the significant improvements in speed come from Crankshaft's tradeoff of compilation optimisation against startup speed. If your app is a for loop with 2 iterations, that code path won't be heavily optimised, because the engine would potentially spend more time compiling the code than executing the unoptimised version; it will therefore start up faster. However, hotspots (say, loops with 1,000 iterations) will be heavily optimised.

This is great for websites, as perceived speed and responsiveness largely come down to startup time. You'll certainly notice a difference when using Node as a scripting tool. However, most Node applications are long-running servers executing the same code paths over and over. It's unlikely that Crankshaft is performing any extra optimisations; it is just changing when it performs them. However, if Crankshaft _is_ doing significantly more advanced optimisations (I don't know), then yes, Node will benefit. Please correct me if I am wrong; I would love to be.


Node-based servers will definitely benefit from this. The advantage of a two-stage compilation scheme is that the "base" compiler generates non-optimized code that is self-profiling: it collects type information as it runs. That information is then used by the second-stage compiler to produce code that is more optimized than a single-stage compiler (like pre-Crankshaft V8) can produce.
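
A tiny sketch of the kind of code that benefits (the warm-up count is made up; the real thresholds and heuristics are V8 internals):

    function sumArray(a) {
      var total = 0;
      for (var i = 0; i < a.length; i++) {
        total += a[i];  // base code's inline cache records the observed element type
      }
      return total;
    }

    // Repeated calls with integer elements let the optimizing compiler
    // specialize the loop to untagged small-integer arithmetic.
    for (var j = 0; j < 10000; j++) sumArray([1, 2, 3, 4]);

    // A later call with doubles violates the recorded assumption and
    // triggers a bailout back to the non-optimized code.
    sumArray([1.5, 2.5]);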


From the article:

> In addition to improving peak performance as measured by the V8 benchmark suite, Crankshaft also improves the start-up time of web applications such as GMail.

As I understand it, there is more to Crankshaft than just startup time improvement.


Sure, I would expect node.js to be an ideal environment for this type of hot code analysis, since servers written with it are typically long running vs. more transient web pages. Of course whether there are significant gains depends a lot on whether your server is CPU-bound or IO-bound.


This seems like it would be huge for the Node.js people. The next big milestone for V8 for node has to be the GC issues that have been wreaking havoc on node apps under high load.


Seems like it should.


Sounds like they borrowed the tracing idea from mozilla.


Not really. It's a well-known VM optimization technique.


First featured in Self many years ago, done by the same folks (Lars Bak and crew) who brought you V8. Then Sun bought their Smalltalk/Self-based company, and they built HotSpot for the JVM. Then Google hired them to do the same for Javascript, and now we have V8.

It generally takes about 10-20 years to get truly new ideas from the labs to consumer-level products.


Neither Self nor HotSpot uses tracing. It's a relatively new compilation technique; the implementation in the original tracing paper used Java bytecode as the source language.

I think you are confusing tracing with adaptive compilation.


Actually, there's even older work using tracing for re-optimization of assembly :)


They don't mention that they're tracing. I think they're just optimistically optimizing functions in hot loops.


I thought someone was actually developing a new kind of crankshaft for a real V8 engine....


Don't "Chromium Blog" and the URL give it away?



