
Performance Tips for JavaScript in V8 - twapi
http://www.html5rocks.com/en/tutorials/speed/v8/
======
tlack
Naive question from a person that has never had much luck making a compiler:

Instead of all the complicated type guessing and hidden class coercion, why
not build some kind of optional Erlang-style typespec[0] system based on
JavaScript comments in the code, and then build your optimization strategy
from that? I don't think asking a programmer to write a typespec for
difficult-to-optimize, performance-oriented code is asking too much.

[0] http://www.erlang.org/doc/reference_manual/typespec.html#id75681

~~~
kevingadd
Detecting types (in the way an engine like V8 or SpiderMonkey does it) is not
completely solved by type annotations. Combine that with the fact that
annotations could end up being wrong, and what you'd really need here is a
language with something approaching static types where the compiler can
accurately infer types for most locations without you giving it any help.

The 'add' function in the linked article is one example where type annotations
would be useless without more sophisticated underlying machinery: The function
is intentionally generic and accepts multiple input types, and this results in
degraded performance for generated code. Adding type annotations wouldn't fix
this; you'd need a more sophisticated type system such that individual
function signatures have individual types and it's possible to do dispatch by
matching the available signatures for a function being called against the
arguments provided (essentially, runtime dispatch for overloaded functions).
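
To make the dispatch idea concrete, here is a rough sketch. The names addNumbers and addStrings are made up for illustration, and this is not how V8 handles it internally; it only shows what "one specialized body per signature, picked at call time" could look like if you wrote it by hand:

```javascript
// Hypothetical specialized versions: each one only ever sees one kind of input.
function addNumbers(a, b) { return a + b; }
function addStrings(a, b) { return a + b; }

// Generic entry point: match the arguments against the known signatures
// and forward to the matching specialized version.
function add(a, b) {
  if (typeof a === "number" && typeof b === "number") return addNumbers(a, b);
  if (typeof a === "string" && typeof b === "string") return addStrings(a, b);
  throw new TypeError("add: no signature matches (" + typeof a + ", " + typeof b + ")");
}
```

The type checks cost something on every call, which is part of why an annotation by itself doesn't make the generic version fast.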

Another example (not really covered in the linked article) is the way function
calls end up working in V8 (and to a lesser degree, SpiderMonkey). To
generalize a bit, when you go 'x.foo(y)', a bunch of steps have to occur:

The engine has to identify the type of the expression 'x'

The engine has to identify the type of 'x.foo'

The engine has to identify the type of 'y'

Based on those three types it is possible for the engine to figure out whether
it has already generated usable assembly for the function being called. (the
type checks against argument types may occur within the body of the called
function, but that's more of a minor detail here.) More importantly, when I
say 'type' here, I mean the complete identity of each expression involved - in
a language like C# or C++, I could define foo like this:

    void X_Foo (X * self, Y & y);  /* 'self' stands in for the implicit 'this' */

And then represent the function call with:

    X_Foo(x, y);

The problem is that X* and Y& are not actually complete descriptions of the
values being passed in. If you have classes derived from X, X* is only a
partial description of the type being used and doesn't tell you anything about
the body of the functions that might be called through that interface.
Likewise, X and Y could both contain dynamically typed values, and the type of
the containing class does nothing to tell you the type of those values.

This is the kind of analysis a JavaScript engine has to do - it needs to be
able to distinguish between an array containing two floats, an array
containing two integers, a Vector2 containing two floats, and a Vector2
containing two integers. Hidden Classes (and SpiderMonkey's equivalent,
Shapes) are a way of tracking some of this finer-level detail on the fly and
making it accessible to optimize things at runtime.
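
As a toy model of that distinction (not what V8 or SpiderMonkey actually store, and describeShape is a name I'm making up here), you can think of the engine as computing something like this for each value it sees:

```javascript
// Toy model only: real engines track this very differently and in more detail.
// Builds a string from the container kind plus each field's name and value kind.
function describeShape(value) {
  const container = Array.isArray(value) ? "Array" : value.constructor.name;
  const fields = Object.keys(value).map(function (key) {
    const v = value[key];
    const kind = typeof v === "number"
      ? (Number.isInteger(v) ? "int" : "float")
      : typeof v;
    return key + ":" + kind;
  });
  return container + "{" + fields.join(",") + "}";
}

function Vector2(x, y) { this.x = x; this.y = y; }

describeShape([1.5, 2.5]);            // "Array{0:float,1:float}"
describeShape([1, 2]);                // "Array{0:int,1:int}"
describeShape(new Vector2(1.5, 2.5)); // "Vector2{x:float,y:float}"
describeShape(new Vector2(1, 2));     // "Vector2{x:int,y:int}"
```

Four different descriptions for four values that a static signature would mostly lump together. Note that this sketch treats 1.0 as an int, since JS has only one number type at the language level.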

As a thought exercise, given what I describe above (even if it's only
partially accurate), what do you think has to happen if you write 'x.call(y,
z)' in JavaScript (if x is a Function)? Figuring out the steps actually
involved might be surprising!

~~~
tlack
Thanks for the amazingly detailed response! I'm still unclear about one thing:

Couldn't you simply throw an exception if the type annotation did not match
your usage of it? For instance:

    
    
       /* v8-type: string, string -> string */
       function append_strings(a, b) {
         return a*b; /* exception */
       }
    

Similarly, the underlying library could gradually get type annotations, as
Erlang has been doing progressively for the past year or two. As more and more
code gets annotated, you can "lift" that knowledge into any code that calls,
for instance, String.substr(), and assert that the types match when they are
provided.

To me it seems that even the complications caused by multiple types of
arguments to many built-in functions would be less complex to resolve at
runtime than the hairy, "lazy" logic they're referring to in this article.

Iterating through an array of type specs and resolving the correct underlying
code (only in the specific case of annotated functions being invoked - usually
'hot path' code that has a simple breadth and call structure for other
performance reasons), throwing an error if there's no match, seems easier than
Hidden Classes, in my admittedly naive view.
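
Roughly what I have in mind, written as plain JS rather than anything an engine would do. withTypeSpecs and the spec format are invented for this sketch, and it only checks typeof-level types:

```javascript
// Hypothetical helper: wraps a list of annotated implementations.
// Each spec is { params: [typeof names], returns: typeof name, impl: fn }.
function withTypeSpecs(name, specs) {
  return function () {
    const args = Array.prototype.slice.call(arguments);
    for (let i = 0; i < specs.length; i++) {
      const spec = specs[i];
      const matches = spec.params.length === args.length &&
        spec.params.every(function (type, j) { return typeof args[j] === type; });
      if (!matches) continue;
      const result = spec.impl.apply(this, args);
      // Enforce the annotated return type as well as the parameter types.
      if (typeof result !== spec.returns) {
        throw new TypeError(name + ": returned " + typeof result +
          ", annotation says " + spec.returns);
      }
      return result;
    }
    throw new TypeError(name + ": no type spec matches the arguments");
  };
}

// string, string -> string
const appendStrings = withTypeSpecs("append_strings", [
  { params: ["string", "string"], returns: "string",
    impl: function (a, b) { return a + b; } }
]);

// Same annotation, but the body multiplies, so the result is a number (NaN).
const brokenAppend = withTypeSpecs("append_strings", [
  { params: ["string", "string"], returns: "string",
    impl: function (a, b) { return a * b; } }
]);
```

Here appendStrings("foo", "bar") returns "foobar", while appendStrings("foo", 5) and brokenAppend("foo", "bar") both throw a TypeError.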

~~~
kevingadd
That sort of explicit type annotation approach falls over for the more complex
cases, like the function signature one I described. For example, let's say you
have an object called a StringBuilder, like the one in Java/C#, and it's got a
method for appending text to it. In JS, you'd use it like this:

    
    
        var bananaCount = 5;
        var sb = new StringBuilder();
        sb.append("There are ");
        sb.append(bananaCount);
        sb.append(" bananas");
    

If you merely add type annotations, you're not going to win much here:

    
    
        /* v8-type: this(StringBuilder) (any) -> none */
        sb.append = function (...) { ... };
    

If you go to the trouble of adding function overloading, maybe things get a
little better - TypeScript does something like this hypothetical:

    
    
        sb.append = {
          /* v8-type: this(StringBuilder) (string) -> none */
          function (...) { ... },
          /* v8-type: this(StringBuilder) (number) -> none */
          function (...) { ... }
        }
    

OK, but now what if you've got collection types? If you create an instance of
List, what type is List.items? It'd have to be any[], unless you also add
something like generics. So now you have to add a generics system or you have
to introduce type casts, just to avoid having all your type information
polluted by 'any's.

This may still be kind of unclear; it is often hard to see the failings of a
type system until you see it in action on a larger application. In practice,
it comes down to this: The type inferences that an engine like V8 or
SpiderMonkey can draw at runtime end up being extremely accurate, and provide
a lot more information than a compiler for a language like C or C# or Erlang
has access to.

