

Understanding V8 - philipDS
http://s3.mrale.ph/nodecamp.eu

======
ldite
For those like me who initially assumed that this didn't work in their
browser for some stupid reason: it turns out you need to press the space bar
or page up/down.

~~~
eis
You can also click the sides (left or right) of the page. Not very intuitive
interface, got me too :)

~~~
pavlov
I don't understand why anyone would deliberately make a website with such
simple navigation completely hidden.

~~~
mraleph
I am very sorry about the unintuitive interface, but these are slides for the
talk I gave at nodecamp.eu, not a website.

I'll add a note on the first slide as soon as I get home.

~~~
pavlov
Ah, I see.

When using the same website both for the actual presentation and for
disseminating the slides afterwards, it might make sense to have some kind of
navigation visible by default. You could then have a "presentation mode"
shortcut key that hides the nav element when you're showing the slides on the
big screen. This way, the same site would serve both audiences.

------
pornel

        In hot code
        Avoid const
    

I wonder why?

~~~
mraleph
Because: 1) const is not supported by the optimizing compiler, so any function
that declares or references a const variable will not be optimized; 2) const
is slower than var because it has more complicated semantics.
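
A minimal sketch of what this advice meant in practice, assuming a hot numeric loop. The function names `sumWithConst` and `sumWithVar` are made up for illustration, and the advice describes V8 as of this talk, so whether the difference shows up on a given V8 version is something to measure rather than assume.

```javascript
// Hypothetical hot function that declares a const binding.
// Per the comment above, V8's optimizing compiler at the time of the
// talk would not optimize a function that declares or references const.
function sumWithConst(values) {
  const scale = 2;
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i] * scale;
  }
  return total;
}

// Same logic with var instead of const. The result is identical;
// only the kind of binding differs.
function sumWithVar(values) {
  var scale = 2;
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i] * scale;
  }
  return total;
}
```

Both functions compute the same value, so swapping one for the other in a hot path is a safe experiment to profile.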

------
kragen
It looks like a fantastic presentation. There were a lot of things I didn't
understand from just the slides though.

~~~
mraleph
Just ask if you are interested in something.

~~~
kragen
I was reluctant to ask because all of these are potentially questions that I
could answer myself with sufficient effort.

What's the relevance of the "GC cost" slide to the two code-change slides that
follow it? Does manually converting tail-recursion have a significant impact
on the number of long-lived objects?

Are you using cons-strings somewhere? (I was thinking of something like Erlang
IO lists. But maybe you just mean the for (...) { x.push(s); } return
x.join('') trick?)

Why is "indexing" relevant in "don't mix indexing and concatenation"? It looks
to me like the problem is building up a big string by repeated concatenation,
no? Not indexing.

What's ~ToNumber? Is that a destructor? Or is it just the conversion applied
by Number()?

What did you use the typed arrays for?

Where do you get HiddenClass from? It's not in the global namespace of my copy
of node.js by default.

Did you have an example where you were able to convert a polymorphic call site
to monomorphic? Also, does V8 use PICs at all, or does it just have a
monomorphic inline cache like in Deutsch & Schiffman?

What's the connection between dictionary mode and the code on the dictionary-
mode slide?

What causes the arguments object to materialize?

On "Two types of variables", it appears that retaining g will result in v3 not
being garbage collected, potentially resulting in retaining an arbitrarily
large amount of garbage. Does that really happen?

~~~
mraleph
1\. V8 does not perform tail call elimination (AFAIK none of the major JS VMs
perform it) because certain legacy JS language features (namely func.arguments
and similar) are not TCE-friendly (for more details see
<http://code.google.com/p/v8/issues/detail?id=457>). So manually performing
TCE indeed reduces the number of (relatively) long-lived objects: the tails of
the string created with substr.

2\. I am not sure about this question. V8 uses cons-strings internally: so it
is important to understand that c = a + b does not result (at least not
immediately) in an allocation of a new sequential string and copying
characters from a and b to it. That means that concatenating many strings does
not cost much (especially if they are large) at the moment of concatenation.
Actual concatenation happens later, when you start indexing into this string:
V8 will flatten it (convert from cons-string form into sequential form).

3\. See 2. If you intermix indexing and concatenation you basically force s to
oscillate between cons and sequential forms: a cons-string is created, then
flattened on indexing, then the result is concatenated with another string,
forming a cons-string which gets flattened again, and so on. The resulting
performance is bad because in this cycle of concatenation and flattening
cons-strings become pure overhead (you spend more on creating them than you
gain from them).

4\. '*' as a prefix means optimized function and '~' as a prefix means non-
optimized function.

5\. Everything starting from the 'Understanding Numbers' slide is a collection
of facts unrelated to the first part of the talk about profiling. To get an
idea of how typed arrays affect performance you can read my blog post:
<http://blog.mrale.ph/post/5436474765/dangers-of-cross-language-benchmark-games>

6\. The HiddenClass slides try to illustrate the idea of inline caching
(<http://en.wikipedia.org/wiki/Inline_caching>) without using assembly. It's
pseudo-code. HiddenClass is a structure that describes the layout of an
object. Slides #42-#50 illustrate how the VM builds the hidden class hierarchy
while executing the code. Slides #51-#52 try to illustrate how you can utilize
hidden classes when you compile code. There are a couple of presentations that
discuss hidden classes at a lower and more detailed level: for example Mads
Ager's talk at Google IO 2009: <http://youtu.be/FrufJFBSoQY>

7\. No, there was no example (it's impossible to fit that much into a single
40m talk). In V8, ICs have a special megamorphic state in addition to the
monomorphic state. Megamorphic stubs are not specialized for the types the IC
has seen (unlike monomorphic stubs); instead they rely on a cache of
monomorphic stubs (indexed by the hidden class of the receiver).

8\. Those are functions that force an object from fast-mode to dictionary-mode.

9\. In non-optimized code it is materialized when used in the code. In
optimized code it is never materialized (but as the slide says: if you
"misuse" arguments in a function it will not be optimized).

10\. I have not seen any real world code that suffers from this problem. But
people ask about scopes and closed-over variables quite often, which is why it
is included.
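
The hidden class and inline cache idea from answers 6 and 7 can be sketched in plain JavaScript. This is a toy model under stated assumptions, not V8's actual data structures: the `HiddenClass` name comes from the slides, but `makeObject`, `store`, `makeLoadSite` and every field below are hypothetical, and only the monomorphic case is modelled.

```javascript
// Toy hidden class: records the slot index of each property and the
// transitions taken when a new property is added. (Hypothetical model.)
class HiddenClass {
  constructor(offsets) {
    this.offsets = offsets;        // Map: property name -> slot index
    this.transitions = new Map();  // Map: property name -> HiddenClass
  }

  // Return the hidden class reached by adding `name`, creating it only once
  // so that objects built the same way end up sharing a hidden class.
  addProperty(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size);
      next = new HiddenClass(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const ROOT = new HiddenClass(new Map());

// Toy object: a hidden class plus a flat array of property slots.
function makeObject() {
  return { klass: ROOT, slots: [] };
}

function store(obj, name, value) {
  if (!obj.klass.offsets.has(name)) {
    obj.klass = obj.klass.addProperty(name);
  }
  obj.slots[obj.klass.offsets.get(name)] = value;
}

// Monomorphic inline cache for one property-load site: it remembers the
// last hidden class seen and the slot index that was found for it.
function makeLoadSite(name) {
  const site = { cachedClass: null, cachedOffset: -1, hits: 0, misses: 0 };
  site.load = function (obj) {
    if (obj.klass === site.cachedClass) {
      site.hits++;                 // fast path: one identity comparison
      return obj.slots[site.cachedOffset];
    }
    site.misses++;                 // slow path: look up, then update cache
    const offset = obj.klass.offsets.get(name);
    if (offset === undefined) return undefined;
    site.cachedClass = obj.klass;
    site.cachedOffset = offset;
    return obj.slots[offset];
  };
  return site;
}

const p1 = makeObject(); store(p1, 'x', 1); store(p1, 'y', 2);
const p2 = makeObject(); store(p2, 'x', 3); store(p2, 'y', 4);

const loadX = makeLoadSite('x');
const first = loadX.load(p1);   // miss: cache is empty
const second = loadX.load(p2);  // hit: p2 shares p1's hidden class
```

Because `p1` and `p2` add `x` and `y` in the same order, they follow the same transitions and share one hidden class, which is what lets the second load take the fast path.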

~~~
kragen
> V8 uses cons-strings internally

Ohhh. I had no idea. That makes the next point a lot more understandable.
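
A minimal sketch of the two patterns being contrasted, assuming the goal is to build one string from pieces while also inspecting the last character appended at each step. The function names are hypothetical, and the flattening itself is internal to V8 and not observable from this code; only the shape of the access pattern is shown.

```javascript
// Pattern the thread warns about: every iteration appends to s and then
// indexes into s, so concatenation and indexing alternate on one string.
function buildInterleaved(parts) {
  var s = '';
  var lastChars = [];
  for (var i = 0; i < parts.length; i++) {
    s += parts[i];
    lastChars.push(s.charAt(s.length - 1)); // indexes the growing string
  }
  return { text: s, lastChars: lastChars };
}

// Alternative: index into the small piece instead, so the growing string
// is only ever concatenated to. Assumes every part is non-empty.
function buildSeparated(parts) {
  var s = '';
  var lastChars = [];
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i];
    s += part;
    lastChars.push(part.charAt(part.length - 1)); // indexes the small piece
  }
  return { text: s, lastChars: lastChars };
}
```

For non-empty parts both functions return the same data, so the second form can replace the first without changing results.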

> '~' as a prefix means non-optimized function.

Where does ToNumber come from? It sounds like the kind of thing that ought to
get optimized!

> Those are functions that force object from fast-mode to dictionary-mode

You mean, if you use delete, getters, setters, seal, or freeze, then the
object will use dictionary-mode thereafter under the covers?

> if you "misuse" arguments in a function it will not be optimized).

What constitutes "misuse"? Anything other than .length, [i], and applying a
function to it, I guess (e.g. passing it to some other function such as
Array.prototype.slice)?

> I have not seen any real world code that suffers from this problem.

So it exists theoretically? The benefit of being implemented this way would be
that creating closures is a lot faster and they use less space.

~~~
mraleph
> Where does ToNumber come from? It sounds like the kind of thing that ought
> to get optimized!

ToNumber comes from Number(l). I can't see how it can be optimized (l is a
string so parsing is unavoidable).
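
A small illustration of the conversion in question. The value assigned to `l` is made up; it stands in for whatever string the slide's code was processing.

```javascript
// Hypothetical input line standing in for the `l` on the slide.
var l = '42.5';

// Calling Number on a string has to parse its characters.
var parsed = Number(l);

// Calling Number on something that is already a number returns it as is.
var unchanged = Number(parsed);

// A string that is not a valid numeric literal converts to NaN.
var invalid = Number('4x');
```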

> You mean, if you use delete, getters, setters, seal, or freeze, then the
> object will use dictionary-mode thereafter under the covers?

Yes.
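
A minimal sketch of one way to stay clear of `delete`, assuming the code only needs to mark a field as empty. `Point` and both helper names are hypothetical, and the switch to dictionary mode is internal to V8, so nothing below observes it directly.

```javascript
// Hypothetical record type, used only for illustration.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Removes the property entirely. Per the thread, this is one of the
// operations that moves the object to dictionary mode.
function clearWithDelete(p) {
  delete p.x;
  return p;
}

// Keeps the property and overwrites its value, so the set of properties
// on the object does not change.
function clearWithAssignment(p) {
  p.x = undefined;
  return p;
}

var a = clearWithDelete(new Point(1, 2));
var b = clearWithAssignment(new Point(1, 2));
```

The two are not interchangeable everywhere: after `delete` the key is gone (`'x' in a` is false), while after assignment it is still present with the value `undefined`, which matters for `in` checks and key enumeration.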

> What constitutes "misuse"?

Your understanding of misuse is correct.
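
The uses listed in the question can be sketched as follows. All function names are hypothetical, and the split into "fine" and "misuse" follows the description in this thread rather than any documented V8 rule.

```javascript
// Fine per the thread: only arguments.length and arguments[i] are used.
function sum() {
  var total = 0;
  for (var i = 0; i < arguments.length; i++) {
    total += arguments[i];
  }
  return total;
}

// Fine per the thread: arguments is only used to apply another function.
function forwardToSum() {
  return sum.apply(null, arguments);
}

// "Misuse" per the thread: the arguments object itself is handed to
// another function (here Array.prototype.slice).
function toArrayBySlice() {
  return Array.prototype.slice.call(arguments);
}

// Same result using only length and indexed access.
function toArrayByIndex() {
  var out = new Array(arguments.length);
  for (var i = 0; i < arguments.length; i++) {
    out[i] = arguments[i];
  }
  return out;
}
```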

> So it exists theoretically?

Yes.
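
A sketch of the situation being asked about, assuming closures created in the same scope share one context for their captured variables. `makeCounter`, `big`, `count`, `measure` and `next` are hypothetical names (the slide uses `v3` and `g`), and the retention cannot be observed from plain JavaScript, so the code only shows the shape of the problem.

```javascript
function makeCounter() {
  // Large value captured by `measure` below, so it lives in the shared context.
  var big = 'x'.repeat(100000);
  // Small value captured by `next`.
  var count = 0;

  // Captures `big`, but is only used inside makeCounter.
  function measure() {
    return big.length;
  }

  // Captures only `count`, and is the closure that escapes.
  function next() {
    count += 1;
    return count;
  }

  measure();
  return next; // keeping `next` keeps the shared context, including `big`
}

var keep = makeCounter();
var first = keep();
var second = keep();
```

If this ever matters in real code, one option is to set `big = null` once it is no longer needed, so the shared context stops referencing the large value.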

~~~
kragen
Thank you very much!

------
andypants
Awesome! Is there a video of the presentation somewhere, or a transcript? I'd
love to hear/see the talk.

~~~
mraleph
Unfortunately the talk was not recorded. There is no transcript either.

~~~
devinus
Will there be a transcript available in the future?

~~~
mraleph
Probably not. Everything is there in the deck.

If people have a lot of questions about the slides (there is nothing novel
there though) I can do a follow-up post.

~~~
darklajid
I'd be glad if you could do that.

Might be a lot to ask, depending on the time needed to write a couple of lines
- but I guess right now it's most helpful for the people that are already
educated on v8 performance.

It's just that you could easily educate a broader audience (_cough_ - like
the author here) with a short write-up. Some things are 'obvious' (GC
pressure); others (everything about 'In hot code', some of the trace
parameters) are opaque and ~meaningless~.

~~~
kragen
"In hot code" means "in code that comprises a substantial part of your
execution". Presumably if you run try-catch or eval in non-hot code, it still
runs slow, but not slow enough to make your whole program slow.

I wondered too if they had more to say about the usefulness of some of the
trace parameters. I am pretty sure they refer to what the _profiler_ traces
(and logs), but I don't know whether they got any usefulness out of them.

~~~
darklajid
Sorry, my bad. I should have explained it better.

What I meant was: The whole 'in hot code' part of the slides didn't make sense
to me, because it's something like a list of (unexplained) statements. I know
what 'hot code' refers to, I just don't particularly like to accept things as
facts without a minimum of explanation. Probably the author is dead on and I'm
sure he knows a lot more about the technology than I do, but 'don't use X, use
Y' or 'avoid Z' is something that should be qualified.

Probably the talk did that.

------
sylvinus
One of the best presentations of nodecamp.eu

