
Mozilla’s new (internal) JavaScript value representation - mbrubeck
http://blog.mozilla.com/rob-sayre/2010/08/02/mozillas-new-javascript-value-representation/
======
malkia
The double float tagging using NaN sounds like LuaJIT's approach

quoting <http://article.gmane.org/gmane.comp.lang.lua.general/58908>

\- NaN-tagging: 64 bit tagged values are used for stack slots and table slots.
Unboxed floating-point numbers (doubles) are overlayed with tagged object
references. The latter can be distinguished from numbers via the use of
special NaNs as tags. It's a remote descendant of pointer-tagging.

    
    
      [The idea dates back to 2006, but I haven't disclosed it before
      2008. Special NaNs have been used to overlay pointers before.
      Others have used it for tagging later on. The specific layout is
      of my own devising.]

~~~
andhow
It is also used in WebKit's JSC. Using the NaN space goes back much further
than LuaJIT. E.g., you can see it mention in this 1993 survey paper:
<http://lambda-the-ultimate.org/node/3912>.

~~~
malkia
Thanks for the paper! Quite interesting and easy read.

------
jws
Clever, but slightly scary. They tag their object/string/integer/floats by
storing them all in 64 bits, top 32 bits is tag, noting that there are vast
swathes of values in the IEEE float format that are all NaN and the hardware
and libraries only use one.

But woe to the poor beggar that uses the wrong NaN.

And fear the nasty man that slips the wrong NaN in through a tricky mechanism
and turns it to a pointer.

(For similar problems, I was always fond of the mechanism in the Mindy Dylan
interpreter. There all pointers had their low bit set, and integers had a low
bit of zero. The nifty part being that integer math still worked without
shifting and for pointers you are generally indexing to an offset anyway so
take the one off there and it is free. Mercifully I've forgotten why I had to
read enough of their source code to discover that.)

~~~
nostrademons
This representation reminded me a lot of the one used in Gwydion Dylan. That
also used a two-word representation - mostly, if the compiler could statically
infer the type it'd leave off the tag - with one word for the type tag and one
word for the value.

IIRC, they ran into some very serious problems with threading. If you use a
two-word representation, you can't atomically set a value - there's always the
chance that the thread will be pre-empted in between setting the value and the
tag and some other thread will be able to run. This of course is disaster,
because you'll get a value interpreted as a type that it's not.

I wonder how Mozilla plans to handle this. Is their JS engine inherently
single-threaded only? Fine-grained locking seems impractical, because you'd
need locks around every assignment. Would they do something like Python's GIL?

What's the reason behind this representation anyway? It was an interesting
experiment when Gwydion did it, but I don't really see the practical benefit.

~~~
pcwalton
JavaScript will never support threads in the usual sense, as long as /be has
his say:
[http://weblogs.mozillazine.org/roadmap/archives/2007/02/thre...](http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html)

So JS won't need fine-grained locking in the foreseeable future.

~~~
jimbokun
"A requirement for JS3 (along with hygienic macros)"

Did anything ever come of hygienic macros for JavaScript? (first time I heard
that)

------
yason
How is this new but merely evolutional?

You know, as soon as you're willing pass 64-bit values around, packing
different values into a long long isn't really _packing_ anymore. There are so
many bits that you can easily "waste" a few of them for tags. The downside is
that you most often really just need only a handful of bits and it looks like
Jaegermonkey lavishly uses 32 bits for tagging of 32-bit pointers.

Of course, thinking optimistically, the new approach is better if you can at
least carry some satellite information in the 32 tag bits, like object
origination/permission/context data, or some length/offsets for (sub)strings.
Also, locality is probably better, heap allocation is eliminated, and for
64-bit registers and moves this is a natural fit.

But I would have liked to see those 61-bit integers for the same amount of
money.

~~~
andhow
Of course this is not new. I would have to assume that every value
representation under the sun has already been invented and probably during the
60s.

61-bit integers would either require long long integer arithmetic on x86 or
require more dynamic branching for arithmetic ops either of which would slow
down very hot paths.

------
tedunangst
Is it just me, or does the first example assign to 4 variables, not 3?

