

JavaScript performance and a Webkit hotspot - yread
http://my.opera.com/emoller/blog/2011/05/01/javascript-performance

======
pbiggar
Mozilla JS engineer here. It goes without saying that I don't speak for
Mozilla here, or even for the rest of the JS team. And I may misremember
stuff, so don't cite me too closely.

Benchmarking is very difficult, and not just in JS. I worked on PHP before,
and benchmarking was crap there too. Same in Lua [0]. Same in Python until the
Unladen Swallow team packaged up some real-life applications. Even in Java,
things were bad. They had SpecJVM98, and it so mischaracterized real Java
programs that the DaCapo group formed and built its own benchmarks so that
optimization research had something realistic to target.

What is an ideal web benchmark? Tough question. Parts of Sunspider and V8
tried to look at what the web was and distil it into a useful benchmark -
that's a good start [5]. Kraken tried instead to look forward at what the web
might be, even though people aren't doing that stuff just yet - that's good
too. Perhaps what the web is _now_ would be good, but we don't have anything
like that [1] [2].

Add to this that benchmarking is inherently hard, so hard that it's its own
research area. How do you weight various benchmarks to produce the really
useful _single number_ that we all want to see? That's kinda hard, and back
when I read research on this, I didn't see a particularly useful answer.
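
(As a rough, hypothetical sketch of one common answer - take the geometric
mean of per-benchmark speedups against some baseline - and not what any
particular suite actually does:

    // suiteScore is a made-up name; baselineMs/measuredMs map benchmark
    // names to times in milliseconds.
    function suiteScore(baselineMs, measuredMs) {
      var logSum = 0, count = 0;
      for (var name in baselineMs) {
        // speedup > 1 means the measured engine beat the baseline on this test
        logSum += Math.log(baselineMs[name] / measuredMs[name]);
        count++;
      }
      // Geometric mean: a single 10x outlier can't dominate the score the way
      // it would in an arithmetic mean.
      return Math.exp(logSum / count);
    }

Even then, picking the baseline and deciding which benchmarks belong in the
suite at all is most of the problem.)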

Let's consider what we'd love to have. Run a web page for a little while,
record it somehow, and package that up so that it can be reproduced in a
useful way [3]. So what should go in it? I mainly use gmail, google reader,
bugzilla, hacker news, and small amounts of other sites. What if you read
nothing but Jezebel and Facebook? Will my benchmark help you? Will my
benchmark even help me in 6 months after those sites are updated? Whose view
of the web is the right one, at which time, and how will we coordinate that
into a benchmark?

Which brings us back to the original post. That's benchmarking a decompressor,
ported from C, which doesn't sound like a great real-world application to me.
Its hot-loop seems to be string appending, which Firefox is great at because
of ropes [4]. We actually have a few benchmarks like this. One of the V8
benchmarks uses a Scheme-to-JS compiler to produce code. And Emscripten, an
LLVM-to-JS compiler, produces JS apps in the same way the OP did, and we
measure these periodically to make sure we don't regress. But the OP's
benchmark is hardly representative of real web usage, and which particular
implementations did well here is at least partially down to luck.

So to summarize, benchmarking is hard, no-one is doing it right, I'm not even
sure we know what right is, but most of us can agree that ported C apps are
probably not representative of real browser usage.

And I leave you with a final reminder that I don't speak for Mozilla.

[0] This was true when my mate worked on Lua two years ago.

[1] If you can package Gmail and Facebook into a nice platform-independent
package that we can run and get _one single number_ representing its
performance, we will happily use it and worship at your feet.

[2] Microbenchmarks (how fast is string appending, for example) aren't the
answer either.

[3] On multiple platforms, while doing the same thing on each.

[4] Other posters commented that string-appending is an O(n^2) operation -
this isn't quite true. The naive implementation is, but JS vendors haven't
been naive since at least 2007. (A toy sketch of the rope idea follows these
notes.)

[5] While not representative of the real web, these benchmarks helped us run
our little JS performance war, right from slow-and-sad up to fast-and-awesome.
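
A toy illustration of the rope idea mentioned in [4], nothing like the real
SpiderMonkey implementation: each "append" just records the two pieces, and
the O(n) copy is deferred until somebody actually needs the flat string.

    // Hypothetical toy rope: only concat-and-flatten is modelled.
    function Rope(left, right) {
      this.left = left;    // string or Rope
      this.right = right;  // string or Rope
    }
    Rope.prototype.toString = function () {
      // Flatten once, iteratively, instead of copying on every append.
      var out = [], stack = [this];
      while (stack.length) {
        var node = stack.pop();
        if (node instanceof Rope) {
          stack.push(node.right, node.left);  // left gets popped first
        } else {
          out.push(node);
        }
      }
      return out.join('');
    };

    var s = 'seed';
    for (var i = 0; i < 1000; i++) s = new Rope(s, 'x'); // O(1) per append
    s = String(s);                                       // one O(n) flatten

So N appends cost N cheap node allocations plus one flatten at the end, rather
than N copies of an ever-growing string.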

------
pennig
Looks like WebKit has problems appending to a very long string. The less we
can do that the better, so I came up with this solution:

    
    
      var dest = 'apple';
      while (dest.length < 100000) {
        var out = '';
        // The copy overlaps itself, so once the index runs past the end of
        // dest, re-read from the chunk we are building.
        for (var i = dest.length - 5, end = i + 10; i < end; ++i)
          out += (i < dest.length) ? dest[i] : out[i - dest.length];
        dest += out;
      }
    

That brought the time from about 4 seconds to a more respectable 90 msec.

~~~
nod
That reduces the number of string appends, sure. But it is well known (see
either "The Good Parts" or "High Performance JavaScript") that appending
strings is quadratic in JavaScript, and that the best way to assemble a large,
multipart string is arrayOfStrings.join('');

* And by quadratic I mean "copies the whole string every time" such that an append loop is N^2 while a push+join is linear.
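
To make the two patterns concrete (piece counts made up):

    // Quadratic pattern: each += may copy the whole accumulated string.
    var s = '';
    for (var i = 0; i < 10000; i++) {
      s += 'piece' + i;
    }

    // Linear pattern: collect the pieces, copy each character once at the end.
    var parts = [];
    for (var j = 0; j < 10000; j++) {
      parts.push('piece' + j);
    }
    var s2 = parts.join('');

(Whether the first one is actually quadratic depends on the engine, as noted
below.)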

~~~
rimantas

      But it is well known (see either "The Good Parts" or "High
      Performance JavaScript") that appending strings is quadratic
      in JavaScript, and that the best way to assemble a large,
      multipart string is arrayOfStrings.join('');
    

It's a bit fuzzy now; modern browsers have optimized that quite a bit, so this
mainly applies to older versions of IE.

------
mraleph
Array is an appropriate datastructure for this algorithm. String is not.

    
    
        var dest = 'apple'.split('');   // work on an array of characters
        var j = dest.length;
        var start = new Date();
        while (j < 100000) {
          var offs = j - 5;
          // Overlapping back-reference copy, element by element, so values
          // written earlier in the loop are visible to later reads.
          for (var i = offs; i < offs + 10; ++i)
            dest[j++] = dest[i];
        }
        dest = dest.join('');           // flatten to a string once at the end
    

=> 7-10ms in the latest Chrome.

~~~
mjs
Arrays do seem to be the way to make Chrome happy, although unfortunately
they're slower for everything else:

<http://jsperf.com/backreference-copy>

I don't see why you say an array is a more appropriate datastructure for the
algorithm, though--that seems very much implementation dependent. And at the
JS level, arrays support operations such as splice that strings do not,
suggesting that they'd typically be implemented with a more sophisticated
datastructure, and hence slower at primitive operations like index access.
(Perhaps V8 notices that these expensive operations aren't actually used, and
creates a fast, basic version instead.)
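
For what it's worth, the API asymmetry in a nutshell (illustrative only):

    var a = ['a', 'p', 'p', 'l', 'e'];
    a.splice(1, 2, 'x');   // mutates in place: ['a', 'x', 'l', 'e']

    var s = 'apple';
    // s.splice(1, 2, 'x') would throw: strings have no splice. You have to
    // build a new string instead, e.g. s.slice(0, 1) + 'x' + s.slice(3).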

~~~
mraleph
Array is intended to be mutable storage with fast random access and optimistic
preallocation of the backing store. String is intended to be immutable
storage. At least that's how I see these types.

I also would not expect the VM to use a representation optimized for splicing
--- you should _always_ expect that splicing will take O(arr.length), and use
an appropriate data structure when you want better time bounds.

There is a limit to how much a VM can optimize the use of data structures. It
can't magically guess that you wanted a list or a binary tree instead of an
Array. Introducing many special cases/heuristics also leads to complexity and,
as a result, to bugs.

------
masklinn
Not exactly surprised; appending to immutable strings is rarely fast, and
rarely considered an interesting hotspot to optimize: you can usually avoid it
by building an array and joining it at the end.

------
kqueue
I don't think this is WebKit-specific. Chrome uses a different JS engine than
other WebKit-based browsers.

------
Scaevolus
When a piece of code runs slower in one JS implementation than in another, why
do people conclude that the implementation is slow at "real-world" code?

Just because the JIT missed an optimization -- in this case, repeatedly
appending to a string -- doesn't mean it's a bottleneck for the JS that
browsers run into in the wild.

People don't run jslint or jsunzip very often; they run jQuery and do lots of
DOM accesses.

------
mjs
splice() might be able to help out, though how much depends on whether the
copying really needs to overlap or not. There are some benchmarks at:

<http://jsperf.com/backreference-copy>

I couldn't actually get Safari to complete; FF 4.01 is fastest by a long way.
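
A rough sketch of the overlap distinction (copyBack is a made-up helper, and
this is not necessarily what the jsperf page measures): when the back-reference
distance is at least the copy length, the source region doesn't overlap the
destination and the chunk can be appended in one bulk call; otherwise you have
to go element by element so freshly written values are visible to later reads.

    function copyBack(dest, dist, len) {
      var srcStart = dest.length - dist;
      if (dist >= len) {
        // Non-overlapping: bulk-append a slice in one call.
        Array.prototype.push.apply(dest, dest.slice(srcStart, srcStart + len));
      } else {
        // Overlapping: copy one element at a time.
        for (var i = 0; i < len; i++) {
          dest.push(dest[srcStart + i]);
        }
      }
      return dest;
    }

    copyBack('apple'.split(''), 5, 10).join(''); // "appleappleapple"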

------
gorset
Appending to an immutable string in a loop is O(n^2). I'm surprised the author
is surprised by this. Python and Java are the same.

Sure, it's possible for the VM to optimize this, either by using ropes or by
converting to a buffer based on some kind of code analysis, but you shouldn't
bet on it.

------
tlrobinson
Another unzip implementation <https://github.com/tlrobinson/zipjs> (should be
easy to adapt for browser use)

