

WebKit sorts JS arrays using Selection Sort - lars
http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/ArrayPrototype.cpp#L641

======
Robin_Message
It appears to me from the line:

    
    
        631 if (thisObj->classInfo() == &JSArray::s_info && !asArray(thisObj)->inSparseMode()) {
    

that only non-arrays, or arrays in some kind of "sparse" mode are sorted
inefficiently.

I'm not sure what sparse mode is, but let's try my Raymond Chen inspired
psychic powers: if you assign a[0]=1 and a[1000000]=2, you don't want a 1 to
999999 to be stored, so an array like that will end up in a sparse mode which
functions more like a hash table keyed by integers.

The same applies to non-arrays: since they aren't true arrays, they won't be
stored in the blessed, contiguous way that the sort routine is expecting, so a
slower access method has to be used.

Now, fast sorting is presumably written assuming the array is in a contiguous
array. Since in sparse mode the array isn't like that, we aren't on the fast
path, so it doesn't matter if we are slow, and correctness and special cases
are more important, hence the simple selection sort algorithm.

Note also that line 633, 635 and 637 further specialise the fast path into
three cases: a default sort with no user-defined comparator, a sort based on a
built-in numerical comparator, and a more general sort using a user-defined
comparator. (EDIT: The functions called on 633, 635 and 637 are defined in
[http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/r...](http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/JSArray.cpp)
from line 1409 onwards, and use quicksort or mergesort, except in the general
case which uses an AVL tree, which feels odd but I'm sure there was a reason.)

TL;DR: This only applies to certain arrays; the performance is generally fine
else _someone would have noticed by now_ ; if you are sorting array-like
things or sparse arrays regularly, profile to discover if this is a problem
for you.

~~~
saurik
Yeah... to demonstrate how painfully incorrect this submission is, I think
this comment from the real sorting code (which switches between mergesort and
quicksort for different workloads) does quite well:

    
    
        // For numeric comparison, which is fast, qsort is faster than mergesort. We
        // also don't require mergesort's stability, since there's no user visible
        // side-effect from swapping the order of equal primitive values.

~~~
ajross
If this is not the "real" sorting code, why is it there? If it's not real, it
should be gone. If it's real, it's wrong. The proper response to a code review
that finds broken code is to fix it, not excuse it.

~~~
saurik
Look, this submission states "WebKit sorts JS arrays using Selection Sort". In
fact, WebKit will sort things that are not JS arrays using selection sort (the
first check); and, as of 7 months ago, will sort JS arrays that are "in sparse
mode" as if they were not JS arrays ("non-standard" being the terminology).

I am not arguing selection sort is a good sort for the reason chosen, but I
will maintain that the real (yes: "real", that is the correct term to use
here) sorting algorithm used by WebKit for "JS arrays" (remember: I was
complaining about "how incorrect this submission is") is actually a choice
between mergesort, quicksort, and an AVL tree.

Now, if the submission title had been "WebKit sorts sparse JS arrays using
selection sort" or even "WebKit sorts non-standard JS arrays using selection
sort" (to evoke the wording of the commit message 7 months ago), that would be
different: that would not be a linkbait title. I could even _maybe_ get behind
"WebKit uses four different sorting algorithms, and one of them is selection
sort?!".

However, when one sees the title "WebKit sorts JS arrays using Selection
Sort", I think many, if not most, reasonable developers go "oh, wow, maybe I
should be avoiding the WebKit sort algorithm, given that my arrays are
reasonably sized... I wonder if they'll get that fixed"; but, when you click
through to the actual code _and read it_ you realize this would be wrong.

So, from my perspective, you are now arguing a strawman, and are doing so
fairly abrasively :(. To be clear: I certainly am not defending the existence
of selection sort in the codebase, even for non-standard arrays. That does not
make this submission correct, or even reasonable: it is looking at the wrong
code and making it out to be something it is not (even quite generally,
"important").

Regardless, I got curious, and decided to look more into that sorting code. It
is apparently sufficiently old that it predates WebKit being called "WebKit",
so I went back through kdelibs and found the original commit that added it,
from January 2001. At this time, most of the array implementation was still
"does nothing, needs to be implemented" or "does the wrong thing, needs to be
fixed".

[http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit...](http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit&h=4294099dd9de6b8c9dbbfc10f7e5f9ee09f2f002)

It was in March of 2003, during a commit-spree of merging from Safari back to
kjs, that the quick sort implementation and code to use it for JS arrays was
added. This means that this submission's title was correct for two years,
incorrect for almost 9 years, and then maybe-sort-of-almost-correct for 7
months.

[http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit...](http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit&h=b3b50763d90fc5296abf5f8efc9c0d64f83a856f)

[http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit...](http://quickgit.kde.org/index.php?p=kdelibs.git&a=commit&h=00ad2659109814903d5760dcfe50c87e74732afb)

~~~
ajross
I guess the response is symmetric: I'm not defending the title of the post.
I'm saying that this is an example of code review discovering a real problem
in code, and that it needs to be fixed. You apparently agree, so yay. :)

------
js2
FWIW, this code's over a decade old. Presumably if it had any negative
performance implications, someone would have noticed by now:

    
    
      author	kocienda <kocienda@>	
      Fri, 24 Aug 2001 10:24:45 -0400 (14:24 +0000)
    
      Imported sources from kde-2.2 distribution
    

[http://git.chromium.org/gitweb/?p=external/Webkit.git;a=blob...](http://git.chromium.org/gitweb/?p=external/Webkit.git;a=blob;f=JavaScriptCore/kjs/array_object.cpp;hb=66a6d360850d1643dc51807f9c6c0e0029ce3459#l292)

~~~
sesqu
I ran a simple test with Chrome 20:

    
    
      var a=new Array(4e9);
      for (var i=1; i<a.length; i=Math.ceil(i*1.0001))
        a[i]=new Number(i).toString();
      a.sort();
    

The speed suggest that this sparse string array is not sorted with a quadratic
algorithm (today).

~~~
bdash
This is also fast in Safari because the array isn't sparse enough for
JavaScriptCore to send it down the fallback path.

EDIT: In order to go down the fallback path the array needs to be both
sufficiently large and sufficiently sparse (< 12.5% occupied). You can see the
difference in algorithms by modifying your test case like so:

Dense, fast path:

    
    
        var a = [];
        for (var i = 1; i < 4e6; i++)
          a[i * 5] = i.toString();
        a.sort();
    

Sparse, fallback path:

    
    
        var a = [];
        for (var i = 1; i < 4e6; i++)
          a[i * 10] = i.toString();
        a.sort();

~~~
sesqu
By my count, it's very sparse - only 0.003% occupied. I suppose the initial
size declaration could make it dense, but then it would use a considerable
amount of (virtual?) memory. Unfortunately, I don't have access to Safari to
test that.

~~~
bdash
Hmm… it looks like I misspoke. The "sparse mode" check that
Array.prototype.sort performs in JavaScriptCore appears to be different than
the its concept of sparse array storage, which is used under the conditions I
described. The "sparse mode" check in the sorting code appears to be related
to whether the array instance has getters/setters or properties defined via
Object.defineProperty. The performance difference my code demonstrates is due
to the extra overhead of iterating the sparse array storage compared to a
vector, and not due to the different sorting algorithms that are employed.

------
mda
V8 sort algorithm (Quicksort + Insertion Sort):

[http://code.google.com/p/v8/source/browse/trunk/src/array.js...](http://code.google.com/p/v8/source/browse/trunk/src/array.js#741)

~~~
maxpow4h
It's known as Introsort

~~~
mda
No, Introsort is (Quicksort + Heapsort) designed to prevent algorithm to
degenerate O(n^2) for particular input sets.

Afaik v8 (and Java's pre Java 7 versions) quicksort algorithms are based on
Jon Bentley's work.

------
EchoAbstract
Does any browser actually rely on WebKit's JS Core? I know Safari has Nitro
and Chrome has v8, so I'm not sure what would use this code except maybe
konqueror.

~~~
bla2
nitro == jscore

~~~
EchoAbstract
If that's the case, then why does UIWebView under-perform Mobile Safari? (I
know Mobile Safari uses nitro, but I have no idea what UIWebView uses under
the covers). Also, Apple's known for being a bit slow to contribute some
things back to WebKit. I wouldn't be surprised to see them not contributing
nitro back.

~~~
illamint
I think they're both using Nitro (depending on who you ask:
[http://ariya.ofilabs.com/2012/06/nitro-javascriptcore-and-
ji...](http://ariya.ofilabs.com/2012/06/nitro-javascriptcore-and-jit.html)),
but UIWebViews don't have JIT compilation turned on (no executable memory
pages allowed in iOS apps), so they're slower.

------
mbostock
This is part of the reason why I ported Vladimir Yaroslavskiy's dual-pivot
quicksort to JavaScript
([https://github.com/square/crossfilter/blob/master/src/quicks...](https://github.com/square/crossfilter/blob/master/src/quicksort.js))
for Crossfilter (<http://square.github.com/crossfilter>). It is much, much
faster than the native sort. I would guess that WebKit uses this slower
algorithm because it's an easy way to implement stable sort. If I recall,
Chrome was one of the earlier browsers to implement a non-stable sorting
algorithm for JavaScript and it broke various pages that assumed stable sort
(such as tables that let you sort by multiple columns).

Fun side-note: you can use a matrix diagram to visualize how your browser
sorts, letting you determine your browser's sort algorithm visually. For
example, Chrome uses median-of-3 quicksort, and Firefox uses mergesort. See
for yourself!

<http://bost.ocks.org/mike/shuffle/compare.html>

[http://www.flickr.com/photos/mbostock/sets/72157628971703067...](http://www.flickr.com/photos/mbostock/sets/72157628971703067/detail/)

~~~
bdash
Can you point me at some code comparing the performance of your JavaScript
quicksort implementation to the browser's native sort? I threw together a
quick comparison myself and couldn't see any cases where the JavaScript
implementation was significantly faster than JavaScriptCore's
Array.prototype.sort, and there were a few cases where Array.prototype.sort
was significantly faster than the JS implementation.

EDIT: One interesting case is that of sparse arrays (e.g., the arrays which
trigger JavaScriptCore to fall down the slow path that prompted this
discussion). It seems that your JavaScript quicksort implementation doesn't
handle these correctly: it gives different results from Array.prototype.sort
and is dramatically slower as well. I'm guessing the performance impact comes
from the fact that from JavaScript's point of view the holes in the array are
identical undefined values. I'm not sure why correctness would be affected
though.

------
damian2000
I don't know much JS but shouldn't there be a quicksort library function for
that?

------
rodneyrehm
[http://blog.rodneyrehm.de/archives/14-Sorting-Were-Doing-
It-...](http://blog.rodneyrehm.de/archives/14-Sorting-Were-Doing-It-
Wrong.html#stability) may be of [some] interest here, too.

------
jere
> // "Min" sort. Not the fastest, but definitely less code than heapsort // or
> quicksort, and much less swapping than bubblesort/insertionsort.

Oh, OK. Glad you could save yourself some time/LOC.

~~~
calinet6
While you're correct to criticize to a degree, there is also something to be
said for reducing complexity and preventing bugs by avoiding unnecessarily
long algorithms.

------
danso
FWIW, this animated demo of the comparative sorting efficiencies
<http://www.sorting-algorithms.com/>

------
lucian1900
Interesting. Perhaps a quicksort/heapsort/mergesort should be written in JS
instead and used by webkit for large collections. It'd be a lot less code than
in C++, that's for sure.

~~~
rodneyrehm
While I doubt a javascript-space sorting implementation runs any faster than a
C++ implementation could, there are some implementations around,
<http://millermedeiros.github.com/amd-utils/array.html#sort> for example.

~~~
simonster
JavaScript sorting algorithms can definitely be faster if you are passing a
comparator function to Array.prototype.sort, depending on the JS engine. See
[https://groups.google.com/forum/#!msg/mozilla.dev.tech.js-
en...](https://groups.google.com/forum/#!msg/mozilla.dev.tech.js-
engine/Ct6JsfJEjdI/1Vz5LJxnkXYJ), where Boris Zbarsky shows that translating
quick sort into JavaScript produces an algorithm that runs 7X faster than
Array.prototype.sort(function(a, b) { return a-b; }) in SpiderMonkey (which
uses an insertion + merge sort in C++ for Array.prototype.sort) because of the
expense of making calls from C++ to JS. Chrome's sort algorithm is written in
JavaScript so it doesn't suffer from this.

