

Writing a Non-Blocking JavaScript Quicksort - josep2
http://www.breck-mckye.com/blog/2015/06/writing-a-non-blocking-javascript-quicksort/

======
tantalor
The obvious solution is to use a Web Worker,

    
    
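      // Worker body, loaded from a Blob URL. Note that sort() with no
      // comparator orders elements as strings.
      var code = "onmessage = function (evt) {evt.data.sort(); postMessage(evt.data)}";
    
      function asyncSort(data, cb) {
        var url = URL.createObjectURL(new Blob([code]));
        var worker = new Worker(url);
        worker.onmessage = function (evt) {
          worker.terminate();       // one worker per call, so release it
          URL.revokeObjectURL(url);
          cb(evt.data);
        };
        worker.postMessage(data);
      }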
    

Example,

    
    
      asyncSort([3, 2, 1], function (res) { console.log(res); });
    

Prints,

    
    
      [1, 2, 3]

~~~
amelius
It is not efficient to use a webworker when transferring the data into it and
out of it takes O(N) time.

~~~
ougawouga
Not with transferable objects.

~~~
amelius
From [1]:

> when transferring an ArrayBuffer from your main app to Worker, the original
> ArrayBuffer is cleared and no longer usable

This is really too restrictive in many situations.

[1]
[https://developers.google.com/web/updates/2011/12/Transferable-Objects-Lightning-Fast](https://developers.google.com/web/updates/2011/12/Transferable-Objects-Lightning-Fast)

~~~
RandomBK
That may be what the API requires, but is it actually implemented as a
physical move/copy of memory? It sounds like they can just make the array
_appear_ to be cleared, while keeping it in place in memory.

~~~
yoklov
The way I understand it is that they usually just move the pointer to the
ArrayBuffer's underlying data store. No block copy occurs.
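
A minimal sketch of what the sender sees after a transfer, assuming a runtime
that provides `structuredClone` (recent browsers, Node 17+). `structuredClone`
with a `transfer` list goes through the same transfer step as
`worker.postMessage(data, [data.buffer])`, so it shows the effect without
needing a worker. The name `transferDemo` is made up for illustration.

```javascript
// Hypothetical helper: moves a typed array's buffer the way postMessage
// with a transfer list would, and reports what each side sees afterwards.
function transferDemo(values) {
  var sender = new Float64Array(values);
  var bytesBefore = sender.buffer.byteLength;

  // Transfer (not copy) the underlying ArrayBuffer.
  var movedBuffer = structuredClone(sender.buffer, { transfer: [sender.buffer] });
  var receiver = new Float64Array(movedBuffer);
  receiver.sort(); // typed arrays sort numerically by default

  return {
    bytesBefore: bytesBefore,
    senderBytesAfter: sender.buffer.byteLength, // a detached buffer reports 0
    senderLengthAfter: sender.length,           // the sender's view is emptied too
    received: Array.from(receiver)
  };
}
```

The sender is left with an empty buffer, which is the restriction quoted
above. Whether the engine moves a pointer or copies bytes internally is not
observable from script, so this sketch says nothing about that part.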

------
stestagg
I've actually got a bunch of sort algorithms written as async javascript:

[http://stestagg.github.io/sort/](http://stestagg.github.io/sort/)

------
KayEss
The 'long tail' really oughtn't be that long if they switched to the native
sort when a partition becomes small enough.

~~~
drostie
That's going to be tricky to do with quicksort... you could definitely do it
with mergesort, though, but you'd have to abandon the in-place guarantee.

The basic problem is that Array.prototype.sort() does not accept indexes for a
slice of the array to sort between.
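
One way around the missing index arguments, sketched under the assumption
that a temporary copy of the range is acceptable (so the sort is no longer
strictly in place). `sortRange` is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: sorts arr[start..end-1] using the native sort.
// It copies the range out, sorts the copy, and writes it back, so it
// uses O(end - start) extra memory.
function sortRange(arr, start, end, compare) {
  var sorted = arr.slice(start, end).sort(compare);
  for (var i = 0; i < sorted.length; i += 1) {
    arr[start + i] = sorted[i];
  }
  return arr;
}

var data = [9, 8, 7, 3, 2, 1, 0];
sortRange(data, 2, 6, function (a, b) { return a - b; });
// data is now [9, 8, 1, 2, 3, 7, 0]
```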

~~~
KayEss
It's simple enough if you split the list at the partition point, but hard if
you do it in place.

------
mpweiher
Is computation becoming a special case of being blocked (on I/O)?

I remember seeing this in an event-driven server framework (computation as the
special case), this usage of "non-blocking" suggests it might be a trend.

~~~
mattdw
In the browser, Javascript (outside of webworkers) is single-threaded, and
while running it blocks the browser run-loop (for the given webpage.) This
means the page can't accept or respond to any incoming events. It's "blocked"
from interaction.

~~~
mpweiher
Yes, I am aware of that.

It used to be "blocked" meant "process is waiting for an external event,
usually I/O". Doing computation was typically referred to as "busy". "Non-
blocking" meant "without putting the process to sleep".

There seems to be a shift in the meaning of this particular piece of
terminology that I find interesting, because it seems to reflect the reality
that "computers" actually do very little "computing". Instead computers mainly
communicate, and even the CPUs themselves probably spend most of their time
waiting on main memory (60ns) rather than doing actual arithmetic (sub ns in
most cases).

~~~
couchand
Just to get a little more specific with this: the shift is in the accepted
"main engine of computation" (to make up a mediocre term). We used to have
something of an engineer's viewpoint, that the physical location of the
electrical operation of the mathematics (i.e., the CPU) was this main engine.
It's been a long road, but the new viewpoint is a bit more philosophical: the
main engine is the human-computer nexus.

Of course, "human" in the above isn't strictly correct, since we know that
computers talk to each other all the time, too. Indeed, when that happens the
main engine of computation isn't really one or the other of the computers, but
the link between them (i.e., the protocol).

So now "blocking" means requiring the main engine of computation to "sleep"
and wait for "external input". For a human UI this means the classic sense of
waiting on the wetware end, but it can just as easily mean waiting on the
hardware end.

We strive to build responsive systems, so we wish for them to be
"non-blocking" in every case. This means that if the human wants to
communicate with the computer, they shouldn't be prevented from doing so
because the computer is currently working on a solitary operation. Ultimately
it's the same principle we've already accepted from the reverse direction.

------
gridspy
You need to add a test at the start of your quicksort function - when the
number of elements for this invocation is < 500, invoke native sort on the
current subset.
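
A rough sketch of that cutoff test, assuming the data lives in a typed array,
where `subarray` returns a view on the same memory so the native sort can work
on a subset in place. The threshold of 500 is the figure suggested above; the
name `hybridQuicksort` and the partition scheme are illustrative choices, and
the version here is synchronous to keep the cutoff logic easy to see.

```javascript
var CUTOFF = 500; // threshold suggested above; worth tuning by measurement

// Sorts view[start..end-1] in place and returns view.
function hybridQuicksort(view, start, end) {
  if (start === undefined) { start = 0; }
  if (end === undefined) { end = view.length; }

  if (end - start < CUTOFF) {
    // subarray() shares memory with `view`, so the native sort
    // operates directly on this subset.
    view.subarray(start, end).sort();
    return view;
  }

  // Lomuto partition around the middle element (moved to the end first).
  // Inputs with many equal values degrade this simple scheme.
  var mid = start + Math.floor((end - start) / 2);
  var pivot = view[mid];
  view[mid] = view[end - 1];
  view[end - 1] = pivot;
  var boundary = start;
  for (var j = start; j < end - 1; j += 1) {
    if (view[j] < pivot) {
      var tmp = view[boundary];
      view[boundary] = view[j];
      view[j] = tmp;
      boundary += 1;
    }
  }
  view[end - 1] = view[boundary];
  view[boundary] = pivot;

  hybridQuicksort(view, start, boundary);
  hybridQuicksort(view, boundary + 1, end);
  return view;
}
```

For plain arrays the same test applies, but the subset has to be copied out
and written back, since `Array.prototype.sort` takes no index range.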

------
chejazi
The main takeaway for me was lesson #4:

"setTimeout and browser timing are deceptive and shouldn’t be wholly
trusted"... and as a result, use the setImmediate API.

~~~
chejazi
Eek... "[setImmediate] is not expected to become standard, and is only
implemented by recent builds of Internet Explorer and Node.js 0.10+. It meets
resistance both from Gecko (Firefox) and Webkit (Google/Apple)."

[https://developer.mozilla.org/en-US/docs/Web/API/Window/setImmediate](https://developer.mozilla.org/en-US/docs/Web/API/Window/setImmediate)

~~~
mikekchar
Wah! I just skimmed through the Gecko bug that is linked from the above link.
I recommend reading it through if you are interested in using these
techniques on multiple browsers.

It's a bit of a mess. I think there is some misunderstanding here of what the
requirements are. I should be able to queue callbacks and have them execute
using 100% of the CPU, but still yield to browser and IO events so that the
browser (or IO processing) is responsive.

A lot of the talk is about minimum waits. For example setTimeout(0) apparently
has a minimum timeout according to the spec. It's not that I want the minimum
wait to be any particular value. It's that I want it to be as small as it can
be while still yielding to browser and IO events. While I have callbacks
queued and executing, I want the CPU to be pegged at 100%. While it is pegged
at 100%, the browser should still be responsive if my callbacks yield often.

So reading the thread it seems that on Gecko setTimeout(0) does not max the
CPU (because it is waiting -- I'm going to give this a try next time I get a
chance). Also using .then() on a native Promise does not seem to yield to the
browser (apparently this is what the spec asks for).
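
A minimal sketch of the "work hard but yield often" requirement described
above: run work in slices bounded by a time budget, then hand control back
through a task-based scheduler. `runInSlices` and its parameters are
hypothetical names. The default scheduler is `setTimeout(fn, 0)`, which may be
clamped to a minimum delay as discussed, so a different scheduler can be
passed in. Promise callbacks are deliberately not used for yielding, since
they run as microtasks before the event loop gets a turn.

```javascript
// Hypothetical helper: calls step() repeatedly until it returns false,
// but hands control back to the event loop once budgetMs has elapsed.
function runInSlices(step, done, budgetMs, schedule) {
  budgetMs = budgetMs || 10;
  schedule = schedule || function (fn) { setTimeout(fn, 0); };

  function slice() {
    var deadline = Date.now() + budgetMs;
    while (Date.now() < deadline) {
      if (!step()) {
        done();
        return;
      }
    }
    schedule(slice); // budget used up: let pending events run first
  }
  slice();
}

// Usage: add up 1..100000 without one long blocking loop.
var total = 0;
var next = 1;
runInSlices(
  function () { total += next; next += 1; return next <= 100000; },
  function () { console.log("sum:", total); }
);
```

Larger budgets keep the CPU busier at the cost of responsiveness, so the
10 ms default here is only a placeholder.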

~~~
gpvos
One of the comments in the Gecko bug links to a post by Chromium developer
James Robinson, which explains a lot of the technical details quite clearly:
[https://groups.google.com/a/chromium.org/d/msg/blink-dev/Hn3GxRLXmR0/XP9xcY_gBPQJ](https://groups.google.com/a/chromium.org/d/msg/blink-dev/Hn3GxRLXmR0/XP9xcY_gBPQJ)

~~~
mikekchar
Thanks for this. The rest of the thread is also very illuminating. I finally
understand what they are opposed to. It makes sense.

------
drostie
> There’s just one problem – now we’ve made the function asynchronous, how do
> we know when it has finished?

The most obvious thing is just to call out to a (semi-) global:

    
    
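        function quicksort(arr, cb) {
            var thread_balance = 1; // tasks scheduled but not yet finished
            // setImmediate is non-standard, so fall back to setTimeout.
            function defer(fn) {
                if (typeof setImmediate === "function") {
                    setImmediate(fn);
                } else {
                    setTimeout(fn, 0);
                }
            }
            function thread(start, end) { // sorts arr[start..end-1]
                if (end - start > 1) {
                    // Partition around the last element (Lomuto scheme).
                    var pivot = arr[end - 1], i = start, j, tmp;
                    for (j = start; j < end - 1; j += 1) {
                        if (arr[j] < pivot) {
                            tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                            i += 1;
                        }
                    }
                    arr[end - 1] = arr[i]; arr[i] = pivot;
                    thread_balance += 1; // one thread becomes two
                    defer(function () { thread(start, i); });
                    defer(function () { thread(i + 1, end); });
                } else {
                    thread_balance -= 1; // this thread is done.
                    if (thread_balance === 0) {
                        cb(arr);
                    }
                }
            }
            thread(0, arr.length);
        }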
    

This is why concurrent datatypes are often interested in simple registers that
you can only increment/decrement: they are still super-helpful for
coordinating when a workload is complete.

Taking this a little further, we can wrap setImmediate in a Promises library,
which gives you back a promise for the sorted halves of the array: the
thread(start, end) promise resolves once the [start..end-1] indices are
sorted, and in the nontrivial case returns Promise.all([thread(start, pivot -
1), thread(pivot, end)]) (the promise library's merge of the two promises to
complete the work). Same idea really.
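
A sketch of that promise-based variant, using native `Promise` rather than a
library and `setTimeout(fn, 0)` for yielding, since setImmediate is not
available everywhere. The names `nextTurn` and `quicksortAsync`, the partition
scheme, and the exact index bounds passed to the recursive calls are
assumptions made for this example.

```javascript
// Resolves on a later turn of the event loop, so pending events can run.
function nextTurn() {
  return new Promise(function (resolve) { setTimeout(resolve, 0); });
}

// Returns a promise for arr, sorted in place using "<" comparisons.
function quicksortAsync(arr) {
  // Promise that resolves once arr[start..end-1] is sorted.
  function thread(start, end) {
    if (end - start <= 1) {
      return Promise.resolve(); // trivially sorted
    }
    // Lomuto partition around the last element.
    var pivot = arr[end - 1];
    var boundary = start;
    for (var j = start; j < end - 1; j += 1) {
      if (arr[j] < pivot) {
        var tmp = arr[boundary];
        arr[boundary] = arr[j];
        arr[j] = tmp;
        boundary += 1;
      }
    }
    arr[end - 1] = arr[boundary];
    arr[boundary] = pivot;

    // Yield, then sort both halves; Promise.all joins the two promises.
    return nextTurn().then(function () {
      return Promise.all([thread(start, boundary), thread(boundary + 1, end)]);
    });
  }
  return thread(0, arr.length).then(function () { return arr; });
}

quicksortAsync([5, 3, 8, 1]).then(function (sorted) {
  console.log(sorted);
});
```

No counter is needed here: the promise returned by the outermost `thread`
call only resolves after every nested `Promise.all` has resolved.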

~~~
mattLummus
I was also curious about the lack of promises

------
scotty79
Wouldn't it be faster to run the built-in sort on each 1/8 of the array, then
on each 1/4, each 1/2 and the whole array at the end?

Does Array.sort run much faster when the array already has some order?

EDIT: It doesn't. Sorting even a totally sorted array takes as long as sorting
a shuffled array.

------
amelius
Instead, one could use an altjs compiler to generate nonblocking code. For
example, ghcjs does this.

