I read the documentation he linked to. It seems odd to me that there was no mention of an N-pivot quicksort. Unless I missed it. It seems like the burden would be on proving that 2 is the optimal value of N.
With two pivots, you can start the partitioning at either end and have a well-defined location for each of the three partitions (at the top, at the bottom, and in between). With more than three partitions, you would not know where to put the second through second-to-last borders without first sorting the array. You could just guess, but that would probably result in quite a lot of additional array shifting (lucky if you have a dedicated blitter chip for that :-)).
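For anyone who hasn't seen how that works out in code, here's a rough sketch of a dual-pivot sort I put together for illustration (simplified, not the actual JDK implementation; all identifiers are mine). The point is that each of the three regions has a well-defined home while the scan is still in progress:

// Simplified illustration of dual-pivot partitioning, not the JDK's code.
static void dualPivotSort(int[] a, int left, int right) {
    if (left >= right) return;
    if (a[left] > a[right]) swap(a, left, right);     // make sure p1 <= p2
    int p1 = a[left], p2 = a[right];
    // invariant: a[left+1..lt-1] < p1, a[lt..k-1] between the pivots,
    // a[gt+1..right-1] > p2, a[k..gt] not yet examined
    int lt = left + 1, gt = right - 1, k = left + 1;
    while (k <= gt) {
        if (a[k] < p1)      swap(a, k++, lt++);       // grows the bottom region
        else if (a[k] > p2) swap(a, k, gt--);         // grows the top region; re-check a[k]
        else                k++;                      // stays in the middle region
    }
    swap(a, left, --lt);                              // drop the pivots into their final slots
    swap(a, right, ++gt);
    dualPivotSort(a, left, lt - 1);
    dualPivotSort(a, lt + 1, gt - 1);
    dualPivotSort(a, gt + 1, right);
}

static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

With a third pivot there would have to be a fourth boundary somewhere in the middle of the array, which is exactly the part you can't place without guessing.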
You can make a quicksort-ish algorithm with an N log N worst case by letting the number of pivots increase with N. (Details and proof, of course, in Knuth. The details are nontrivial but not all that painful.) I don't think anyone has ever proposed that it's worth doing in practice.
Using more pivots is going to make the code more complicated. The benefits of moving from one pivot to two seem to be real but not enormous; I'd be pretty surprised if moving to three pivots produced benefits worth the cost. But of course that's just handwaving, and indeed it would be interesting to measure.
My copy of Knuth only mentions that algorithm, with a reference to a 1970 paper.
But what I would find more interesting is how the parallelizable Batcher and Pratt algorithms described by Knuth fare on modern multicore machines with their weird cache hierarchies.
If you fix the number of pivots when you write the algorithm, they can be sorted in constant time. It just requires a lot of if-else branching...
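For example, with a fixed set of three pivot candidates you could order them with a hard-coded comparison network rather than a loop. A hypothetical sketch (names are mine), just to illustrate the constant-time idea:

// Illustration only: with exactly three candidates,
// three compare-and-swaps put them in order.
static void orderThreePivots(int[] p) {
    if (p[0] > p[1]) { int t = p[0]; p[0] = p[1]; p[1] = t; }
    if (p[1] > p[2]) { int t = p[1]; p[1] = p[2]; p[2] = t; }
    if (p[0] > p[1]) { int t = p[0]; p[0] = p[1]; p[1] = t; }
}

The branch count grows quickly with the number of pivots, but for any fixed count it is still O(1).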
int third = len / div;
third = third > 0 ? third : 1;
// "medians"
int m1 = left + third;
int m2 = right - third;
Instead of the worse:
int third = len / div;
// "medians"
int m1 = left + third;
int m2 = right - third;
if (m1 <= left) {
m1 = left + 1;
}
if (m2 >= right) {
m2 = right - 1;
}
I especially appreciate the proof; that's above and beyond the call of duty. However, I wouldn't be surprised if this were still slower in most real-world cases than timsort.
Java can use multiple sorting algorithms with different trade-offs. Merge sort algorithms have a good worst-case scenario, but are somewhat slower in many average cases. They also need more memory. By contrast, quick sort algorithms have a very bad (quadratic) worst case, but tend to be faster in the average case.
The bug you pointed at was a replacement of Java's merge sort with a merge sort that does a better job of speeding up when it runs across runs of already-sorted data. This one is a replacement of quick sort with a variation that has a better average running time.
So what is the difference between a merge sort and a quick sort? Well, both are divide-and-conquer algorithms. Merge sorts are based on splitting a list into two halves, sorting each half, then merging the two lists. Quick sorts are based on taking an element from the list, called the pivot, then dividing the list into elements that are larger than, smaller than, or equal to the pivot. The larger and smaller lists are then sorted in turn. If you're clever about how you set things up you can do a quick sort in place, and this is faster than having to allocate and then free up memory.
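As a concrete (if simplified) illustration of the in-place idea, here's a minimal single-pivot partition sketch; the identifiers are mine and this is not the code under discussion:

// Simplified single-pivot, in-place partition (Lomuto-style), for illustration.
static int partition(int[] a, int lo, int hi) {
    int pivot = a[hi];                        // last element as the pivot
    int i = lo;                               // a[lo..i-1] holds elements smaller than the pivot
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;    // put the pivot between the two halves
    return i;                                 // the pivot's final position
}

Quick sort then just recurses on a[lo..i-1] and a[i+1..hi]; nothing is ever copied to a second array, which is the memory advantage over merge sort.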
This algorithm is a variation on quick sort that uses 2 pivots rather than 1. The win over a traditional quick sort is that you can sort in place with the same number of comparisons and less moving of data around in memory. This is a win in two ways. First of all, you simply move less data around. Possibly more important is that if you're doing the same number of comparisons but swapping less often, those comparisons have more predictable results. If you understand CPU architecture you'll note that this means fewer pipeline stalls, which really helps performance.
I thought "fat pivot" meant the approach to quicksort where you partition into { smaller, equal, larger } and then sort the three subarrays instead of two. I'm not sure what the "fat pivot problem" would be. Anyway, this new approach is a generalization of that -- fat pivoting is the special case where the two pivots are equal.
I think you're right. I'm not sure if it's significant (because potentially there are fewer passes), but you end up having to swap more within a single pass. You can also get a stable in-place sort by treating the equal elements specially, which is why I'd do it.