A Killer Adversary for Quicksort (1999) [pdf] | 52 points by beagle3 on Oct 30, 2015 | 22 comments

 I used this technique to prove libc++'s implementation of std::sort is O(n^2). There's still an open bug report for this in libc++: https://llvm.org/bugs/show_bug.cgi?id=20837

On a related note, I'm talking to the libc++ and libstdc++ developers about replacing introsort with pdqsort (https://github.com/orlp/pdqsort) as the implementation of std::sort. It solves the worst case in a subtly different (and potentially more efficient) way than introsort. Although that's not the real selling point.

EDIT: I forgot I still use my old username on Hacker News. To prevent confusion: I'm also the author of pdqsort.
 I made an implementation of this algorithm as a university assignment. Here's a vanilla implementation of Quicksort: https://github.com/leonid-shevtsov/uni-programming/blob/mast... and here's the function to generate data that causes the custom implementation to go O(n^2): https://github.com/leonid-shevtsov/uni-programming/blob/mast...
 I've had bad commercial Quicksort implementations go N^2 on constant data, http://www.win-vector.com/blog/2008/04/sorting-in-anger/ . Kind of sours one on the "Quicksort is essentially elegant" idea.
 I don't get how you could create an adversarial input in advance for an implementation that uses a randomized partition (index of pivot drawn from a uniform distribution). Can anyone fill me in?

What this article describes, as far as I can tell, is an adversary that modifies the input while quicksort is working. How realistic is that?
 That algorithm is only used to create an input pattern. When that pattern is fed back to the same sorting function, it will cause it to go quadratic. If your target is using the same sorting function, it will do the same on their side.

This attack is realistic. I have tried it on the glibc qsort, generic quicksort, and Visual Studio qsort, and they were all vulnerable.
 I'm interested in the way you use the word "attack."

I was looking at it from the perspective that you could never run into an input that would consistently sort in O(n^2) using randomized quicksort. But your perspective (maybe this article's perspective) is that some malicious adversary could force the sorter to do extra work by knowing in advance the steps that it will take. Am I getting that right? Like from a FLOPS security standpoint?

If so, why care about this situation? This only seems like a realistic scenario in very weird domains, like Amazon Web Services messing with everyone's calls to some standard sorting code to charge them more cents for compute time or something...
 If I'm not mistaken, the attack is aimed at a quicksort implementation where the pivot value is chosen as a median of first, middle, and last values. It would not work on randomized quicksort, of course.
 Take a look at denial-of-service attacks.
 What's the problem? From D. Knuth's The Art of Computer Programming: Sorting and Searching:

(1) If sorting n records by comparing pairs of keys, the Gleason bound shows that on average you can't sort faster than O( n ln(n) ).

(2) Heap sort sorts n records in both the worst case and the average case in time O( n ln(n) ).

Net: when sorting by comparing pairs of keys, in main memory, on a simple machine architecture, with a single processor, ..., you can't improve on Heap sort. So, just use Heap sort. That's what I do.
 Big O hides all constants. A hybrid algorithm using Quicksort for the heavy lifting is O(n log n) in the worst case, but is significantly faster than heapsort, mainly due to cache effects.

See pdqsort for examples (benchmarks included): https://github.com/orlp/pdqsort. It shows that on average pdqsort is twice as fast as heapsort on the random input distribution.
 Now and for long the usual criterion of performance has been big-O notation, and that is what I considered: with the Gleason bound and heap sort I showed that, under various common assumptions, you can't do better than heap sort. As far as I know, I was correct. So, to claim to beat heap sort, you have to set aside some of the usual assumptions and the big-O criterion.

Quicksort is great with a cache since it has great locality of reference when working with the partitions that fit into the cache. But the worst case of Quicksort is still O( n^2 ), so what happens if Quicksort encounters such a partition?

Just intuitively, it appears that heap sort has a tough time making much exploitation of usual cache designs. We know that. There is a modification of heap sort that helps it work better when it has to do virtual memory paging.

I used to get torqued at big-O just because it ignores constants; as a programmer, I worked hard to do well with such constants. But in the end I gave in and accepted big-O notation as the most appropriate simple, single criterion. And for judging algorithms by how they exploit the cache, I gave up on that, also: broadly, it's the responsibility of such speedup techniques to work on ordinary code written for a simple processor architecture, not the responsibility of such code to do tricky things to exploit tricky architectures.

In the land of big-O, a factor of only 2 is considered not very exciting.

If you relax some of the assumptions, then you can beat heap sort and O( n ln(n) ): use radix sort. That was the sorting algorithm of the old punch card sorting machines. E.g., if you have 100 billion records to sort where each key is an alphanumeric string 10 characters long, then radix sort does all the work in just 10 passes over the 100 billion records. Or, for keys m characters long, you can sort n records in O( nm ). So that's faster when m < ln(n), or, to make the arithmetic easier, m < log(n). And log(100 billion) = 11, so radix sort should be faster with m = 10 and n = 100 billion.

So, for a hybrid sort routine, consider also radix sort. Am I excited about radix sort? Nope!
 > But, worst case of Quicksort is still O( n^2 ) so what happens if Quicksort encounters such a partition?

I'd strongly suggest you read the readme I wrote for https://github.com/orlp/pdqsort, which explains in detail how I beat heapsort (and other sorting algorithms). It also shows how I prevent the O(n^2) worst case.

> I used to get torqued at considering big-O, and did this just because it ignores constants; as a programmer, I worked hard to do well with such constants. But in the end I gave in and gave up and just accepted big-O notation as the most appropriate, simple, single criterion.

We have already established that O(n log n) is the lower bound, and we have reached this lower bound. This is where big O stops being useful. A metric that compares equal for every relevant algorithm (mergesort, introsort, pdqsort, timsort, smoothsort, heapsort) is not a useful metric.

> In the land of big-O, a factor of only 2 is considered not very exciting.

I live in the land of real-world performance, powered by actual benchmarks. In this land a factor of 2 is very exciting.

> So, for a hybrid sort routine, consider also radix sort.

I worked within the scope of std::sort, which is strictly a comparison sort.
 > We have already established that O(n log n) is the lower bound, and we have reached this lower bound. This is where big O stops being useful. A metric that compares equal for every relevant algorithm (mergesort, introsort, pdqsort, timsort, smoothsort, heapsort) is not a useful metric.

With the explosion of cores and special-purpose hardware, it is interesting to think about how long it will be before sorting-network routines are more practical. Dropping the runtime to O(lg n) would be enticing. :)
 I'm impressed you can write such a long story yet address none of my points. You should consider becoming a politician.

The only relevant piece of information I could find in your anecdote is this:

> I assumed that we were talking about in place sorting.

If we limit ourselves to in-place sorting algorithms then we are still left with block sort, pdqsort, introsort and heap sort, all of which run in O(n log n). My argument still holds.
 > If I understand this statement correctly, I suspect that for some arrays of input keys, the resulting algorithm won't actually sort the keys. That is, we won't have a sorting algorithm.

You don't understand the statement correctly. However, this is not a trivial realization. Because pdqsort always recurses on the left partition first, every element to the left of the current recursive call is less than or equal to every element within the current recursive call. So if we find that the selected pivot is equal to an element before the current recursive call, we can conclude that the current call can only contain elements equal to the pivot, and none smaller.

> Maybe your remark "astronomically small" is just a side comment and not relevant to PDQSORT having worst case running time of O( n ln(n) ).

Correct.

> For this and the associated documentation, I don't see a solid argument that PDQSORT has worst case running time of O( n ln(n) ).

There is a very simple argument. The 'bad partition counter' is initially log(n). Each bad partition can cost up to O(n) work, and this can happen log(n) times, so we do at most O(n log n) work on bad partitions. If we then encounter another bad partition, we switch to heapsort, which is O(n log n). And if we encounter fewer than log(n) bad partitions, Quicksort performed well enough to be O(n log n) on its own.

For a similar approach, read up on introsort.
 Okay.
 > In the land of big-O, a factor of only 2 is considered not very exciting

In the land of solving real-world problems a factor of 2 is extremely significant, especially when you're trying to get the most performance out of a power-constrained system. In fact, even a 1% increase in performance is significant.
 > Now and long the usual criterion of performance is big-O notation, and that is what I considered, with the Gleason bound and heap sort and, thus, showed that, with various common assumptions, can't do better than heap sort. As far as I know, I was correct.

That's a little short-sighted for real-life performance. Memory access patterns, cache performance, and the hidden constant factor are all relevant for real performance. For example, there's a 200x difference between main memory and cache access times. log2(1000000000000) is ~40, meaning poor cache performance can impact performance by more than a factor of log(n) in big-O notation.

There are also many cases where algorithms with better big-O performance have constant factors so large that they perform worse for common problem sizes. Matrix multiplication is a good example. Strassen's algorithm has a better big-O than the straightforward algorithm, but its constant factor is large enough that it only becomes faster on matrices larger than maybe 100x100. And there are algorithms even better than Strassen's, but their constant factors are so big they're effectively never better than Strassen's on real hardware.
 You should check out the sort your parent linked. It talks all about what happens and the overhead involved. ;-)
 In theory, there's no significant difference between theory and practice. In practice, things are never so simple.

When you use Big-Oh notation, O(f(n)), you claim that there exists one function F(n) that accurately characterizes the running time of an algorithm based on the size of the input, and that f(n) is the highest-order component of that function. It looks something like this:

    F(n) = A*f(n) + B*g(n) + C*h(n) + ... + K

where A, B, C, ..., K are constants and f, g, h, ... are functions. You approximate F(n) ~= A*f(n) because, for n big enough, this component will dominate the growth of the function. But A is different for different algorithms. So, in the case of sorting, you have several options (quicksort, mergesort, heapsort) that are on paper all "the same", but each has a different constant, and there are real advantages to using the one with the smallest constant.

Furthermore, if you are not trying to sort infinitely big data (most real-world cases), you must consider that other components of the actual F(n), such as g and h, may dominate the performance of the algorithm in the range of data sizes you are most likely to encounter. There are other measures of algorithmic complexity that deal with this (I recall there is a little-o notation, among others), but they don't teach you that at school unless you are a postgraduate student doing highly relevant work in complexity theory. I suspect most guys, like myself, find out by not taking Knuth as holy gospel and reading further material.

And then there are the degenerate cases, where your data can get sorted in quasi-linear time if you apply one of the slow dumb O(n^2) dinosaurs.
