
A numerical analysis of Quicksort: How many cases are bad cases? [pdf] - edofic
http://arxiv.org/pdf/1507.04220v1.pdf
======
coherentpony
If you're going to post articles from arXiv, please post the top-level
abstract page so I can read the abstract before being forced to download a
PDF document.

[http://arxiv.org/abs/1507.04220](http://arxiv.org/abs/1507.04220)

------
BetaCygni
> An attempt to solve this problem has been randomization as shown already in
> Hoare’s first articles on Quicksort [1]. On the other hand, this does not
> change the statistical probability for bad cases.

I really like the randomization solution. Worst case? What worst case?

~~~
matslina
Well, the worst case is still there. You're just going to have a very hard
time crafting an input that triggers it.
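
A minimal sketch of what pivot randomization looks like (the function name is
mine, and this version allocates new lists rather than partitioning in place
as Hoare's algorithm does): the O(n^2) worst case still exists, but no fixed
input triggers it reliably, since the bad splits now depend on random choices
the adversary doesn't control.

```python
import random

def quicksort(a):
    """Quicksort with a uniformly random pivot (illustrative sketch).

    No particular input is consistently bad: whether the splits are
    unbalanced depends on the random pivot draws, not on the input order.
    """
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```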

------
Kenji
I remember, at uni, in the second semester, we had an assignment to
specifically craft a sequence of n numbers 1..n that would trigger the worst
case (#permutations) in a quicksort algorithm that always takes the first
number in the array as the pivot. It was a nightmare. I spent days on it, and
I was barely able to scrape together a notation to specify such a series.
Turns out the sample solutions were like "The appropriate sequence is not easy
to write. The sequence must be designed such that every chosen pivot halves
the area that will be stored." Well, yeah.

EDIT: I wrote worst case #permutations. That's the best case #comparisons. So,
no, I don't mean a sorted sequence ;)

~~~
rachbowyer
If quicksort always takes the first element in the array as the pivot, then an
array that is already sorted is the worst case.
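
A quick way to see this is to count comparisons for a first-element-pivot
quicksort on sorted versus shuffled input (a hypothetical sketch; the counter
tracks one comparison per element partitioned, as an in-place partition
would make):

```python
import random

def qs_first_pivot(a, counter):
    # Quicksort that always takes the first element as the pivot;
    # counter[0] accumulates one comparison per element partitioned.
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    counter[0] += len(rest)
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    return qs_first_pivot(less, counter) + [pivot] + qs_first_pivot(greater, counter)

c_sorted = [0]
qs_first_pivot(list(range(100)), c_sorted)
# Already-sorted input: every partition is maximally unbalanced, so the
# count is 99 + 98 + ... + 1 = 4950 comparisons, i.e. Theta(n^2).

c_shuffled = [0]
qs_first_pivot(random.sample(range(100), 100), c_shuffled)
# A shuffled input typically needs far fewer, on the order of n log n.
```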

------
rachbowyer
If Quicksort's worst-case performance is a problem, then use Introsort
([https://en.wikipedia.org/wiki/Introsort](https://en.wikipedia.org/wiki/Introsort)),
which the paper fails to mention.
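
The core idea fits in a few lines. This is a sketch, not a production
implementation (real Introsorts work in place and also switch to insertion
sort for tiny subarrays): run quicksort, but once the recursion depth exceeds
about 2*log2(n), hand the subarray to heapsort, capping the worst case at
O(n log n).

```python
import heapq
import math

def introsort(a):
    """Sketch of Introsort: quicksort with a depth limit, falling back
    to heapsort when the limit is hit."""
    def heapsort(xs):
        heapq.heapify(xs)
        return [heapq.heappop(xs) for _ in range(len(xs))]

    def sort(xs, depth_limit):
        if len(xs) <= 1:
            return xs
        if depth_limit == 0:
            # Partitioning has gone badly; heapsort guarantees O(n log n).
            return heapsort(xs)
        pivot = xs[len(xs) // 2]
        less = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        greater = [x for x in xs if x > pivot]
        return sort(less, depth_limit - 1) + equal + sort(greater, depth_limit - 1)

    return sort(list(a), 2 * max(1, int(math.log2(max(len(a), 2)))))
```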

------
lsiebert
It strikes me that, if three-way partitioning is useful only when you are
dealing with a limited set of values relative to n, you could hash or
insertion-sort the values into an array during the first pass of the
algorithm over the whole array (stopping if the count of uniques grew bigger
than some value based on n), and then decide to three-way partition if you
hadn't stopped.
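
For reference, the three-way partition being discussed is the one-pass Dutch
national flag scheme (a sketch; the function name is mine): elements are
rearranged in place into [< pivot | == pivot | > pivot], and when there are
few distinct values the equal run is large, so quicksort never recurses into
it again.

```python
def partition3(a, pivot):
    """One-pass three-way partition of `a` around `pivot`, in place.
    Returns (lo, hi) such that a[lo:hi+1] holds the elements == pivot."""
    lo, mid, hi = 0, 0, len(a) - 1
    while mid <= hi:
        if a[mid] < pivot:
            a[lo], a[mid] = a[mid], a[lo]
            lo += 1
            mid += 1
        elif a[mid] > pivot:
            a[mid], a[hi] = a[hi], a[mid]
            hi -= 1
        else:
            mid += 1
    return lo, hi
```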

I feel like the recursive median of medians throws away a great deal of
information, given all the comparisons you make. At the very least, for the
final step, you know where the high and low values go, and could easily place
them in the appropriate sides of the array.

------
coreyp_1
I love quicksort, and enjoyed the paper. Yes, I'm a nerd. Thanks for asking.

