
Quicksort (1961) [pdf] - jpelecanos
https://www.cs.ox.ac.uk/files/6226/H2006%20-%20Historic%20Quicksort.pdf
======
dsacco
It's fascinating that this is only six pages long. The complexity analysis and
corresponding math are light enough to be both straightforwardly
comprehensible and presented inline with the body of the paper (instead of
relegated to a "Theorem Appendix" at the end). The entire paper is
exceptionally readable.

In contrast, new research that could be called "fundamental" in algorithms and
theoretical computer science is typically several times longer and more
complex. On one hand, it seems intuitive that the more mature a field is, the
higher the prerequisites for review and contribution. Computer science papers
from the 60s are much easier to read today than what's been published in the
JACM since 2000, and modern papers in pure mathematics research are dense and
incomprehensible (and commensurately more difficult to publish results in)
even when placed next to modern research in computer science.

On the other hand, a fundamental improvement to Quicksort was developed in
2009, and the author's paper comes in at only five pages [1]. Is that because
the improvement is "minor", because the author is exceptional at explaining
the fundamental ideas clearly and succinctly, or because the author was
relatively lazy about putting in the things usually demanded for publication
(long sections on methodology, historical context, problem motivation,
complexity analysis in different implementation contexts...)? It's hard to
know whether some of these academic papers actually require their significant
length, or whether the presentation of the material is simply inefficient.
Cuckoo hashing [2] was developed in 2001, but its paper comes in at 26 pages
despite being a fairly intuitive construction. The authors didn't just explain
the fundamental result; they included two full pages of reference citations, a
section on empirical performance characteristics, and complete details of
their experimental lab setup.
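
To make the comparison concrete, the dual-pivot idea itself is small enough to
sketch. This is my own rough Python paraphrase of Yaroslavskiy-style
partitioning, not the code from [1] (the paper's code is Java; the function
name and structure here are mine):

    def dual_pivot_quicksort(a, lo=0, hi=None):
        # Two pivots p <= q split the range into three parts:
        # elements < p, elements between p and q, and elements > q.
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        if a[lo] > a[hi]:
            a[lo], a[hi] = a[hi], a[lo]
        p, q = a[lo], a[hi]
        lt, gt, i = lo + 1, hi - 1, lo + 1
        while i <= gt:
            if a[i] < p:
                a[i], a[lt] = a[lt], a[i]
                lt += 1
            elif a[i] > q:
                while a[gt] > q and i < gt:
                    gt -= 1
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
                if a[i] < p:
                    a[i], a[lt] = a[lt], a[i]
                    lt += 1
            i += 1
        lt -= 1
        gt += 1
        a[lo], a[lt] = a[lt], a[lo]  # move pivots into final positions
        a[hi], a[gt] = a[gt], a[hi]
        dual_pivot_quicksort(a, lo, lt - 1)
        dual_pivot_quicksort(a, lt + 1, gt - 1)
        dual_pivot_quicksort(a, gt + 1, hi)

That is essentially the whole structural change from classic quicksort, which
suggests the five pages are not hiding much.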

I'm not advocating that modern papers should necessarily be shorter, but I
think it's an interesting question.

__________________________

1. http://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf

2. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.4189&rep=rep1&type=pdf

~~~
feelin_googley
This is why, when new to a subject area, I prefer reading older papers before
trying to read anything recent.

Gratuitous complexity builds over time. (And there is no subject area where I
have seen such gratuitous complexity as with computers.)

I have always thought this was because authors assume knowledge of the earlier
papers and/or have become "too familiar" with the concepts, but I could be
wrong. In the latter case, perhaps an analogy would be a singer who has to
perform the same song thousands of times. When an artist sings it for the
999th time, it may sound different, and maybe there is an assumption that
everyone has heard it before.

Whatever the reason, it is common to see the fundamental concepts either
glossed over or not even mentioned in recent papers.

The exception to this, for me, where subsequent papers are helpful, is when
they succinctly summarise prior research or track a subject area over time,
e.g., annual reviews.

Anyway, when I am searching for older papers, it often feels
"counterintuitive", since I see many people studying an area for the first
time who want only the latest research. Perhaps an analogy here could be
software: there seems to be a fascination on HN with software that is "new"
and changing on a daily basis, versus software that is finished and has been
in use for a long time.

A related idea is university textbooks. Many students believe they must have
the latest edition, for a variety of reasons. However, the changes from one
edition to the next may be relatively small and easily summarized. IME, a
topic is sometimes explained more clearly in an earlier edition. Reading the
earlier edition's treatment of a topic followed by the later edition's is
sometimes illuminating in ways that reading only the latest edition is not.

In summary, I find it easier to read the old, 1-3 page paper that lucidly
discloses an original concept and then build understanding from there by
reading the subsequent research, than to read something written, e.g., this
year and assume that the old research, by virtue of its date, is just a minor
detail and no longer important.

~~~
gnufx
Right. I've noticed in multiple areas of science that the pioneers of some
technique or theory often knew how to do it properly, but lacked the right
computational (or other) technology and had to fudge it in some way that they
were clear about but later workers aren't. People then persist with that
unsatisfactory approach, or with data produced by it, long after they should
have stopped. Reading the original work is definitely worth risking a little
wasted time.

------
jkuria
I had the pleasure of attending a talk by Tony Hoare when I was at Microsoft
in 2007. He was a staff member at Microsoft Research UK and was visiting the
Redmond campus. Having studied and appreciated Quicksort in freshman CS, it
was like being in the presence of God :)

------
AJRF
Tony was 27 when he came up with Quicksort. Impressive.

------
josephv
Quicksort should be the first sort you try on a random dataset. A naive
implementation is nearly always competitive with the best algorithm for your
particular workload. And it can be tweaked to specialize on datasets.

~~~
fnbr
How can you tweak it to specialize?

~~~
slaymaker1907
Probably on how you select your pivot.
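
E.g., something like this, a rough sketch with a pluggable pivot rule (the
helper names are made up for illustration):

    import random

    def median_of_three(a, lo, hi):
        # Classic tweak: take the median of the first, middle, and last
        # elements, which avoids the worst case on already-sorted input.
        mid = (lo + hi) // 2
        return sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])[1][1]

    def random_pivot(a, lo, hi):
        # Another common tweak: a random pivot makes the O(n^2) worst
        # case vanishingly unlikely for any fixed input.
        return random.randint(lo, hi)

    def quicksort(a, lo=0, hi=None, choose_pivot=median_of_three):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        p = choose_pivot(a, lo, hi)
        a[p], a[hi] = a[hi], a[p]          # move pivot to the end
        pivot, store = a[hi], lo
        for i in range(lo, hi):            # Lomuto-style partition
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]  # pivot into final position
        quicksort(a, lo, store - 1, choose_pivot)
        quicksort(a, store + 1, hi, choose_pivot)

If you know something about your data (mostly sorted, lots of duplicates,
etc.), the pivot rule is the knob to turn.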

------
vander_elst
Interestingly, there is no pseudocode describing the solution.

~~~
robotresearcher
There's a reference to a paper that contains ALGOL code instead.
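
For anyone who wants the missing pseudocode, here is a rough Python
transcription of the scheme the paper describes (scan inward from both ends,
exchange out-of-place pairs, recurse on the two parts). This is my paraphrase,
and the middle-element pivot choice is mine, not Hoare's:

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        pivot = a[(lo + hi) // 2]   # Hoare called this value the "bound"
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:     # scan right for a large element
                i += 1
            while a[j] > pivot:     # scan left for a small element
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        quicksort(a, lo, j)         # recurse on the two partitions
        quicksort(a, i, hi)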

------
Ono-Sendai
Quicksort is great, but the name is not. It would have been better named
'partition sort'.

------
ericfrederich
6 minutes 47 seconds to sort 2,000 items.

We've come a long way. Python isn't even the fastest of languages:

    $ time python3.6 -c "import random; l = random.choices(range(1_000_000), k=2_000); l.sort()"
    real	0m0.033s

... and this also takes into account creating the 2k random ints.

~~~
kccqzy
And Python doesn't even use quicksort any more; the built-in sort has been
Timsort for years.

------
unicorncode
Nice find, bookmarked for nostalgia (even though I wasn't born yet).

