
Bubble Sort: An Archaeological Algorithmic Analysis - DyslexicAtheist
https://users.cs.duke.edu/~ola/bubble/bubble.html
======
tmoertel
I have an amusing archaeological anecdote about the bubble sort.

Around 1990, on my first real job out of college, I was assigned about 100K
lines of C code to maintain. I was told that it had problems scaling up to
larger networks (it was a network mapping system).

While going through the code, I discovered that it contained a bubble sort
that had been commented out. The comments said that it had been replaced with
hand-rolled sorting logic in 68K assembly for "additional speed."

So I found the assembly code for the new sorting logic. It was also a bubble
sort!

In disbelief, I replaced all this nonsense with a call to the standard
library's `qsort` function. The scaling problems disappeared :-)

~~~
gnulinux
Saved this quote to show to my future data structures students. Thanks.

------
kragen
I remember when I first took a CS class, I spent an hour in the library
attempting to bubble-sort microfiches, much to the annoyance of my supervising
librarian, who recommended a radix sort in very strong terms. Unfortunately I
had more trust in my incorrect understanding of my CS textbook than I had in
her, though she was correct.

The article unfortunately omits actual code for insertion sort and selection
sort, except the "jump-down" variant of selection sort it mentions early on.

Bentley gives the "simplest insertion sort" as follows (translated to modern
C; the version in the book is pseudocode):

    
    
        for (int i = 1; i < n; i++)
            for (int j = i; j > 0 && x[j-1] > x[j]; j--)
                swap(j-1, j);
    

I think that variant is only faster than bubble sort because it stops
iterating once the newly inserted item has reached its final resting place, on
average halfway through the sorted part of the list, so it only does about n/4
comparisons and swaps per outer-loop iteration, rather than bubble sort's n or
n/2. A better approach — usually the fastest sort for short arrays — stores
the element being inserted in an auxiliary variable until its final resting
place is found, thus avoiding almost half the work of swapping. Bentley
reports from the previous millennium that this is a speedup of about a factor
of four:

    
    
        for (int i = 1; i < n; i++) {
            data t = x[i];
            int j;
            for (j = i; j > 0 && x[j-1] > t; j--)
                x[j] = x[j-1];
            x[j] = t;
        }
    

A similar optimization is available for selection sort, which is otherwise
equivalent to bubble sort in cost, since it necessarily examines almost n/2
elements on average to produce each output element.
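Since the article omits selection sort entirely, here is a minimal sketch of my own (not Bentley's) showing the optimization in question — one swap per output element instead of bubble sort's many:

```c
#include <assert.h>

/* Selection sort: scan x[i..n-1] for the minimum, then perform a
   single swap per outer iteration -- at most n-1 swaps total,
   versus the O(n^2) swaps bubble sort can do. */
static void selection_sort(int *x, int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (x[j] < x[min])
                min = j;
        if (min != i) {   /* the one swap per output element */
            int t = x[i]; x[i] = x[min]; x[min] = t;
        }
    }
}
```

The comparison count is still ~n²/2, so this only wins when swaps are expensive relative to comparisons.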

There's an even simpler single-loop sort algorithm (I forget what it's called)
which is unfortunately O(N³):

    
    
        for (int i = 1; i < n; i++)
            if (x[i-1] > x[i])
                swap(i-1, i), i = 0;  /* i = 0, so the loop's i++ restarts at the first pair */
    

That won't work in Python, of course, but opinions may vary on whether we
should be sorry or glad about that.

~~~
vram22
>Bentley gives the "simplest insertion sort" as follows

Is that from one of his books (e.g. Programming Pearls, More Programming
Pearls, Writing Efficient Programs), or from some paper he published? I've
read parts of the first two books and most of the third book, and all are
great.

Shaker sort is an interesting variation of bubble sort (and maybe also
selection sort, according to its Wikipedia page) that sorts in both
directions.

[https://en.wikipedia.org/wiki/Cocktail_shaker_sort](https://en.wikipedia.org/wiki/Cocktail_shaker_sort)

Only slightly more complex than bubble sort, which is itself simple.
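A minimal C sketch of the two-direction idea, following the Wikipedia description (my own version, so take the details with a grain of salt):

```c
#include <assert.h>
#include <stdbool.h>

/* Cocktail shaker sort: alternate a forward bubble pass and a
   backward bubble pass, shrinking the unsorted window [lo, hi]
   from both ends until a full round makes no swaps. */
static void shaker_sort(int *x, int n)
{
    int lo = 0, hi = n - 1;
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (int i = lo; i < hi; i++)        /* forward pass */
            if (x[i] > x[i + 1]) {
                int t = x[i]; x[i] = x[i + 1]; x[i + 1] = t;
                swapped = true;
            }
        hi--;                                /* largest item is home */
        for (int i = hi; i > lo; i--)        /* backward pass */
            if (x[i - 1] > x[i]) {
                int t = x[i - 1]; x[i - 1] = x[i]; x[i] = t;
                swapped = true;
            }
        lo++;                                /* smallest item is home */
    }
}
```

The backward pass is what lets small elements ("turtles") move quickly toward the front, which plain bubble sort handles very slowly.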

~~~
kragen
Sorry, I meant Programming Pearls, 2nd Edition, Column 11 (Sorting), §11.1.

Shaker sort is potentially interesting for cases where you're adding a few
out-of-order items to an otherwise sorted set, but I think insertion sort
still beats it even in that case.

~~~
vram22
Thanks. Yes, I didn't mean that shaker sort is good for speed, just
interesting for the approach.

Also, bubble sort, though one of the slowest, has this property:

[https://en.wikipedia.org/wiki/Bubble_sort#Use](https://en.wikipedia.org/wiki/Bubble_sort#Use)

[ In computer graphics bubble sort is popular for its capability to detect a
very small error (like swap of just two elements) in almost-sorted arrays and
fix it with just linear complexity (2n). ]

~~~
kragen
Insertion sort has that capability too, without bubble sort's drawbacks.

~~~
vram22
Interesting, didn't know. Will look again at its Wikipedia page. Had looked
recently, may not have noticed that.

Apropos of insertion sort, I think I read years ago in Jon Bentley's book
Writing Efficient Programs (which I think I mentioned upthread) that insertion
sort is used by some people within quicksort, as the alternative algorithm
once the subarrays remaining to be sorted (recursively) are down to a small
size, at which point insertion sort has less overhead than quicksort, so it is
faster for those small chunks. (Or it could have been selection sort instead
of insertion sort.) In other words, a hybrid sort algorithm.

~~~
nwallin
It's actually slightly cleverer than that.

When quicksort recurses into a subarray smaller than the critical size (call
it k), it does not switch to a simpler O(n^2) sort: it simply returns, leaving
the subarray unsorted. The result is that the final array is mostly sorted,
with no element more than k positions from its true position. Once the array
is mostly sorted, a single run of insertion sort finishes the job, since
insertion sort is fast on mostly sorted arrays. Although in the general case
insertion sort is O(n^2), it is O(nk) on this mostly sorted array.

Selection sort, unlike insertion sort, performs no better on mostly sorted
arrays than on random ones, which is why insertion sort is used instead.
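The scheme described above can be sketched roughly like this — a toy version, with an arbitrary cutoff K = 16 and a plain middle-element pivot (real implementations pick pivots more carefully):

```c
#include <assert.h>

/* Hybrid sketch: quicksort that ignores subarrays below K elements,
   then one insertion-sort pass over the whole array. After the
   coarse quicksort, every element is within ~K slots of its final
   position, so the insertion pass runs in O(n*K). */
enum { K = 16 };   /* arbitrary cutoff for this sketch */

static void quicksort_coarse(int *x, int lo, int hi)
{
    if (hi - lo + 1 <= K)
        return;                       /* leave small runs unsorted */
    int pivot = x[(lo + hi) / 2];
    int i = lo, j = hi;
    while (i <= j) {                  /* Hoare-style partition */
        while (x[i] < pivot) i++;
        while (x[j] > pivot) j--;
        if (i <= j) {
            int t = x[i]; x[i] = x[j]; x[j] = t;
            i++; j--;
        }
    }
    quicksort_coarse(x, lo, j);
    quicksort_coarse(x, i, hi);
}

static void hybrid_sort(int *x, int n)
{
    quicksort_coarse(x, 0, n - 1);
    for (int i = 1; i < n; i++) {     /* final insertion-sort pass */
        int t = x[i], j;
        for (j = i; j > 0 && x[j - 1] > t; j--)
            x[j] = x[j - 1];
        x[j] = t;
    }
}
```

One caveat: deferring the insertion sort to a single final pass is cache-unfriendly on large arrays, so some implementations insertion-sort each small subarray immediately instead.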

~~~
vram22
Wow, that is clever. Not doing more than you need to - for the same effect -
kind of thing. Thanks for sharing.

I've been playing around with various simple algorithms for different tasks
over a period of time, and I've noticed that there is a lot of scope for
improving efficiency, even if by small amounts (sometimes by a bit more), by
making various tweaks to the logic, often including early-termination
techniques.

A well-known example of early termination: to find whether a number n is
prime, instead of checking whether it is divisible by each integer from 2 to
n - 1, you only need to check divisors up to the square root of n. (There are
more optimizations along those lines possible, of course.)
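The square-root cutoff can be sketched in a few lines of C — the trick is testing d*d <= n so no floating-point sqrt is needed:

```c
#include <assert.h>
#include <stdbool.h>

/* Trial division with early termination: any composite n must have
   a divisor <= sqrt(n), so stop once d*d exceeds n. */
static bool is_prime(long n)
{
    if (n < 2)
        return false;
    for (long d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}
```

That turns an O(n) loop into O(sqrt(n)); skipping even d > 2 and similar wheel tricks shave off further constant factors.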

I've found that there can be many quick wins in this area, often in various
algorithms that process arrays. For example, problems involving arrays on
Rosetta Code.

------
agf
I discovered sorting algorithms (including bubble sort) from SORTDEMO.BAS --
[https://www.youtube.com/watch?v=leNaS9eJWqo](https://www.youtube.com/watch?v=leNaS9eJWqo)
-- which came with QBasic.

------
wolfpwner
I remember it's the first sorting algorithm taught in my high school computer
science class.

~~~
mark-r
It probably was in mine too, for sure it's the one I remembered the most. I
also remember trying Shell sort but the only thing I remember about that one
is the name. Merge sort is the one I've gotten the most mileage out of.

------
JKCalhoun
I swear if <insert language> C's <insert relevant language library> stdlib
didn't come with a <insert Quicksort function and or method> qsort() I would
have shipped a lot more code with a hand-rolled bubble sort implementation of
my own lazy design....

------
clvtor
few sorting links with explanations and visualizations:

[https://imgur.com/gallery/RM3wl](https://imgur.com/gallery/RM3wl)

[https://panthema.net/2013/sound-of-sorting/](https://panthema.net/2013/sound-
of-sorting/)

[https://www.creativeapplications.net/processing/sorting-
visu...](https://www.creativeapplications.net/processing/sorting-
visualisation-sortification-and-sonification-of-an-algorithm/)

