
“Use bubble sort for insertion sets, because it's faster than std::stable_sort” - mmastrac
https://bugs.webkit.org/show_bug.cgi?id=150828
======
hyperpape
This touches on an interesting topic: how robust implementations of most
algorithms go well beyond what you're taught in an algorithms course. There
are tons of optimizations that matter when dealing with real world data that
don't matter for algorithmic complexity.

Two links on the subject:

Bentley and McIlroy on Engineering a Sort Function:
[http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engi...](http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf)

Tim Peters' Timsort:
[https://en.wikipedia.org/wiki/Timsort](https://en.wikipedia.org/wiki/Timsort)

[https://mail.python.org/pipermail/python-dev/2002-July/02683...](https://mail.python.org/pipermail/python-dev/2002-July/026837.html)

~~~
wsh91
An interesting topic, indeed. Check out this gem from Knuth, "The Dangers of
Computer Science Theory":
[https://books.google.com/books?id=QUgmsMm5LcAC&lpg=PA189&ots...](https://books.google.com/books?id=QUgmsMm5LcAC&lpg=PA189&ots=xmjDeT6VGz&dq=%22The%20dangers%20of%20computer-science%20theory%22%20knuth&pg=PA189#v=onepage&q=%22The%20dangers%20of%20computer-science%20theory%22%20knuth&f=false)

(Sadly, I can't find a free PDF. I've seen this in Selected Papers on the
Analysis of Algorithms--you can get a used copy from Amazon for a few
dollars.)

"Is there _any_ area (outside of numerical analysis) where mathematical theory
has actually helped computer programmers?"

Ironically enough, he mentions bubblesort's performance as better in theory
than practice for the most part. :P

~~~
hyperpape
Nice little speech, and his conclusion is much less inflammatory than your
quote suggests (it would be strange if Knuth was really against mathematical
theory).

Reading his examples, I feel as if time has been kind to mathematical theory.
_n_ has gotten a lot larger these days, and random access memory also
simplifies some analyses. That's not always true, and the gap he identifies is
still real, but maybe it's not always as important as it was.

~~~
wsh91
I wasn't trying to suggest he was inflammatory! I think you'll find the paper
as a whole is of that tone. :)

------
adrianN
I like that they put it in the WTF namespace.

Does this open WebKit up for a denial of service JavaScript fragment that
triggers O(n^2) behaviour in their compiler? Then again, if you want to use up
CPU resources with JavaScript a simple while loop will do...

------
omginternets
I'm not sure what I'm supposed to get from this page. Is this an allusion to
bubble-sort's O(n^2) worst-case time complexity?

~~~
bradleyjg
"In all seriousness, I just wanted a quick fix to undo the perf regression
caused by using std::stable_sort.

I filed a bug to fix this:
[https://bugs.webkit.org/show_bug.cgi?id=150843](https://bugs.webkit.org/show_bug.cgi?id=150843)"

From that link

"Bug 150843 - Consider something better than bubble sort for insertion sets
achristiansen suggested falling back on stable sort if we do too many passes.
ggaren suggested insertion sort.

There's also the possibility that we could make merge sort a lot faster, if we
didn't use system malloc as the temp buffer allocator."

Apparently a really slow merge sort implementation caused a major performance
slowdown. Since bubble sort is always taught as a bad sort it's kind of funny
that he is using it to fix a performance regression. Hence the link and
upvotes.
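
For reference, the insertion sort alternative mentioned in the bug could look roughly like the sketch below. This is not WebKit's code: the `Item` struct and the `insertionSortByKey` name are made up for illustration. Like bubble sort it is stable and runs in linear time on already-sorted input, but it is still O(n^2) in the worst case.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record type for illustration; not WebKit's actual insertion class.
struct Item {
    int key; // the value we sort by
    int tag; // carried along so stability can be observed
};

// Stable insertion sort by key: O(n) on sorted input, O(n^2) worst case.
void insertionSortByKey(std::vector<Item>& items)
{
    for (std::size_t i = 1; i < items.size(); ++i) {
        Item current = items[i];
        std::size_t j = i;
        // Shift strictly greater keys one slot right. Equal keys are not
        // moved past each other, which is what keeps the sort stable.
        while (j > 0 && items[j - 1].key > current.key) {
            items[j] = items[j - 1];
            --j;
        }
        items[j] = current;
    }
}
```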

~~~
venning
> _Since bubble sort is always taught as a bad sort..._

I was taught about bubble sort in three different situations. In each one,
bubble sort was _not_ taught as a "bad" sort. The instructor always focused on
the difference between worst-case performance and best-case performance. It
always forced me to learn how much context matters.

Bubble sort should not be used as a teaching tool for "bad algorithms" but as a
teaching tool for understanding your data and how much you really know about
it.
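
To make the best-case versus worst-case point concrete, here is a minimal sketch of bubble sort with an early exit that reports how many passes it needed. The function name `bubbleSortCountingPasses` is my own, purely for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort with early exit. Returns how many passes over the data were
// made, including the final pass that finds nothing left to swap.
std::size_t bubbleSortCountingPasses(std::vector<int>& values)
{
    std::size_t passes = 0;
    bool swapped = true;
    while (swapped) {
        swapped = false;
        ++passes;
        for (std::size_t i = 1; i < values.size(); ++i) {
            if (values[i - 1] > values[i]) {
                std::swap(values[i - 1], values[i]);
                swapped = true;
            }
        }
    }
    return passes;
}
```

Already-sorted input needs a single pass, while reversed input of n elements needs n passes, so what you know about your data decides which end of that range you land on.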

------
devit
It might indeed be the best algorithm for common cases of this specific
problem, but there needs to be a fallback when either the input is large or a
lot of swaps are being done since otherwise it's O(n^2).

Interesting that other people pointed this out, yet he apparently committed
the code anyway.
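
A rough sketch of the kind of fallback I mean, assuming a hypothetical `Entry` type and a caller-chosen `maxPasses` budget (neither comes from the WebKit patch): run bubble passes while they stay cheap, and hand off to `std::stable_sort` once the budget is spent.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type for illustration; not taken from WebKit.
struct Entry {
    int index;   // sort key
    int payload; // carried along so stability can be observed
};

// Runs at most maxPasses bubble passes, then falls back to std::stable_sort.
// Returns true if bubble sort alone finished the job, false on fallback.
bool sortWithFallback(std::vector<Entry>& entries, std::size_t maxPasses)
{
    for (std::size_t pass = 0; pass < maxPasses; ++pass) {
        bool swapped = false;
        for (std::size_t i = 1; i < entries.size(); ++i) {
            // Only strictly out-of-order neighbours are swapped, so equal
            // keys keep their relative order.
            if (entries[i - 1].index > entries[i].index) {
                std::swap(entries[i - 1], entries[i]);
                swapped = true;
            }
        }
        if (!swapped)
            return true; // a clean pass means the data is sorted
    }
    // Pass budget exhausted: bound the worst case at O(n log n) comparisons
    // (or O(n log^2 n) if stable_sort cannot get a temporary buffer).
    std::stable_sort(entries.begin(), entries.end(),
        [](const Entry& a, const Entry& b) { return a.index < b.index; });
    return false;
}
```

Because the partial bubble passes never reorder equal keys, the result is still stable when the fallback kicks in.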

~~~
Retric
Inserting a sorted list into another sorted list can be O(n).
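
As a sketch of what I mean, using plain `int` vectors rather than anything from WebKit (the `mergeSorted` helper name is made up): `std::merge` combines two sorted ranges in a single linear pass.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Merges two already-sorted vectors into a new sorted vector.
// std::merge does at most a.size() + b.size() - 1 comparisons.
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> result;
    result.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(result));
    return result;
}
```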

~~~
thomasahle
But here they want to insert an unsorted list into a sorted list, which must
take the same time as sorting.

------
usr12345
I've heard some prominent people argue against this approach.
[https://www.youtube.com/watch?v=k4RRi_ntQc8&t=35](https://www.youtube.com/watch?v=k4RRi_ntQc8&t=35)

