
Timsort, the Python sorting algorithm - alexchamberlain
https://skerritt.blog/timsort-the-fastest-sorting-algorithm-youve-never-heard-of/
======
ignoramous
My favourite TimSort story is of ex-Sun employee Joshua Bloch, of _Effective
Java_ fame. Bloch was in the audience when Tim Peters presented his new
algorithm for sorting a list, and he was so blown away that he started porting
Tim's implementation right there, with the intent to commit it to the JDK
mainline [0], which he eventually did [1].

[0] Some of the core JDK developers are really on another level. The JDK code,
post 1.5, is a joy to read, though verbose. Compilers and languages seem to
attract a certain caliber of engineers, I think.

[1]
[https://bugs.openjdk.java.net/browse/JDK-6804124](https://bugs.openjdk.java.net/browse/JDK-6804124)

~~~
stefan_
Then, many years later, input was found that made the Java version crash:

[https://link.springer.com/chapter/10.1007/978-3-319-21690-4_...](https://link.springer.com/chapter/10.1007/978-3-319-21690-4_16)

~~~
kevingadd
I found a similar problem using a fairly battle-tested C# implementation of
timsort. Felt lucky that I had managed to reproduce it in a dev environment
instead of it being a mysterious crash for users.

~~~
lqet
Many years ago, I copied an implementation of the Cohen-Sutherland algorithm
(for line clipping) to C++ from pseudo code (I don't remember where I got this
pseudo code from, but other sources used equivalent pseudo code).

The code (it was backend server code for a web app) was heavily tested with
ab [0] and JMeter [1] against extreme load, and everything seemed to work fine.
Fast-forward 6 months, when we had a sudden peak in users (for a few days, we
went from around 500 visitors a day to around 500,000, which roughly meant > 5
million requests per day). Suddenly, the backend, which had run without
problems for half a year, crashed in production every 5 hours or so with a
segmentation fault. I could not for the life of me reproduce this. After some
panicking, I let the backend run under gdb in production against the ~50
requests per second we were still getting. After a few hours, the segfault
occurred again, and I figured out that in extremely rare edge cases, the
pseudo code I had copied to C++ divided by 0, which led to a chain of problems
afterwards, eventually resulting in said segfault. If I remember correctly,
the fix was trivial (a sketch of the offending step follows the links below).

[0]
[https://httpd.apache.org/docs/2.4/programs/ab.html](https://httpd.apache.org/docs/2.4/programs/ab.html)

[1] [https://jmeter.apache.org/](https://jmeter.apache.org/)
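
To make the failure mode concrete, here is a minimal Python sketch (with
hypothetical variable names; the point is the unguarded division in the
textbook edge-crossing step):

    def clip_x_at_horizontal_edge(x0, y0, x1, y1, y_edge):
        # Textbook Cohen-Sutherland step: where does the segment cross y = y_edge?
        #   x = x0 + (x1 - x0) * (y_edge - y0) / (y1 - y0)
        # For a degenerate or horizontal segment, y1 - y0 == 0. Python raises
        # ZeroDivisionError here; C++ doubles silently yield inf/nan, which can
        # poison later computations and end in a segfault much later.
        dy = y1 - y0
        if dy == 0:  # the trivial fix: guard the rare edge case
            return x0
        return x0 + (x1 - x0) * (y_edge - y0) / dy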

------
Animats
You can beat O(n log n). That limit is for sorts that use only a ">"
comparison. A distribution sort, where you distribute the keys over buckets,
can approach O(n).

The first software patent, for SyncSort, is for a sort that beats O(n log n).
The basic idea is to read records for a while, get some stats about the key
distribution, and set up the buckets to get a roughly equal fraction of the
observed keyspace. If blocks of records show up with very different stats,
action has to be taken to adjust the bucketing.
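
A minimal sketch of the bucketing idea in Python (not SyncSort itself, and
assuming roughly uniform keys; the patented scheme adapts bucket boundaries
to observed key statistics instead):

    import random

    def distribution_sort(keys, n_buckets=64):
        # O(n) bucketing pass plus small per-bucket sorts. If the keys are
        # spread evenly over the buckets, total work approaches O(n).
        lo, hi = min(keys), max(keys)
        if lo == hi:
            return list(keys)
        width = (hi - lo) / n_buckets
        buckets = [[] for _ in range(n_buckets)]
        for k in keys:
            i = min(int((k - lo) / width), n_buckets - 1)
            buckets[i].append(k)
        out = []
        for b in buckets:
            out.extend(sorted(b))  # each bucket is small on friendly input
        return out

    print(distribution_sort([random.random() for _ in range(10)]))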

~~~
lemagedurage
Big O notation is about the theoretical upper bound of an algorithm's running
time. Clever data-based tricks like putting items in buckets or gathering
statistics are unrelated to this notation, as they rely on a certain
consistency in the data. Maybe you should call it average run time on typical
data.

~~~
nwallin
Radix sort is worst case O(n) (which is what I'm assuming the parent commenter
is referring to). It isn't O(n) in common situations but O(n log n) in
pathological cases the way Timsort is.

The reason it "violates" the n log n lower bound of comparison sorts is that
it isn't a comparison sort. Sort of like how hash table lookups "violate" the
O(log n) average lower bound of binary tree lookups. It has different
performance bounds because it's a different problem.

~~~
joshuamorton
No, radix sort is worst case O(n * k). In many common cases, k ~= log(n). In
certain specific cases, k < log(n); specifically, for cases where you have
a very large n but a bounded number of values (say, you're sorting 10 billion
4-bit ints), k can be considered a constant. But that is by no means generally
true.

~~~
bloomer
In most cases k << n. For 64-bit integers with a byte-wise radix sort, k is 8,
which is less than log n whenever n is more than 256. So radix sort is
typically much faster than an O(n log n) sort if your data supports it. It
just isn't as widely used because it is not as general as a comparison-based
sort.
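
A minimal byte-wise LSD radix sort sketch in Python for unsigned 64-bit
integers: k = 8 stable bucket passes, one per byte, each O(n):

    def radix_sort_u64(xs):
        # Least-significant byte first; each pass is a stable bucket pass,
        # so the order established by earlier passes is preserved.
        for shift in range(0, 64, 8):
            buckets = [[] for _ in range(256)]
            for x in xs:
                buckets[(x >> shift) & 0xFF].append(x)
            xs = [x for b in buckets for x in b]
        return xs

    print(radix_sort_u64([2**40 + 3, 7, 2**40, 42]))
    # [7, 42, 1099511627776, 1099511627779]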

~~~
gigatexal
Jumping into this fascinating thread about sorting to ask what the double
less-than chevrons mean? I know what one means, but what do two of them mean?

~~~
donbindner
In this context, it means "much less." It's used that way sometimes in
mathematical discussions.

------
tsegratis
WikiSort should be faster still.

Original author:
[https://github.com/BonzaiThePenguin/WikiSort](https://github.com/BonzaiThePenguin/WikiSort)

Graph and copyage (by me):
[https://tse.gratis/aArray/#details](https://tse.gratis/aArray/#details)

Or grail sort:
[https://github.com/Mrrl/GrailSort/blob/master/README.md](https://github.com/Mrrl/GrailSort/blob/master/README.md)

Sorting is a deep rabbit hole.

~~~
tsegratis
Previous WikiSort discussion:
[https://news.ycombinator.com/item?id=7404223](https://news.ycombinator.com/item?id=7404223)

Can't find a direct comparison with TimSort, though.

------
Beldin
I heard of it from a talk about a bug in the implementation of TimSort in
several popular libraries [1]. That bug should by now be fixed in Java,
Android, and Python.

If you're using another language, you might want to verify that the bug is
fixed/not present in your library.

[1] [http://www.envisage-project.eu/proving-android-java-and-pyth...](http://www.envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/)

------
westurner
Here are the Python 3 docs for sorting [1], the in-place list.sort() [2], and
sorted() [3] (which returns a new sorted list of the references). And the
Timsort Wikipedia page [4]. A quick sketch of the list.sort()/sorted()
difference follows the links.

[1] [https://docs.python.org/3/howto/sorting.html#sort-stability-...](https://docs.python.org/3/howto/sorting.html#sort-stability-and-complex-sorts)

[2]
[https://docs.python.org/3/library/stdtypes.html#list.sort](https://docs.python.org/3/library/stdtypes.html#list.sort)

[3]
[https://docs.python.org/3/library/functions.html#sorted](https://docs.python.org/3/library/functions.html#sorted)

[4]
[https://en.wikipedia.org/wiki/Timsort](https://en.wikipedia.org/wiki/Timsort)
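
For example:

    a = [3, 1, 2]
    b = sorted(a)  # returns a new sorted list; a is unchanged
    a.sort()       # sorts a in place and returns None
    print(a, b)    # [1, 2, 3] [1, 2, 3]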

------
ken
Besides Python and Java, it's also the sorting algorithm used by Chrome,
Android, and Swift. At this point I think more than half the world's
programmers are using Timsort, whether they realize it or not.

~~~
leshow
And Rust: the Rust std lib uses a modified timsort/merge sort.

~~~
cbarrick
FWIW, Rust uses pattern-defeating quicksort for its unstable sort.

[1] [https://github.com/orlp/pdqsort](https://github.com/orlp/pdqsort)

[2] [https://doc.rust-lang.org/std/vec/struct.Vec.html#method.sor...](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.sort_unstable)

------
saagarjha
Note that Timsort uses O(n) extra space: sometimes this can be undesirable.

~~~
kbd
Note that this is true for any (edit: stable, n log n) merge sort.
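
A minimal sketch of where the O(n) comes from: merging two sorted runs needs a
scratch buffer. (Timsort is cleverer and copies only the smaller run, but the
asymptotic bound is the same.)

    def merge(left, right):
        # Stable merge: on ties, take from the left run first.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if right[j] < left[i]:
                out.append(right[j]); j += 1
            else:
                out.append(left[i]); i += 1
        out.extend(left[i:]); out.extend(right[j:])
        return out  # len(out) == len(left) + len(right): the O(n) space

    print(merge([1, 3, 5], [2, 3, 4]))  # [1, 2, 3, 3, 4, 5]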

~~~
billforsternz
Heapsort is a lovely, simple O(N log N) sort that sorts in place (so it
requires no extra space). (Explicitly stating something that is implied by an
existing response.)

Edit: Whoops, apparently heapsort is not "stable" (not sure what that means,
actually), sorry.
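
A minimal in-place heapsort sketch in Python, just to show the mechanics
(O(n log n) worst case, O(1) extra space, not stable):

    def heapsort(a):
        def sift_down(root, end):
            # Push a[root] down until the max-heap property holds.
            while 2 * root + 1 <= end:
                child = 2 * root + 1
                if child + 1 <= end and a[child] < a[child + 1]:
                    child += 1
                if a[root] < a[child]:
                    a[root], a[child] = a[child], a[root]
                    root = child
                else:
                    return

        n = len(a)
        for start in range(n // 2 - 1, -1, -1):  # build a max-heap
            sift_down(start, n - 1)
        for end in range(n - 1, 0, -1):  # pull the max to the back
            a[0], a[end] = a[end], a[0]
            sift_down(0, end - 1)

    xs = [5, 1, 4, 2, 3]
    heapsort(xs)
    print(xs)  # [1, 2, 3, 4, 5]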

~~~
cdirkx
A "stable" sorting algorithm preserves the relative order of elements with
"equal value". This doesn't really apply if you are only sorting simple
values, as there is no difference between a 7 and another 7, but does if you
are sorted more complicated objects by some key.

Example: sorting 2#a, 1#c, 2#b by only the first number.

An algorithm that produces 1#c, 2#b, 2#a is a correct sorting algorithm, but
not stable as it changes the order of 2#a and 2#b.
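
In Python terms (the built-in sort is Timsort, which is stable):

    records = [(2, "a"), (1, "c"), (2, "b")]
    # Sort by the number only; the stable sort keeps "a" before "b".
    print(sorted(records, key=lambda r: r[0]))
    # [(1, 'c'), (2, 'a'), (2, 'b')]
    # An unstable sort could also return [(1, 'c'), (2, 'b'), (2, 'a')].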

~~~
billforsternz
Thanks for the explanation, interesting.

------
pwinnski
I think most Python devs have heard of this, as well as avid followers of HN,
of course.

------
CogitoCogito
Here is a video I made a while back to better visualize the runs of timsort:

[https://www.youtube.com/watch?v=ZxLxf5xqqyE](https://www.youtube.com/watch?v=ZxLxf5xqqyE)

Might be of interest...

------
pitaj
Rust uses timsort as the stable sorting algorithm in its standard library.

[https://doc.rust-lang.org/std/primitive.slice.html#method.so...](https://doc.rust-lang.org/std/primitive.slice.html#method.sort)

------
slivanes
For an audio representation of TimSort:
[https://www.youtube.com/watch?v=xoR-1KwQh2k&t=274s](https://www.youtube.com/watch?v=xoR-1KwQh2k&t=274s)

~~~
war1025
I watched probably ten minutes of that video. Don't know that I learned
anything, but it was sort of interesting.

------
yagibear
Never heard of since it was last discussed on HN:
[https://news.ycombinator.com/item?id=17436591](https://news.ycombinator.com/item?id=17436591)

~~~
dang
And
[https://news.ycombinator.com/item?id=17883461](https://news.ycombinator.com/item?id=17883461)
after that.

Quite a few submissions for something no one has heard of:
[https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...](https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=Timsort&sort=byDate&type=story)

The other main discussions are 2011:
[https://news.ycombinator.com/item?id=3214527](https://news.ycombinator.com/item?id=3214527)

2009:
[https://news.ycombinator.com/item?id=752677](https://news.ycombinator.com/item?id=752677)

------
fghorow
Tim Peters is also the IEEE-754 guru for the Python codebase.

Truly a giant of the Python community.

<1/2 wink-ly yr's>

------
Congeec
There are cases where pattern-defeating sort beats timsort:

[https://github.com/EmuraDaisuke/SortingAlgorithm.HayateShiki](https://github.com/EmuraDaisuke/SortingAlgorithm.HayateShiki)

~~~
purplezooey
"external area" tho

------
timothycrosley
As an (unrelated to the Algorithm) Python developer, also named Tim, I've
never been able to hear the last of it. Still a great algorithm, though!

------
social_quotient
I always thought this sort visualizer was cool
[http://sorting.at/](http://sorting.at/)

------
james_s_tayler
Heard about it by reading Java stacktraces one day.

------
sclangdon
Isn't this the same idea as Introsort [1], which was created in 1997 (four
years earlier) and is the default sorting algorithm in C++'s standard library
(and the .NET Framework since 4.5)?

[1]
[https://en.wikipedia.org/wiki/Introsort](https://en.wikipedia.org/wiki/Introsort)

------
alexnewman
That's why I encode all my data in a format optimized for galloping mode. Just
kidding. That'd be sweet, though.

------
CJefferson
I also love pdqsort. It has many of the same advantages, and has the advantage
of being easier to implement (particularly if you already have a quicksort
lying around).

[https://github.com/orlp/pdqsort](https://github.com/orlp/pdqsort)

Basically, timsort is to mergesort as pdqsort is to quicksort.

------
longemen3000
Another sorting algorithm, taking advantage of SIMD instructions and cache
awareness: ChipSort.
[https://github.com/nlw0/ChipSort.jl](https://github.com/nlw0/ChipSort.jl)

------
justAlittleCom
I don't understand how it is so special. Every single real-world sort I have
seen is a hybrid sort, very often merge sort plus insertion sort or bubble
sort for small sub-arrays.

Can you enlighten me?

------
purplezooey
It's an awesome idea, but it seems to cross the line into "exploiting specific
properties of the input", even if just a little, which would impact its
usefulness in, say, a general-purpose library.

------
banachtarski
Thanks for editing what was originally an inane title "Timsort the sort you
haven't heard of" to describe something I (and probably most here) have heard
of many times over and over again.

------
person_of_color
How to be THIS good as a Software Engineer?

~~~
kragen
Practice, challenge yourself constantly, seek mentorship, humbly request
feedback and act on it, and devote your life to the craft, not your wallet or
your family. I think. Read Hamming.

~~~
purplezooey
Easier said than done

~~~
sanjayts
If it was easy, folks wouldn't have to ask how it could be done and everyone
would be THAT good. ;)

------
xiaodai
radixsort is better still

~~~
isatty
Only for numbers.

------
mharrison
Tim the enchanter!

------
boltzmannbrain
"never heard of"?? I would hope all Python devs at some point Google "What
algorithm is Python's built-in sort function?"...

[https://docs.python.org/2/howto/sorting.html#sort-stability-...](https://docs.python.org/2/howto/sorting.html#sort-stability-and-complex-sorts)

[https://stackoverflow.com/questions/10948920/what-algorithm-...](https://stackoverflow.com/questions/10948920/what-algorithm-does-pythons-sorted-use)

~~~
quickthrower2
Great, yet another no-true-Scotsman test of being a proper developer. There is
so much you should have read, googled, and written to be a "true" dev these
days. My theory is: if you are often learning stuff and producing good working
code, be happy. Not every developer needs to know the underlying sort
algorithms.

~~~
dkersten
The person never said that you had to google it to be a "proper developer",
just that they hoped that every python developer would have done so.

~~~
paulddraper
I hate to disappoint that hope, but I have never searched for Python's sorting
algorithm.

~~~
Ultimatt
What about the hash strategy used in dicts? I'm holding on to hope.

------
unnouinceput
I, for one, cannot wait for quantum computing to become the norm, like the
current kind is. Then the only algorithm everyone will use will be randomsort.
Got a list to sort? Allocate one qubit for each element, apply randomsort,
and boom, done in under a picosecond, regardless of list size. This is the
ultimate algorithm for parallelization; all others will take longer since
they depend on sequential input.

~~~
zitterbewegung
That’s not how it works . Grovers algorithm takes O(sqrt(n)) using a quantum
computer .

Values in a quantum computer are superpositions but when measured they will
return only one result. They don’t have an infinite amount of time and or
space .

~~~
unnouinceput
We are today with quantum computing where we were in the 19th century when Ada
was creating the first computer program. Sure, Grover's algorithm is good for
the current state of quantum computing, but when it becomes the norm, just as
silicon-based computing is today, the situation will be entirely different. By
that time Grover's algorithm will have taken its place in history but will not
be used in practice. From the wiki: "Perform the following 'Grover iteration'
r(N) times." Iteration? That means one stage of the computation waits for the
previous one to complete. That, to me, does not sound like a truly parallel
algorithm, hence randomsort still wins.

~~~
saagarjha
I suggest you read up on how superposition works in quantum computers: they're
not just computers with an infinite number of cores :/

~~~
unnouinceput
No, they are not. But 128 billion is a big enough number that if you had told
a 19th-century scientist working with punch-card machines that in the future a
computer with the equivalent of 128 billion punch cards would exist, he
would have replied, just like you, that there are not enough trees on Earth to
make 128 billion cards for a single machine, let alone for billions of them,
more common than horses were in his day. And yet here we are, with computers
that have 128 GB of RAM, like the one I am using to reply to you. So how about
you let me dream big instead, eh?

~~~
nullc
If you are going to come up with purely conjectural advantages with no basis
in known science, why bother attributing them to one thing (like quantum
computing) and not any other thing?

Like why are you not arguing that C++24 will make all algorithms O(1)? There
is as much reason to believe that as there is to believe that quantum
computing will do so.

Or maybe graphene memristors will form timelike loops and allow computers
that give you results before you ask the question? Maybe! Again, there is no
more reason to believe in purely conjectural magic from quantum computation
than in that.

Maybe the development of green energy such as solar panels will lead to the
development of nano-scale self assemblers, allow us to turn the entire moon
into ultra efficient computronium powered by sinking the moon's residual heat
of formation into deep space, making the asymptotic complexity of most
algorithms on many problems largely irrelevant. Maybe! Or maybe the insight
might actually come from a school kid that trips over a rock and gets a vision
after hitting their head. Better start putting pebbles out in front of
schools, because you never know! :)

Isn't getting a sqrt() speedup for everything and an exponential speedup for
a few things good enough for you?

If we really understand the idea of quantum computing so poorly that we've
massively underestimated it, isn't it even more likely that our
misunderstanding has made us massively overestimate it?

