Mergesort was improved by Tim Peters in 2002 and that became timsort. He invented a way to take advantage of pre-sorted intervals in arrays to speed up sorting. It's basically an additional layer over mergesort with a few other low-level tricks to minimize the amount memcpying.
Quicksort was improved by David Musser in 1997 when he developed introsort. He set a strict worst-case bound of O(n log n) on the algorithm, as well as improved the pivot selection strategy. And people are inventing new ways of pivot selection all the time. E.g. Andrei Alexandrescu has published a new method in 2017.
In 2016 Edelkamp and Weiß found a way to eliminate branch mispredictions during the partitioning phase in quicksort/introsort. This is a vast improvement. The same year Orson Peters adopted this technique and developed pattern-defeating quicksort. He also figured out multiple ways to take advantage of partially sorted arrays.
Sorting is a mostly "solved" problem in theory, but as new hardware emerges different aspects of implementations become more or less important (cache, memory, branch prediction) and then we figure out new tricks to take advantage of modern hardware. And finally, multicore became a thing fairly recently so there's a push to explore sorting in yet another direction...
Often a linear search of a "dumb" array can be the fastest way to accomplish something because it is very amenable to pre-fetching (it is obvious to the pre-fetcher what address will be needed next). Even a large array may fit entirely in L2 or L3. For small data structures arrays are almost always a win; in some cases even hashing is slower than a brute-force search of an array!
A good middle ground can be a binary tree with a bit less than an L1's worth of entries in an array stored at each node. The binary tree lets you skip around the array quickly while the CPU can zip through the elements at each node.
It is more important than ever to test your assumptions. Once you've done the Big-O analysis to eliminate exponential algorithms and other basic optimizations you need to analyze the actual on-chip performance, including cache behavior and branch prediction.
My favorite example is adding and ordered list of items into a a simple tree, all you've really done is created a linked list. Big-O doesn't know what your data looks like but you generally should.
Unless you know what you know your distributions and are generally proficient in probability theory (in 99% of the cases, neither can be relied on) the only relevant big-O metric is the worst case one
>Sorting is a mostly "solved" problem in theory, but as new hardware emerges different aspects of implementations become more or less important (cache, memory, branch prediction)
This makes me wonder what other hardware tricks might be used for other popular algorithms such as ones used in graphs. I'm sure shortest path is also one of those algorithms that have been "solved" in theory but have a huge amount of research, but personally, what would be more interesting to hear about is something that isn't quite as easy. Something like linear programming with integer constraints or even something like vehicle routing or scheduling.
To anyone studying those areas, is there anything you find particularly interesting?
Quick sort (unstable, n^2 worst case, in place, heapsort (unstable, n log n worst case, in place) and merge sort (stable, n log n worst case, not in place)
There are variants of each that trade one thing for another (in placeness for stability, constants for worst case), but these are the three efficient comparison sort archetypes.
Of these, quicksort and heap sort can do top-k which is often useful; and heapsort alone can do streaming top-k.