
Burstsort : the fastest known in-memory string sorting algorithm - fogus
http://goanna.cs.rmit.edu.au/~jz/fulltext/alenex03.pdf
======
alex_stoddard
This looks like a relatively old paper (2002) and so I think it is measuring
performance on strictly single processor machines. On the other hand it is
trie based which suggests that it could also be highly parallelizable on
modern multi-processor systems. (I am thinking clojure data structures here
and maybe the submitter was too... <http://fogus.me>).

Then again maybe datasets of the size tested (or much bigger) are now just
broken down in a map-reduce fashion over multiple machines, and then who cares
about the efficiencies of cache misses etc?

Can anyone comment on if burstsort or trie based sorting algorithms in general
have been adopted? (All my experience has been as a consumer of library
provided "sort" methods rather than ever looking at current state of the art
algorithms).

~~~
ynniv
Probably not by the web crowd, but I have heard of them used in big data
situations. I don't know whether cache-miss is still a primary factor, as
really large data-sets simply don't fit in memory without a trie. Any scalable
type-ahead search uses one - type something into Google's search box and
imagine the size of the search space. Additionally, the completions displayed
are sorted by frequency, so imagine keeping that up to date as people enter
more queries.

------
p_coster
A parallel version of burstsort is the topic of my honours project due in a
few weeks. It's WAY faster. :P

------
warfangle
As it uses tries, a supporting library (trie datastructure implementation) in
Scala, as released open-source by Twitter:

<http://github.com/pankajgupta/Trie>

------
jeff18
FYI, it looks like the title was changed to [scribd] but the file is still a
PDF.

~~~
chrisa
If you click on the [scribd] part of the link, you get the scribd version. If
you click on the other part of the title, you get the original pdf version.
More info recently here: <http://news.ycombinator.com/item?id=1124940>

~~~
jeff18
Whoa, crazy. It might be cool to make the scribd portion slightly smaller or
somehow distinguished from the rest of the title.

