Wolfsort: An ultra-fast hybrid radix sort algorithm (github.com/scandum)
102 points by scandum on July 25, 2020 | 27 comments



> Historically software was always low on memory, and subsequently sorting algorithms were developed that bent over backwards in order to use as little additional memory as possible. Nowadays most modern systems have several gigabytes of memory that are not used and are just sitting idle.

True, but on these systems, most of the time, users don't care about performance. And when they do, the dataset is usually not large enough for this kind of algorithm to shine anyway.


Also, memory is way slower than the CPU cache(s). So operating within a small subset of the available memory is still beneficial if you need the top performance.


There is often only a narrow range where (data + overhead) is larger than the cache, but (data + 0) is smaller than the cache.

Cached reads/writes aren't free either, so if the overhead drastically reduces those, you may come out on top 100% of the time.


Who doesn't care about performance? If your browser took 1 minute to start up you'd uninstall it.


> Wolfsort requires 8 times the array size in swap memory for the O(n) partitioning process.

> Wolfsort is primarily a proof of concept, currently it only properly supports unsigned 32 bit integers.

A lot of sorts can beat a generic sort on an array of numbers by gathering statistics, paying better attention to locality, implementation details like use of SIMD or some combination of those three.
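
For illustration, here is a textbook LSD radix sort for unsigned 32-bit integers (a generic sketch, not Wolfsort's actual code): four linear, mostly sequential passes, paid for with an n-sized buffer.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Textbook LSD radix sort for uint32: four stable counting passes,
       one per byte, using a swap buffer the same size as the input. */
    void radix_sort_u32(uint32_t *a, size_t n)
    {
        uint32_t *buf = malloc(n * sizeof *buf);
        if (!buf) return;

        for (int shift = 0; shift < 32; shift += 8) {
            size_t count[256] = {0};

            for (size_t i = 0; i < n; i++)        /* histogram this byte */
                count[(a[i] >> shift) & 0xFF]++;

            size_t sum = 0;                       /* prefix sums -> offsets */
            for (int b = 0; b < 256; b++) {
                size_t c = count[b];
                count[b] = sum;
                sum += c;
            }

            for (size_t i = 0; i < n; i++)        /* stable scatter pass */
                buf[count[(a[i] >> shift) & 0xFF]++] = a[i];

            memcpy(a, buf, n * sizeof *a);
        }
        free(buf);
    }

No comparisons at all, which is why specialized integer sorts can leave a generic comparison sort behind.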


Indeed. I wonder if someone couldn't write an ensemble of sorting algorithms where the algorithm gets chosen by some AI (not to hand-wave).


See Hybrid Sorting algorithms. Here's Introsort as an example:

https://en.wikipedia.org/wiki/Introsort


All of these algos have the characteristic that a human is statically deciding on the algo/thresholds. I think there is an opportunity for some AI here, that's all I'm saying.


Even bubble sort would return in the time it would take to send the array to the GPU for sorting.


That's a mistaken understanding of how GPUs are used in AI.

GPUs can be used for training, but there is no requirement for the model to be executed on a GPU. For example, a network can simply be created for incrementing a counter, which could be what a "choose the best sort" network ends up doing.


I assume the idea is to do some profiling and throw in some machine learning to select the perfect mix at compile time.


This is doable, but the risk is that looking at the numbers might take longer than just sorting them, even with a suboptimal algorithm.

I guess someone will just have to try this.
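
As a toy illustration of how cheap the peeking could be kept (entirely hypothetical heuristic; the thresholds here are arbitrary placeholders, and a real system would have to learn them from profiling):

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical dispatcher: inspect a small sample (constant work)
       and pick a strategy. The 4096 cutoff is a made-up placeholder. */
    enum strategy { USE_COMPARISON_SORT, USE_RADIX_SORT };

    enum strategy pick_strategy(const uint32_t *a, size_t n)
    {
        if (n < 4096)                    /* too small for setup costs */
            return USE_COMPARISON_SORT;

        uint32_t min = a[0], max = a[0];
        for (size_t i = 0; i < n; i += n / 64 + 1) {  /* ~64 samples */
            if (a[i] < min) min = a[i];
            if (a[i] > max) max = a[i];
        }
        /* A narrow sampled range hints the data suits bucketing. */
        return (max - min < n * 4) ? USE_RADIX_SORT : USE_COMPARISON_SORT;
    }

The inspection is O(1) here, so at least the "looking costs more than sorting" failure mode is bounded.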


Here is what a human did:

It begins with quicksort, switches to heapsort when the recursion depth exceeds a level based on (the logarithm of) the number of elements being sorted, and switches to insertion sort when the number of elements falls below some threshold.

I think machines can do this too.
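
For reference, a minimal C sketch of that human-made dispatch (modeled on introsort's published description, not on any particular library's implementation):

    #include <math.h>
    #include <stddef.h>

    #define INSERTION_THRESHOLD 16        /* typical small-array cutoff */

    static void insertion_sort(int *a, size_t n)
    {
        for (size_t i = 1; i < n; i++) {
            int v = a[i];
            size_t j = i;
            while (j > 0 && a[j - 1] > v) { a[j] = a[j - 1]; j--; }
            a[j] = v;
        }
    }

    static void sift_down(int *a, size_t i, size_t n)
    {
        for (;;) {
            size_t c = 2 * i + 1;
            if (c >= n) return;
            if (c + 1 < n && a[c + 1] > a[c]) c++;
            if (a[i] >= a[c]) return;
            int t = a[i]; a[i] = a[c]; a[c] = t;
            i = c;
        }
    }

    static void heap_sort(int *a, size_t n)
    {
        for (size_t i = n / 2; i-- > 0; ) sift_down(a, i, n);
        while (n > 1) {
            int t = a[0]; a[0] = a[--n]; a[n] = t;
            sift_down(a, 0, n);
        }
    }

    static size_t partition(int *a, size_t n)  /* Lomuto, last-element pivot */
    {
        int pivot = a[n - 1];
        size_t i = 0;
        for (size_t j = 0; j + 1 < n; j++)
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        int t = a[i]; a[i] = a[n - 1]; a[n - 1] = t;
        return i;
    }

    /* The dispatch: quicksort by default, heapsort when recursion gets
       too deep, insertion sort on tiny partitions. */
    static void introsort_rec(int *a, size_t n, int depth)
    {
        if (n <= INSERTION_THRESHOLD) { insertion_sort(a, n); return; }
        if (depth == 0)               { heap_sort(a, n);      return; }
        size_t p = partition(a, n);
        introsort_rec(a, p, depth - 1);
        introsort_rec(a + p + 1, n - p - 1, depth - 1);
    }

    void introsort(int *a, size_t n)
    {
        if (n > 1) introsort_rec(a, n, 2 * (int)log2((double)n));
    }

Both thresholds (the 16 and the 2*log2(n) depth limit) are exactly the kind of statically chosen constants the parent is talking about.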


If you need to sort big datasets, and very often, then I imagine profiling with your load and optimizing automagically every night could bring benefits.


A bit OT: I very much appreciate well-written C code for its simplicity and "rawness", but how come many C codebases seem to place no importance on code comments?

This repo contains a few thousand lines of code without so much as a single comment. In my experience that tends to happen much less in other languages. And it's not like this C code is obvious.


C has an interesting culture of terse variable and function names, and limited comments. It comes from an entirely different mindset and time period in software. It predates most "modern" concepts of software re-usability that eventually turned into enterprise OOP practices and were crystallized in Java and C#.

It's not great anymore. The software world has moved on to understanding how important these things are (and even often takes things too far), but C seems a bit stuck in its ways.


The value of comments and references depends on the software context. If inexperienced people are expected to be editing or reading the code, they're more valuable. However, they are expensive to keep in sync with changes to the implementation, and their value has to be balanced against this cost.

Java and C# did not replace C. They're typically used for different tasks.

Objectively speaking, do we have evidence these enterprise OOP practices lead to better software?


Comments increase compile time!


Not to mention file sizes!


Typing time as well!


It seems to me that it's using a strategy similar to SpreadSort, or am I wrong?

https://en.wikipedia.org/wiki/Spreadsort


Similar strategy. Wolfsort is stable, however, while SpreadSort does not appear to be.

I wonder if wolfsort is stable and faster because it utilizes quadsort?

https://github.com/scandum/quadsort


It does, according to Wolfsort's readme.
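
To make the stability distinction concrete (a generic illustration, unrelated to either codebase): a stable sort keeps records with equal keys in their original input order, which matters as soon as the key is only part of the record.

    #include <stdio.h>
    #include <stddef.h>

    /* A record whose key can repeat; 'id' records original input order. */
    struct rec { unsigned key; unsigned id; };

    /* Insertion sort is stable: equal keys never move past each other. */
    static void stable_sort(struct rec *a, size_t n)
    {
        for (size_t i = 1; i < n; i++) {
            struct rec v = a[i];
            size_t j = i;
            while (j > 0 && a[j - 1].key > v.key) { a[j] = a[j - 1]; j--; }
            a[j] = v;
        }
    }

    int main(void)
    {
        struct rec r[] = { {2,0}, {1,1}, {2,2}, {1,3} };
        stable_sort(r, 4);
        /* Prints 1/1 1/3 2/0 2/2: equal keys keep their input order. */
        for (int i = 0; i < 4; i++) printf("%u/%u ", r[i].key, r[i].id);
        putchar('\n');
        return 0;
    }

An unstable sort is free to emit 1/3 before 1/1, which is invisible when sorting bare integers but not when sorting records by a key.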


> Wolfsort is primarily a proof of concept, currently it only properly supports unsigned 32 bit integers.

Which makes it pretty useless outside of theoretical exercises (which is not to say that theoretical exercises are useless). What would be a practical use case for sorting integers? Well, the integers can be keys to some kind of records, but if your records remain unsorted, accessing them is very expensive even if you have the sorted list of keys available.

Edit: Calculating the median is a practical case. Not sure how often that is needed in performance-critical contexts.
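
One common workaround for the keys-to-records case (a generic indirection trick, not something from Wolfsort's readme, and it requires a sort that handles 64-bit values, which this proof of concept doesn't yet): pack the 32-bit key and the record's index into one uint64, sort the packed values, and read the permutation back out of the low bits.

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical glue: sort records indirectly by packing (key, index)
       into one uint64, so a pure integer sort yields a permutation.
       sort_u64 stands in for any 64-bit-capable integer sort. */
    void sort_records_by_key(const uint32_t *keys, size_t n,
                             uint32_t *order,       /* out: permutation */
                             void (*sort_u64)(uint64_t *, size_t))
    {
        uint64_t *packed = malloc(n * sizeof *packed);
        if (!packed) return;

        for (size_t i = 0; i < n; i++)    /* key in high bits, index low */
            packed[i] = ((uint64_t)keys[i] << 32) | (uint32_t)i;

        sort_u64(packed, n);              /* plain integer sort */

        for (size_t i = 0; i < n; i++)    /* recover original positions */
            order[i] = (uint32_t)packed[i];

        free(packed);
    }

The index in the low bits also makes the ordering stable for free, since ties on the key are broken by original position. It doesn't fix the parent's real complaint, though: visiting the records via order[] is still random access into unsorted memory.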


All radix sorts only work for integers. This isn't unique to OP's algorithm.


This is true in only the strictest sense, as computers only deal with integers:

Any set of tuples/sequences can be lexicographically radix-sorted if its components can be; strings are a special case, where the components are the individual character codes.

Floating point numbers are bit patterns with the sign and exponent in the high-order bits. When normalized, they can be treated as integers for sorting.

All radix sort requires is a recursively partitionable domain, and it performs optimally when the partitioning scheme matches the data distribution.
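
The normalization mentioned above is a standard IEEE-754 bit trick (independent of Wolfsort): flip all the bits of negative floats and just the sign bit of non-negative ones, and the resulting unsigned integers order the same way as the floats.

    #include <stdint.h>
    #include <string.h>

    /* Map an IEEE-754 float to a uint32 whose unsigned order matches
       the float's numeric order (radix-sort preprocessing).
       Negative floats: flip every bit (their raw patterns sort in
       reverse). Non-negative floats: set the sign bit (moves them
       above all negatives). */
    uint32_t float_to_sortable(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);         /* bit-cast without UB */
        return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
    }

    /* Inverse transform, applied after sorting to recover the floats. */
    float sortable_to_float(uint32_t u)
    {
        u = (u & 0x80000000u) ? (u & 0x7FFFFFFFu) : ~u;
        float f;
        memcpy(&f, &u, sizeof f);
        return f;
    }

So a uint32-only sort like this one is, in principle, one cheap transform away from sorting floats too.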


Of course, if you want the median, you can find it in expected linear time with quickselect/k-select.
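
A minimal quickselect sketch (the textbook version; without pivot randomization or an introselect-style fallback the worst case is O(n^2)):

    #include <stddef.h>

    /* Textbook quickselect: returns the k-th smallest element,
       0-based. Assumes 0 < n and k < n. Reorders the array. */
    int quickselect(int *a, size_t n, size_t k)
    {
        size_t lo = 0, hi = n - 1;
        while (lo < hi) {
            int pivot = a[hi];             /* last element as pivot */
            size_t i = lo;
            for (size_t j = lo; j < hi; j++)   /* Lomuto partition */
                if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
            int t = a[i]; a[i] = a[hi]; a[hi] = t;

            if (k == i) return a[i];
            if (k < i)  hi = i - 1;        /* recurse into one side only */
            else        lo = i + 1;
        }
        return a[lo];
    }

For example, quickselect(a, n, n / 2) gives the (upper) median without ever fully sorting the array.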



