
An O(N) Sorting Algorithm: Machine Learning Sort - jonbaer
https://arxiv.org/abs/1805.04272v2
======
bvinc
The last sentence before the conclusion is

> We have to admit that, if there are too many numbers that fall into the tail
> intervals, the Machine Learning Sort might fail.

It's interesting to have a sorting algorithm with a worst case time complexity
of "might fail".

~~~
im3w1l
It's not great wording. They mean it may fail to run in O(n), not that it
crashes or gives an incorrect answer.

------
davesque
So doesn't this mean supervised learning sort? In other words, restricted
domain sort? In that case, I don't think O(n) should be surprising. Radix
sort, for example, is also O(n) for restricted input.
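The restricted-domain trick is easiest to see in counting sort, the
single-digit base case of radix sort. A minimal sketch, assuming the inputs
are integers in a known range [0, k):

    def counting_sort(values, k):
        """Sort integers known to lie in [0, k). Runs in O(n + k):
        one pass to tally, one pass to emit -- no comparisons needed."""
        counts = [0] * k
        for v in values:                 # O(n): count each value
            counts[v] += 1
        out = []
        for v, c in enumerate(counts):   # O(k): emit values in order
            out.extend([v] * c)
        return out

    print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], k=10))
    # [1, 1, 2, 3, 4, 5, 6, 9]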

------
nullbyte
This work looks promising, but it's important to note that the authors didn't
prove a worst-case time complexity.

This algorithm can get much, much slower than O(n).

------
im3w1l
tl;dr: Assume the input consists of real numbers drawn from some distribution.
Approximate that distribution with ML and use it to predict where every
element should go. Because the prediction is monotonic in the value, there are
no inversions, only possible collisions. Fix those up into a sorted array.

The authors never proved that this gives O(n) under imperfect estimation of
the distribution, but their numerical experiments look promising.
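
A minimal sketch of that pipeline, with an empirical CDF built from a random
sample standing in for the paper's neural network (ml_sort and its parameters
are illustrative, not the authors' implementation):

    import bisect
    import random

    def ml_sort(values, sample_size=1000):
        """Distribution-based sort: predict each element's rank from an
        estimated CDF, scatter into buckets, then fix up collisions.
        Roughly O(n) when the estimate is good; degrades when many
        elements collide in one bucket (the 'tail interval' failure)."""
        n = len(values)
        # 1. Estimate the distribution from a random sample.
        sample = sorted(random.sample(values, min(sample_size, n)))
        # 2. Predict a position for every element via the empirical CDF.
        #    A monotonic predictor cannot create inversions, only collisions.
        buckets = [[] for _ in range(n)]
        for v in values:
            cdf = bisect.bisect_right(sample, v) / len(sample)
            buckets[min(int(cdf * n), n - 1)].append(v)
        # 3. Fix up: sort each (hopefully tiny) bucket and concatenate.
        out = []
        for b in buckets:
            out.extend(sorted(b))
        return out

    data = [random.gauss(0, 1) for _ in range(100_000)]
    assert ml_sort(data) == sorted(data)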

------
jsnell
Earlier discussion:
[https://news.ycombinator.com/item?id=17064170](https://news.ycombinator.com/item?id=17064170)

------
jl2718
Invert distribution. Radix sort.

Where’s the ML?

~~~
jl2718
Sorry, this was a jerk comment. I was probably annoyed by the tortuous
exposition. Please eliminate the first 3 paragraphs.

Real question: "there is no lower bound for M like log(N)"

I would challenge this statement, because the number of neurons is analogous
to the number of Gaussian mixture components, which limits the resolution of
the radix buckets. For it to work, the divergence between the real and modeled
distributions must stay below a bound that decreases as the inverse of N, and
you might say that the achievable divergence only decreases as the inverse of
M (although I'm not sure). So for any distribution that cannot be modeled by a
Gaussian mixture, M must scale with N.
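
One rough way to probe that claim numerically (a sketch only; a GMM stands in
for the paper's network, and max_bucket_load plus the lognormal test case are
my own choices, not from the paper): hold the number of components M fixed,
grow N, and watch the heaviest predicted bucket. If it keeps growing with N,
the collision fix-up step stops being O(n).

    import numpy as np
    from scipy.stats import norm
    from sklearn.mixture import GaussianMixture

    def max_bucket_load(values, n_components=4):
        """Fit an M-component GMM, map each value through the mixture CDF
        to one of n buckets, and report the heaviest bucket's size."""
        n = len(values)
        gmm = GaussianMixture(n_components=n_components).fit(values.reshape(-1, 1))
        means = gmm.means_.ravel()
        stds = np.sqrt(gmm.covariances_).ravel()
        cdf = (gmm.weights_ * norm.cdf(values[:, None], means, stds)).sum(axis=1)
        buckets = np.minimum((cdf * n).astype(int), n - 1)
        return np.bincount(buckets, minlength=n).max()

    # Heavy-tailed data that a 4-component mixture models poorly.
    for n in (10_000, 100_000, 1_000_000):
        print(n, max_bucket_load(np.random.lognormal(size=n)))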

