
A neural algorithm for similarity search (2017) - bra-ket
https://science.sciencemag.org/content/358/6364/793
======
thomasahle
For state-of-the-art practical similarity search, the benchmarks at
[http://www.itu.dk/people/pagh/SSS/ann-benchmarks/](http://www.itu.dk/people/pagh/SSS/ann-benchmarks/)
are kept up to date with all the methods and libraries that are actually
available.

I don't think any of the neural methods have shown themselves competitive yet.

~~~
erikbern
I'm one of the authors – there are actually slightly newer benchmarks at
[http://ann-benchmarks.com/](http://ann-benchmarks.com/)

Reminds me it's probably time to re-run the benchmarks and publish newer
numbers

~~~
thomasahle
Great! Reminds me: Is there any way each algorithm can keep a fixed
colour/pattern across the graphs? For example: Annoy is grey on the first
plot, but green in the second.

In either case, thanks to you, Martin, and Alec for the great work!

------
Mumps
I implemented this algorithm as part of an assignment for a research course
(with an extension of the loss function). What the paper doesn't tell you is
that it is extremely slow, and the memory pressure is insane :/ Really neat
paper to go through, though! And not very complicated.

------
moultano
If you're new to the idea of locality sensitive hashes and want to learn about
them, or would like to use one on on massive sparse data and a key-value
store, I wrote up a post that walks through all the families of minhashing
algorithms starting from the beginning (and culminates in our work extending
them to probability distributions.)
[https://moultano.wordpress.com/2018/11/08/minhashing-3kbzhsx...](https://moultano.wordpress.com/2018/11/08/minhashing-3kbzhsxyg4467-6/)

This is my favorite algorithm. It's stupidly simple, just a handful of lines
of code, and it makes intractable problems tractable.
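
For a sense of the "handful of lines": here's a minimal classic MinHash
sketch in Python (my own illustration of the textbook algorithm, not the code
from the post):

```python
import random

def minhash_signature(items, num_hashes=64, seed=0):
    """Classic MinHash: for each of num_hashes random hash functions
    h(x) = (a*x + b) mod p, keep the minimum value over the set's items.
    Two sets' signatures agree in a given slot with probability equal
    to their Jaccard similarity."""
    rng = random.Random(seed)
    p = (1 << 61) - 1  # a large Mersenne prime
    params = [(rng.randrange(1, p), rng.randrange(p)) for _ in range(num_hashes)]
    return [min((a * hash(x) + b) % p for x in items) for a, b in params]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of agreeing slots estimates the Jaccard similarity."""
    return sum(s == t for s, t in zip(sig_a, sig_b)) / len(sig_a)

a = {"the", "quick", "brown", "fox"}
b = {"the", "quick", "red", "fox"}
# True Jaccard(a, b) = 3/5 = 0.6; the estimate should be close.
print(estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

More hash functions tighten the estimate; the signatures themselves are what
you'd store in the key-value store.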

------
p1esk
If you liked this paper, definitely check out Numenta’s research:
[https://numenta.com/](https://numenta.com/)

~~~
misterman0
What an awful user experience that site was. But I like their view on AI
(first understand how the brain works, and then...). I used to be a med
student. I am so fond of their take that I might even apply for a job.

------
lootsauce
I was fascinated with this paper when it came out and reached out to the
author to get access to the source. I subsequently re-implemented the Python
in JS, for fun and to learn how it works. This is the best post I have found
that covers in detail how the algorithm works.

[https://medium.com/@jaiyamsharma/efficient-nearest-neighbors-inspired-by-the-fruit-fly-brain-6ef8fed416ee](https://medium.com/@jaiyamsharma/efficient-nearest-neighbors-inspired-by-the-fruit-fly-brain-6ef8fed416ee)

The really interesting thing about this research, imho, is that the algorithm
is derived from observed neural activity, as opposed to most ANNs, which are
merely inspired by neural structure.

From Saket Navlakha's page
[https://snl.salk.edu/~navlakha/](https://snl.salk.edu/~navlakha/)

"We work at the interface of theoretical computer science, machine learning,
and systems biology. We primarily study "algorithms in nature", i.e., how
collections of molecules, cells, and organisms process information and solve
computational problems."

The whole range of neural nets feels like it came out of a process of random
recipes thrown at the wall, where we all celebrate the things that stick. It
does not feel like this approach is going to lead to AGI. I am far more
interested in the "algorithms in nature" kind of research, where we will
eventually (given the right tools) find some really interesting new advances
in neural-net-based algorithms, and potentially completely new architectures.

~~~
lootsauce
My impl, if anyone cares:
[https://github.com/andrewluetgers/flylsh](https://github.com/andrewluetgers/flylsh)

------
sandeepeecs
We have an implementation of this type of algorithm at
[https://alpes.ai](https://alpes.ai); it powers our API and can be an
alternative to deep neural network solutions.

------
orasis
Is there a general way to convert bit strings that are similar in Hamming
distance into a scalar similarity value?

Is that what they’re talking about with having a randomly connected network
that sums its inputs and keeps the top 5%?
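
If so, here's a minimal Python sketch of what I understand that to mean: a
sparse random projection up to many more dimensions, then winner-take-all
keeping the top ~5%. The sizes, the ~10% connectivity, and the overlap score
are illustrative assumptions on my part, and I'm omitting any input
normalization the paper does:

```python
import numpy as np

def make_projection(d, num_kcs=2000, conn_per_kc=None, seed=0):
    """Sparse binary random projection: each of num_kcs output units
    ("Kenyon cells") sums a small random subset of the d inputs."""
    rng = np.random.default_rng(seed)
    if conn_per_kc is None:
        conn_per_kc = max(1, d // 10)  # each unit samples ~10% of inputs
    proj = np.zeros((num_kcs, d))
    for row in proj:
        row[rng.choice(d, size=conn_per_kc, replace=False)] = 1.0
    return proj

def fly_hash(x, proj, top_frac=0.05):
    """Project up, then winner-take-all: keep only the top ~5% of
    activations as a sparse binary tag."""
    activations = proj @ x
    k = max(1, int(top_frac * len(activations)))
    tag = np.zeros(len(activations), dtype=bool)
    tag[np.argpartition(activations, -k)[-k:]] = True
    return tag

def overlap(tag_a, tag_b):
    """Scalar similarity: number of shared active bits. Since both tags
    have the same number of ones, this is an affine transform of
    1 minus the normalized Hamming distance."""
    return int(np.count_nonzero(tag_a & tag_b))

rng = np.random.default_rng(1)
proj = make_projection(d=50)       # one fixed projection shared by all items
x = rng.random(50)
y = x + 0.01 * rng.random(50)      # a near neighbor of x
z = rng.random(50)                 # an unrelated point
print(overlap(fly_hash(x, proj), fly_hash(y, proj)))  # high overlap
print(overlap(fly_hash(x, proj), fly_hash(z, proj)))  # much lower overlap
```

The overlap count (or equivalently the Hamming distance between tags) would
be the scalar value I'm asking about.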

------
orasis
Can someone explain why sparsification/increasing the dimensionality after
random projection is helpful?

