

Fast Nearest Neighbor Queries in Haskell - andrus
https://izbicki.me/blog/fast-nearest-neighbor-queries-in-haskell.html

======
ximeng
Some great posts on that site. [https://izbicki.me/blog/turning-an-ak-47-into-a-serving-ladl...](https://izbicki.me/blog/turning-an-ak-47-into-a-serving-ladle.html) is good from the non-Haskell perspective.

~~~
tdees40
Mike's one of the more fascinating people around. One of the few people ever
to receive a discharge from the Navy as a conscientious objector after he
became a pacifist.

------
dfbrown
Since you're using the auto-tuned index for FLANN, it's going to build a large
number of indexes to try to find the best-performing one, which can take a
_long_ time. Properly tuned, I would expect FLANN to be much faster (though
that's to be expected, since it doesn't do exact searches). Also, benchmarking
index build time together with query time doesn't make sense to me, since
depending on the application you may not care how long it takes to build the
index.

------
rifung
"I’d guess that it took about 10 times as much work to create my Haskell-based
cover tree than it would have taken to create a similar C-based
implementation. (I’m roughly as proficient in each language.)"

Interesting, since I would have imagined that implementing something in C would
take much longer than doing the same in something like Python or Java. On the
other hand, I suppose getting something to work is not the same as getting it
to run fast.

~~~
bbcbasic
He says "Because Haskell code is so high level, it requires aggressive
compiler optimizations to perform well."

I think you are basically right. It is easier to make something correct in
Haskell, and it is generally easier to make something reasonably fast in
Haskell too. But in the specific circumstances where you want maximum
performance and care about the assembly or intermediate code output, the
performance optimization becomes easier in a low-level language like C.

Having said that, the 10x effort is an investment if you get a reusable
library out of it, and it will pay off because the rest of your code is in
Haskell, not C!
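
To make the "care about the output" point concrete, here's a small sketch of my own (not from the article): a numeric loop in Haskell often needs explicit strictness annotations before GHC will unbox the accumulators.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict, tail-recursive mean. Without the bang patterns the accumulators
-- build up chains of thunks; with them, GHC can unbox s and n into raw
-- machine values, giving a tight inner loop close to the obvious C version.
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !n []       = s / fromIntegral n
    go !s !n (x : xs) = go (s + x) (n + 1) xs
```

Whether the unboxing actually happens depends on the optimization level and GHC version, which is exactly the kind of thing that's easier to control in C.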

~~~
sink
Great point. His concluding statement aligns with your observation:

"So then why did I use Haskell? To make cover trees 10 times easier for
programmers to use."

Default Haskell is certainly not geared toward numeric performance, even
though it can be with the right libraries.
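
For instance, a minimal sketch using the widely-used vector package (the function is my own illustration, not from the post):

```haskell
import qualified Data.Vector.Unboxed as U

-- Unboxed vectors store raw Doubles contiguously with no pointer chasing,
-- and stream fusion collapses the zipWith/sum pipeline into a single loop
-- with no intermediate vector allocated.
dotProduct :: U.Vector Double -> U.Vector Double -> Double
dotProduct xs ys = U.sum (U.zipWith (*) xs ys)
```

The same code written over plain boxed lists would allocate a cons cell and a thunk per element, which is where the "default Haskell" reputation comes from.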

~~~
codygman
You might be interested in this if you are interested in high performance
numerical Haskell:

[https://github.com/wellposed/numerical](https://github.com/wellposed/numerical)

Carter claims he can perform optimizations that aren't possible in other
languages, though I can't personally speak to them since I'm not well versed
in numerical code.

~~~
carterschonwald
It's not yet in shape for general use, nor for being advocated. There is some
tech there that's unique in terms of handling matrix computation on rich
formats nicely.

I'd prefer if we reserve advocacy for when i get it to the point where there's
public documentation, or at least examples :)

~~~
codygman
> I'd prefer if we reserve advocacy for when i get it to the point where
> there's public documentation, or at least examples :)

Very sorry, I'll keep quiet about it until you announce it is ready! :)

~~~
carterschonwald
thanks, i hope that happens relatively soon. i've been a bit buried with
getting life in order, though that's finally moving along nicely.

------
amelius
This algorithm operates on trees. But I've always wondered: how can one create
efficient algorithms operating on general graphs in a purely functional
language such as Haskell?

For example, how would one write a fast version of Dijkstra's shortest path
algorithm in Haskell?

~~~
tel
Using mutability.

It's not a sin, just something to avoid when not strictly necessary.

[http://jspha.com/posts/mutable_algorithms_in_immutable_langu...](http://jspha.com/posts/mutable_algorithms_in_immutable_languges_part_1/)

[http://jspha.com/posts/mutable_algorithms_in_immutable_langu...](http://jspha.com/posts/mutable_algorithms_in_immutable_languages_part_2/)

[http://jspha.com/posts/mutable_algorithms_in_immutable_langu...](http://jspha.com/posts/mutable_algorithms_in_immutable_languages_part_3/)
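
To make that concrete for the Dijkstra question above, here's a rough sketch of my own (not from those posts, and assuming nodes are numbered 0..n-1): the distance table is a mutable unboxed array updated in place inside the ST monad, while an immutable Data.Set of (distance, node) pairs stands in for a priority queue.

```haskell
import Control.Monad (foldM)
import Data.Array.IArray (Array, accumArray, (!))
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray)
import qualified Data.Set as S

-- Dijkstra over nodes 0..n-1 with directed, weighted edges (u, v, w).
-- The dist array is mutated in place inside runSTUArray, then frozen into
-- an ordinary immutable UArray, so callers never see the mutation.
dijkstra :: Int -> [(Int, Int, Int)] -> Int -> UArray Int Int
dijkstra n edges src = runSTUArray $ do
  let adj = accumArray (flip (:)) [] (0, n - 1)
              [(u, (v, w)) | (u, v, w) <- edges] :: Array Int [(Int, Int)]
  dist <- newArray (0, n - 1) maxBound
  writeArray dist src 0
  let go frontier = case S.minView frontier of
        Nothing -> return ()
        Just ((d, u), rest) -> do
          du <- readArray dist u
          if d > du
            then go rest                        -- stale queue entry, skip it
            else go =<< foldM (relax d) rest (adj ! u)
      relax d fr (v, w) = do
        dv <- readArray dist v
        if d + w < dv
          then do writeArray dist v (d + w)     -- in-place relaxation
                  return (S.insert (d + w, v) fr)
          else return fr
  go (S.singleton (0, src))
  return dist
```

The linked posts go further (e.g. hiding the mutation behind pure interfaces); this is just the minimal ST-monad version of the idea.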

------
airza
Not that I'm a top-tier Haskell expert, but I don't think that GHC does the
most amazing job of applying optimizations, and I definitely think its
performance model is really hard to reason about.

Still, it was a nice breakdown of a real, non-toy problem in Haskell!

~~~
codygman
I've heard quite the opposite. I've heard that GHC is state of the art in the
depth and breadth of optimizations it can perform. I'm struggling to remember
if the context was "of everything" or just within pure functional programming.

~~~
platz
GHC has great optimizations within the scope of individual modules (and seems
to strike the right balance between polymorphism, modularity, and separate
linking), and it can inline some things across module boundaries. But more
advanced optimizations become possible at the level of entire programs; I
gather there are tradeoffs there too.

[http://mlton.org/](http://mlton.org/)

[https://www.reddit.com/r/haskell/comments/2tpmbo/what_on_ear...](https://www.reddit.com/r/haskell/comments/2tpmbo/what_on_earth_happened_to_whole_program/)

[http://stackoverflow.com/questions/4720499/possible-optimiza...](http://stackoverflow.com/questions/4720499/possible-optimizations-in-haskell-that-are-not-yet-implemented-in-ghc)
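
For what it's worth, the main knob for the cross-module case is a pragma. A toy example of mine, not from those links:

```haskell
-- In a real project this function would live in its own module. INLINABLE
-- tells GHC to keep the full definition in the module's interface file, so
-- call sites in *other* modules can still inline it and specialize away the
-- Num dictionary; without it, cross-module callers may be stuck with an
-- opaque, dictionary-passing version.
{-# INLINABLE sumSquares #-}
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
```

So GHC can recover some whole-program-style wins, but only where library authors opt in per definition, which is part of the tradeoff being discussed.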

~~~
codygman
ML isn't that popular, though, right? Would it be fair to say GHC has the most
impressive optimizations of any mainstream language?

With that said, I certainly wonder why ML isn't more popular!

------
birdsbolt
I'm really interested to see how the author will turn cover trees into a
generic inference algorithm for a bunch of models in the HLearn library.

------
thomasahle
Does anyone know how cover tree performance grows with the dimension? I'm
guessing it's exponential in some way?

