
Damn Cool Algorithms, Part 1: BK-Trees - llambda
http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees
======
grandpa
Some implementations:

Lisp: <http://cliki.net/bk-tree>

Python: <http://hupp.org/adam/weblog/?p=103>

Ruby: <https://github.com/threedaymonk/bktree>

Haskell: <http://www.robdickerson.net/?p=346>

Java: <http://code.google.com/p/java-bk-tree/>

~~~
zszugyi
IMHO the explanation on the Lisp implementation page is much clearer then the
article.

~~~
omaranto
What an odd thing to say! The Lisp implementation page has no explanation that
I could find of either how to build the tree or how to search it. Or did you
mean you found the source code to be a clearer explanation?

Edit: The lisp page does have a nice picture of a tree. Did you mean that
looking at that picture you could figure out both the algorithm for building
the tree and for querying it and the proof that these are correct faster than
by reading Nick's blog post?

~~~
zszugyi
I meant that the picture explains the tree's structure much clearer than the
blog post, which is probably not surprising. Didn't really look at how to
build or query the tree.

------
wgd
This seems similar to the VP-Tree, which there was a post about a few days
ago, except that VP-Trees use binary partitioning because they're measuring
similarity in a continuous space, while these ones use discrete buckets
because similarity is measured discretely.

~~~
gujk
The VP article linked to this one as background material. That is where OP
found this article to repost for karma.

------
leeoniya
i especially like the part where it says. "The only copy of this online seems
to be in the ACM archive, which is subscription only."

~~~
kjhughes
To clarify, Nick Johnson is referring to the original paper on the topic
("Some approaches to best match file searching" by Burkhard and Keller in
1973). It's a common frustration that ACM archives are not freely available to
the public.

In the comments, "Sarah" notes that the paper can be found here:

[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.160...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.160.2510&rep=rep1&type=pdf)

BTW, Nick's Blog often has gems and is well worth a regular read.

------
arnoldoMuller
The SISAP metric spaces library comes with an implementation:
<http://sisap.org/Metric_Space_Library.html> You can also compare BK-trees
with other methods and you may encounter that their performance is not so
great, specially for high dimensional spaces.

