

Cuckoo hashing - adrianN
http://zufallstee.blogspot.com/2011/10/hashing.html

======
padenot
For those interested in concurrent hashing, this paper [1] shows a remarkable
implementation of a hashmap that performs way better than the one presented
in the article.

[1] : <http://www.springerlink.com/content/u710121187m65436/>

~~~
ErikCorry
Also described here: <http://en.wikipedia.org/wiki/Hopscotch_hashing>

------
yread
> But we’ve shown that these types of subgraphs don’t appear with high
> probability! Hence the greedy routine works for hash tables with an
> occupancy of less than 0.5.

I haven't completely understood the probability estimation, but doesn't the
insertion algorithm have an unbounded worst-case running time (albeit not with
"high probability")?

~~~
jaekwon
I guess so, if there is no loop detection.

~~~
ErikCorry
Yes, from the original paper: "As it may happen that this process loops, the
number of iterations is bounded by a value, MaxLoop, to be specified below. If
this number of iterations is reached, everything is rehashed with new hash
functions, and we try once again to accommodate the nestless key."
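
To make the quoted passage concrete, here is a minimal Python sketch of insertion with a bounded displacement loop and a rehash fallback. It is not the paper's exact scheme: the class name `CuckooTable`, the salted use of Python's built-in `hash`, and the choice to double the capacity on rehash are all assumptions made for illustration.

```python
import random

class CuckooTable:
    """Two-table cuckoo hash set (illustrative sketch; keys must not be None)."""

    def __init__(self, capacity=16, max_loop=32, seed=0):
        self.capacity = capacity      # slots per table
        self.max_loop = max_loop      # bound on displacement steps (MaxLoop)
        self.rng = random.Random(seed)
        self.tables = [[None] * capacity, [None] * capacity]
        self._new_hash_functions()

    def _new_hash_functions(self):
        # Assumption: a "new hash function" is modelled by a fresh random salt.
        self.salts = [self.rng.getrandbits(64), self.rng.getrandbits(64)]

    def _slot(self, which, key):
        return hash((self.salts[which], key)) % self.capacity

    def contains(self, key):
        return any(self.tables[i][self._slot(i, key)] == key for i in (0, 1))

    def insert(self, key):
        if self.contains(key):
            return
        which = 0
        for _ in range(self.max_loop):
            pos = self._slot(which, key)
            evicted = self.tables[which][pos]
            self.tables[which][pos] = key
            if evicted is None:
                return                # found an empty nest, done
            key = evicted             # the evicted key is now "nestless"
            which = 1 - which         # it must go to its slot in the other table
        # MaxLoop reached: rehash everything, then place the nestless key.
        self._rehash(pending=key)

    def _rehash(self, pending):
        keys = [k for t in self.tables for k in t if k is not None] + [pending]
        self.capacity *= 2            # assumption: also grow, not only re-salt
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        self._new_hash_functions()
        for k in keys:
            self.insert(k)


table = CuckooTable(capacity=4, max_loop=8, seed=1)
for k in range(200):
    table.insert(k)
```

Starting from a deliberately tiny table forces several rehashes, so the bounded loop and the fallback path both get exercised. Without the `max_loop` bound, a cycle of evictions would never terminate.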

------
thyrsus
If the original hash function is f(x), are there reasons the "second" hash
function shouldn't be (f(x)+1)mod n (n being the number of hash slots)? I'm
sure that some distribution of x would generate pathological behavior, but my
question is whether it breaks the reasoning about the probabilities.

I'm trying to convince myself that a "second" hash function might be a
reasonable cost even for modest n (e.g., when, as in perl, you use hashes to
hold the attributes of an object, and the number of attributes will rarely
exceed 50).

~~~
andylei
You want two items that have the same result from the first hash function to
have different results from the second hash function, with relatively high
probability.

If f1(x0) = f1(x1), you don't want f2(x0) = f2(x1) to happen very much.
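
A small Python experiment can illustrate the point. With f2(x) = (f1(x)+1) mod n, every pair that collides under f1 also collides under f2, so three keys with the same f1 value compete for the same two slots. The sketch below counts such pairs; the BLAKE2b-based hash functions, the table size of 64, and the 300 integer keys are stand-ins chosen for the example, not anything from the article.

```python
import hashlib
from itertools import combinations

N_SLOTS = 64  # hypothetical table size

def _blake_slot(tag, x):
    """Stand-in hash: BLAKE2b of a tagged key, reduced mod N_SLOTS."""
    digest = hashlib.blake2b(f"{tag}:{x}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % N_SLOTS

def f1(x):
    return _blake_slot("a", x)

def f2_shifted(x):
    # The proposed second function: (f(x) + 1) mod n.
    return (f1(x) + 1) % N_SLOTS

def f2_independent(x):
    # A second function that does not depend on f1's output.
    return _blake_slot("b", x)

def count_pairs(keys, same):
    """Count unordered key pairs for which same(a, b) holds."""
    return sum(1 for a, b in combinations(keys, 2) if same(a, b))

keys = range(300)
first_count = count_pairs(keys, lambda a, b: f1(a) == f1(b))
shifted_count = count_pairs(
    keys, lambda a, b: f1(a) == f1(b) and f2_shifted(a) == f2_shifted(b))
indep_count = count_pairs(
    keys, lambda a, b: f1(a) == f1(b) and f2_independent(a) == f2_independent(b))

print("pairs colliding under f1:           ", first_count)
print("...and also under (f1 + 1) mod n:   ", shifted_count)
print("...and also under an independent f2:", indep_count)
```

The first two counts are always equal, because the shifted function carries no information beyond f1. With an independent second function, only roughly one in n of the f1 collisions should survive.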

------
xtacy
I am wondering if the benefits of Cuckoo hashing are due to the power of 2
random choices:
[http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20...](http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf)

~~~
adrianN
Yes, some of it. You get a very low collision rate due to that.
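
For anyone who wants to see the two-choices effect in isolation, here is a rough balls-into-bins simulation in Python. It is only a sketch of the general principle, not of cuckoo hashing itself: items never move after placement, and the function name `max_load`, the bin count, and the seed are arbitrary choices for the example.

```python
import random

def max_load(n_bins, n_balls, choices, rng):
    """Throw balls into bins; each ball goes to the emptiest of `choices` random bins."""
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(choices)]
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return max(loads)

rng = random.Random(42)
n = 10_000
one_choice = max_load(n, n, 1, rng)
two_choices = max_load(n, n, 2, rng)
print("max load with one choice: ", one_choice)
print("max load with two choices:", two_choices)
```

The fullest bin should be noticeably less loaded with two choices than with one. Cuckoo hashing additionally relocates items that are already placed, which this simulation leaves out.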

