
Crazily fast hashing with carry-less multiplications - robinhouston
http://lemire.me/blog/2015/10/26/crazily-fast-hashing-with-carry-less-multiplications/
======
rurban
Oh come on. Comparing CLHash against only CityHash and SipHash? CityHash was
replaced by FarmHash some years ago, and only Python uses SipHash with its
outrageously bad performance.

The fastest and mostly-secure 64-bit hash functions are Metro, Spooky, xxHash64
and Multilinear-HM by the same author. See "Strongly universal string hashing
is fast", Daniel Lemire and Owen Kaser, 2014.

CLHash doesn't even pass the avalanche test from smhasher:
[https://github.com/rurban/smhasher](https://github.com/rurban/smhasher). I'll
add it there for better comparison, along with the new leader, falkhash:
[https://github.com/gamozolabs/falkhash](https://github.com/gamozolabs/falkhash)

------
espes
Amusingly, this complete hack using the aesenc instruction claims 0.08 cycles
per byte on Haswell:
[https://github.com/gamozolabs/falkhash](https://github.com/gamozolabs/falkhash)

~~~
runholm
Interesting how horrible the result may be when you use TDD for hash algorithm
development.

------
aconz2
Lemire's work always pops up in connection with high performance or data
compression. The paper is interesting and I learned about a new field of math.
I especially admire the FOSS implementation of the theory, which is often
missing from many academic papers.

Makes me think there are a lot more improvements to be had with the much larger
and more specialized instruction sets we're seeing nowadays.

------
rw
How does it do under SMHasher (a library to test hash quality)?

~~~
twotwotwo
The paper
([http://arxiv.org/pdf/1503.03465v7.pdf](http://arxiv.org/pdf/1503.03465v7.pdf))
goes into that in section 6.1, which Lemire mentions in a comment on the post.
Summarizing as best I can:

CLHash, the hash the post is focused on, passes all except the avalanche test
for strings <8 bytes, and could be made to pass it by adding a bit-mixing step
to the end (that provably doesn't make random collisions more likely, since
it's an invertible transform on the hash).
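
The invertibility argument can be sketched in a few lines of Python. This is only an illustration: it uses the 64-bit finalizer from MurmurHash3 as a stand-in mixer, which is an assumption and not necessarily the step the paper has in mind, and the name `unfmix64` is made up here. Because every step can be undone, two different hash values can never be mixed into the same output, so the mixing adds no collisions.

```python
# Sketch only: MurmurHash3's fmix64 used as an example of an invertible
# bit-mixing step. Not taken from the CLHash paper or its code.
MASK64 = (1 << 64) - 1
C1 = 0xFF51AFD7ED558CCD  # odd, so it has an inverse modulo 2**64
C2 = 0xC4CEB9FE1A85EC53  # odd, so it has an inverse modulo 2**64


def fmix64(h):
    """Mix a 64-bit value; every step below is a bijection on 64-bit ints."""
    h ^= h >> 33
    h = (h * C1) & MASK64
    h ^= h >> 33
    h = (h * C2) & MASK64
    h ^= h >> 33
    return h


def unfmix64(h):
    """Hypothetical helper: undo fmix64 by reversing the steps in order."""
    # x ^= x >> 33 is its own inverse on 64 bits, because 2 * 33 >= 64.
    h ^= h >> 33
    # pow(c, -1, m) computes a modular inverse (Python 3.8+).
    h = (h * pow(C2, -1, 1 << 64)) & MASK64
    h ^= h >> 33
    h = (h * pow(C1, -1, 1 << 64)) & MASK64
    h ^= h >> 33
    return h


if __name__ == "__main__":
    # Round trip: every input is recovered, so no two inputs share an output.
    for x in (1, 2, 255, 1 << 63, MASK64):
        assert unfmix64(fmix64(x)) == x
    print("round trip ok")
```

Whether a given mixer also fixes the avalanche result for short strings is a separate question that only a test run can answer; the sketch shows just the "no extra collisions" part.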

------
aappleby
Interesting work, I'll have to read through the universality proofs more
later. :)

