
XXH3, world's fastest hash algorithm, has reached stable status - telendram
https://github.com/Cyan4973/xxHash/wiki/Performance-comparison
======
segfaultbuserr
Note to readers: This is a non-crypto hash algorithm. It's not meant to be
cryptographically secure, but collision is rare enough for databases and
similar applications, and is really fast. A real crypto hash like SipHash or
BLAKE is already considered lightning fast (1-3 GB/s) in crypto applications,
but XXH is faster than a RAM sequential read (30+ GB/s).

~~~
MR4D
So, does this mean I can replace CRC32 with XXH ?

~~~
ncmncm
Depends what you use CRC for. CRC is processed a byte at a time, but really
fast things nowadays do 16 or 32 bytes at a time.

------
feniv
If you’re interested in diving deeper, the SMhasher benchmark suite tests a
much more exhaustive list of hash functions (including wyhash) -
[https://github.com/rurban/smhasher](https://github.com/rurban/smhasher)

[https://www.strchr.com/hash_functions](https://www.strchr.com/hash_functions)
Is also a great write up of how some of the classic hashing algorithms perform
of various data sets.

[https://nullprogram.com/blog/2018/07/31/](https://nullprogram.com/blog/2018/07/31/)
is an interesting approach on generating hashing functions automatically.

~~~
rurban
Note that latest wyhash 1.0 is still not included into my benchmarks, it's in
the wyhash branch. My benchmarking machine is still busy with fuzzing
unfortunately.

As for the speed: I wouldn't claim XXH3 the world's fastest hash function at
all. It's very fast for bulk digests, of large buffers. Amongst other similar
hash functions. But as we know hashing mostly involves short strings, and
there code size matters most. Wordwise FNV1, crc-ni and wyhash would be the
fastest IMHO.

------
tpetry
So if you are confused like me how someone came up with an algorithm faster
than anything else currently available: This is from the mastermind behind the
lz4 and zstd compression.

------
Akababa
Noob question: how do they even measure performance faster than RAM sequential
read?

~~~
_b
By hashing data already in the processor's cache.

------
MuffinFlavored
faster than RAM sequential read...

i wonder if anything above sse2 would offer even more performance
improvements?

~~~
staticfloat
If you look at the second table below the first, an AVX2 implementation of
XXH3 gets just under a 2x speed boost.

~~~
smabie
What about AVX512?

~~~
ncmncm
Probably not: Diminishing returns. Anyway existing impls serialize two 256-bit
ops.

------
Const-me
I wonder how it compares to AES-NI. Technically that's not a hash, but it can
be abused to be, and due to the hardware support the bandwidth is pretty good.

~~~
telendram
AES-NI can be used as a hash, but don't assume that it makes a good job at it.
Latest version of MeowHash is fine, but all other variants and prior versions
have shitty collision ratios, equivalent to a ~50 bits hash :
[https://github.com/Cyan4973/xxHash/wiki/Collision-ratio-
comp...](https://github.com/Cyan4973/xxHash/wiki/Collision-ratio-
comparison#testing-64-bit-non-portable-hashes-)

~~~
eggsome
Is there a 2 axis graph that has both collision ratios and speed?

