His prescription of a new high-performance hash function based on mul32/xor/shift is also exactly correct, in my opinion. Not only does it let you scale the hash function down to limited microarchitectures, it also lets you scale it up to the largest and fastest vector engines in most cases, for extreme throughput on higher-end hardware.
Coincidentally, I also started working on a new set of hash functions based on the mul32/xor/shift motifs precisely for the purpose of getting something that could be efficiently vectorized for performance while still able to run on, say, a 32-bit ARM core.
> Coincidentally, I also started working on a new set of hash functions based on the mul32/xor/shift motifs precisely for the purpose of getting something that could be efficiently vectorized for performance while still able to run on, say, a 32-bit ARM core.
That's cool to hear. I'm happy to reimplement hash functions to fit my bizarre needs, but as per the post I really don't want to design them :)
What's your view on how switching from rotates to shift-xors affects quality? Is it mostly neutral as long as the constants are chosen correctly, or is there some quality loss?
I have no hard evidence but my intuition is that the loss of quality by moving from rotate to shift is below the noise floor of what is required to patch up the bit value bias introduced by multiply.
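For anyone who wants to see what the mul32/xor/shift motif looks like in practice, here is a minimal sketch. It uses the well-known 32-bit finalizer from MurmurHash3 as a stand-in, not the new functions mentioned above, whose design isn't given in this thread. The `fmix32_x4` helper is a hypothetical name added only to show that independent lanes need nothing beyond 32-bit multiplies, xors, and shifts, which is what lets a compiler map the loop onto vector units or run it as-is on a 32-bit core.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit finalizer from MurmurHash3, used here only as a familiar example
 * of the motif: every step is a 32-bit multiply, a right shift, or an xor.
 * Each x ^= x >> k step is invertible, as a rotate would be; the shift-xor
 * form folds high bits down into the low bits. */
static inline uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical helper: mix four independent lanes. There is no dependency
 * between lanes, so the loop is a candidate for auto-vectorization, and it
 * also compiles to plain scalar code on a core without SIMD. */
static inline void fmix32_x4(uint32_t lanes[4]) {
    for (size_t i = 0; i < 4; i++) {
        lanes[i] = fmix32(lanes[i]);
    }
}
```

Whether a given compiler actually vectorizes the lane loop depends on the target and flags, so treat that part as an expectation rather than a guarantee.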
We did a similar trick with SSE. It's a good approach if a non-portable hash is acceptable. In the system I'm working on we introduced two APIs: hash(), which maps to "the fastest hash we have on this machine", and portable_hash(), which is a predefined hash algorithm.
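A rough sketch of that two-API split, with everything beyond the names hash() and portable_hash() being my own assumption: portable_hash() is fixed to 32-bit FNV-1a here purely as an example of a predefined algorithm, and hash() picks an SSE4.2 CRC32-based path at compile time when the `__SSE4_2__` macro is defined, falling back to the portable one otherwise. The real system may well dispatch at run time and use different algorithms.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE4_2__)
#include <nmmintrin.h> /* _mm_crc32_u8 */
#endif

/* Predefined algorithm: 32-bit FNV-1a (an assumption for this sketch).
 * The result is the same on every machine, so it is safe to persist
 * or to send over the wire. */
static inline uint32_t portable_hash(const void *data, size_t len) {
    const unsigned char *p = (const unsigned char *)data;
    uint32_t h = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}

/* "Fastest hash we have on this machine". The value may differ between
 * builds and machines, so it must only be used for in-memory structures. */
static inline uint32_t hash(const void *data, size_t len) {
#if defined(__SSE4_2__)
    const unsigned char *p = (const unsigned char *)data;
    uint32_t h = 0xffffffffu;
    for (size_t i = 0; i < len; i++) {
        h = _mm_crc32_u8(h, p[i]); /* hardware CRC32C step, one byte at a time */
    }
    return ~h;
#else
    return portable_hash(data, len);
#endif
}
```

The point of keeping the two names separate is that callers have to decide up front whether a hash value ever leaves the process; only portable_hash() results are meaningful elsewhere.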
BTW, carry-less multiplication (the PCLMULQDQ instruction, which is its own CPU extension rather than part of SSE4 proper) can also be used to boost hashing. However, it only pays off for large data; in our tests it outperformed other algorithms for inputs larger than 1 kB.
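In case the term is unfamiliar: carry-less multiplication is ordinary shift-and-add multiplication with the adds replaced by xors, i.e. polynomial multiplication over GF(2). The sketch below is a plain software model with a hypothetical name, `clmul32`, working on 32-bit inputs; the hardware instruction operates on 64-bit operands and produces a 128-bit result, and this loop says nothing about how it is used inside an actual hash.

```c
#include <stdint.h>

/* Software model of a carry-less multiply: for every set bit i of b,
 * xor in a shifted left by i. No carries propagate between bit positions. */
static inline uint64_t clmul32(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    for (unsigned i = 0; i < 32; i++) {
        if ((b >> i) & 1u) {
            r ^= (uint64_t)a << i;
        }
    }
    return r;
}
```

For example, clmul32(3, 3) is 5 rather than 9, because (x + 1)^2 = x^2 + 1 when coefficients are taken mod 2.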
The Makefile at https://github.com/tromp/cuckoo/blob/master/src/Makefile has targets cuckoo28stx, cuckoo28stx4, cuckoo28stx8 and cuckoo28stx16 for these different levels of parallelization.
Boy, I really hope you're wrong! Any further thoughts?