Hacker News new | past | comments | ask | show | jobs | submit login

> First, from any hash only a few bits are use for a hash table lookup, typically between 5-8.

This is not my experience. Frequently, hash tables will have a size that is a power of 2, and the address into the table will be all of the low order bits. A table with 1 million slots will use 20 bits of hash.

> Second, against DDOS only uhash, or using no linked lists or arrays with O(n/2) lookup costs help.

uhash is just one example of hash function designs based on almost-universal hashing. There are many others, including VHASH and Poly1305. Is HighwayHash almost universal?




> Frequently, hash tables will have a size that is a power of 2 ...

It looked like that was what rurban was arguing? Pull out the low 5-8 bits, get a table of size 32-256?


Exactly. You attack an existing table in an application via POST request of colliding keys. This table has mostly 10-100 elements, and you need to find a collision of say ~10.000-20.000 elements to have an effect. So it starts with ~5 bits on the avg. case and gets into 13-14 bits. This is all too trivial and doesn't need a "secure" hash function, which is not secure at all and can be precomputed. It's plain security theatre by rookies. The "security" may come from the collision resolution or from the seed if you insist on cache-unfriendly linked lists or better strategies, but never from a hash function per se. With 5-14 bits you have no chance at any "security", which starts at 256 bits.

And on top of that I guess nobody ever heard of "Cache-Conscious Collision Resolution in String Hash Tables" by Nikolas Askitis and Justin Zobel, who measured that a linear search in an array for collisions outperforms a linked lists by 2-4x, even if that can do move-to-front much easier.

People still do worry about the hash function, where they do should worry about their hash tables instead, which can easily be made 4x faster and not 2x slower.


> This is all too trivial and doesn't need a "secure" hash function, which is not secure at all and can be precomputed. It's plain security theatre by rookies. The "security" may come from the collision resolution or from the seed

Hash "functions" like UHASH, Poly1305, and SipHash are actually families of hash functions, parameterized by seed.

> And on top of that I guess nobody ever heard of "Cache-Conscious Collision Resolution in String Hash Tables" by Nikolas Askitis and Justin Zobel, who measured that a linear search in an array for collisions outperforms a linked lists by 2-4x, even if that can do move-to-front much easier.

If an implementation is vulnerable to hash-flooding attacks because it doesn't have a randomly-seeded almost-universal hash function, then changing the collision mechanism from linked lists to arrays isn't going to fix the problem.

Good collision resolution and good hash functions are both issues worth addressing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: