

Hardening Perl's Hash Function - rhu86
http://blog.booking.com/hardening-perls-hash-function.html

======
TheLoneWolfling
> An attacker could precalculate one or more suffixes such that H(x) == H(
> concat(x, suffix) ) which would then allow an attacker to trivially
> construct an infinite set of keys which would always collide into the same
> bucket. We hardened the hash by mixing in the length of the key into the
> seed.

This prevents making an infinite set of keys that collide, yes, but does not
fix the problem completely. It only increases the request size required to
perform the attack. But even with 8 character UTF-8 keys, you could generate
2^64 unique keys with the same length.

Note: IANAC (I am not a cryptographer.) But if I was looking for guarantees
for worst-case performance, I'd (pretty much) never use a HashMap. If nothing
else, because of the array resizing.

~~~
ggchappell
> But if I was looking for guarantees for worst-case performance, I'd (pretty
> much) never use a HashMap.

I wonder why none[1] of the major dynamic languages offer the option of
running with the built-in associative-data structures implemented using a Red-
Black Tree. Seems like an obvious thing to do.

There are interface issues, of course. In particular, for a new type to be
usable as a key, it would need to provide an ordering rather than a hash
function.

But if only built-in types are used as keys, then there might be no reason to
change the interface at all. Just re-run the same program with a different
option and you've traded some average-case performance for a guarantee on
worst-case performance. And the Red-Black Tree is already written (somewhere
out there ...). So why not?

[1] To my knowledge.

~~~
_delirium
_> I wonder why none[1] of the major dynamic languages offer the option of
running with the built-in associative-data structures implemented using a Red-
Black Tree. Seems like an obvious thing to do._

It's not a command-line option, but Perl lets you "tie" a hash to a different
backing implementation, and there are some tree-based ones:

    
    
        use Tree::RB;
        tie my %hash, 'Tree::RB';

~~~
ggchappell
Wow, I had no idea. That's a nice feature.

------
tiagobraw
Thats why I like Lua's hash implementation that adds a random seed to the
hashing algorithm to limit those pathological side effects...

~~~
Pxtl
Wait, does that mean that iterating across the members of a Lua table is not
in deterministic order? I assume it just uses the naive hash ordering for
iterators... this is relevant because Lua is used as a scripting-engine
SpringRTS which uses a lock-step-based multiplayer approach so anything non-
deterministic risks breaking sync.

~~~
finnw
> Wait, does that mean that iterating across the members of a Lua table is not
> in deterministic order?

Yes. And it is documented as such. Any dependency on repeatable iteration
order is a bug.

~~~
Pxtl
Huh. I'm not involved directly in SpringRTS, but I'm going to take a look at
the code... I assume there's a compilation flag for Lua to disable that, or
they've hacked it up and are using a custom interpreter, because protecting
sync is paramount.

~~~
avar
Just using "sort(hash-keys())" instead of "hash-keys()" should be a trivial
patch that fixes this.

