... but this was also one of Tim Newsham's (many) contributions to our joint research project on IDS in '97:
Glad to see it getting more attention. It seems to pop up every couple years. (This might be the first generalization of the technique to arbitrary web requests, which, yeah, will probably get a lot more attention).
The great part is that, as long as you convert request params into a hash table (a reasonable step; I'd imagine the vast majority of web frameworks do this), you're screwed if your language's hash function is vulnerable to this. Of course, you can tweak your framework to not use native hashes (or to detect such attacks and handle them separately), but that's not generally an option, as it's a pretty involved change.
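To make the blowup concrete, here's a toy count of the key comparisons done when every request param lands in the same bucket of a chained hash table (the function name and setup are made up for illustration, not any framework's actual code):

```python
def insert_all_colliding(keys):
    """Insert keys that all hash to one bucket of a chained table,
    counting the key comparisons done by the linear bucket scan."""
    bucket = []
    comparisons = 0
    for key in keys:
        for existing, _ in bucket:
            comparisons += 1
            if existing == key:
                break
        else:
            bucket.append((key, None))
    return comparisons

# n distinct colliding keys cost n*(n-1)/2 comparisons, i.e. O(n^2):
# a single POST with ~10,000 crafted params forces ~50 million comparisons.
```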
I wonder: would a hashtable that uses cuckoo hashes be vulnerable? Technically, sure, but would it make the attack infeasible due to the difficulty in creating double-collisions? Or is that not as hard as I'm thinking it could be?
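For reference, here's a minimal sketch of why cuckoo hashing raises the bar: each key has exactly two candidate slots (one per hash function), so an attacker has to find keys that collide under *both* functions at once to force eviction cycles. Everything below (class name, salts, parameters) is hypothetical, not any real library:

```python
import random

class CuckooTable:
    """Toy two-table cuckoo hash; a sketch, not a production structure."""

    def __init__(self, size=16):
        self.size = size
        self.t1 = [None] * size
        self.t2 = [None] * size
        # Two independent hash functions, derived from random per-table salts.
        self.s1 = random.getrandbits(64)
        self.s2 = random.getrandbits(64)

    def _h1(self, key):
        return hash((self.s1, key)) % self.size

    def _h2(self, key):
        return hash((self.s2, key)) % self.size

    def get(self, key):
        # A lookup probes at most two slots, regardless of load.
        for table, h in ((self.t1, self._h1), (self.t2, self._h2)):
            item = table[h(key)]
            if item is not None and item[0] == key:
                return item[1]
        return None

    def insert(self, key, value, max_kicks=32):
        item = (key, value)
        for _ in range(max_kicks):
            i = self._h1(item[0])
            if self.t1[i] is None:
                self.t1[i] = item
                return True
            self.t1[i], item = item, self.t1[i]   # evict the t1 occupant
            j = self._h2(item[0])
            if self.t2[j] is None:
                self.t2[j] = item
                return True
            self.t2[j], item = item, self.t2[j]   # evict the t2 occupant
        return False  # eviction cycle: a real table would rehash or grow
```

The degenerate case still exists (keys sharing both slots cycle forever and trigger the rehash path), so "infeasible" depends entirely on how hard simultaneous double collisions are to compute against both hash functions.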
Known, and also fixed: since then, Perl has been using a randomized hash function.
ref: http://perldoc.perl.org/perlsec.html#Algorithmic-Complexity-... (http://news.ycombinator.com/item?id=3402187)
One way to prevent this attack is to use a hashtable that has a limit on the number of collisions per table, or per key, and throws an exception when it's exceeded. The problem is that doing so by default would break valid uses in very surprising ways, all to protect against an attack that was unlikely and probably unknown when the datatype was designed — so you won't do it, because that arguably shouldn't be handled by basic datatypes.
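A sketch of such a collision-capped table in Python (the class name and the limit of 8 are invented for illustration; a real limit would be a tuning decision):

```python
class LimitedHashMap:
    """Toy chained hash table that refuses pathological buckets."""

    MAX_BUCKET = 8  # assumed cap on chain length per bucket

    def __init__(self, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]

    def __setitem__(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # update in place
                return
        if len(bucket) >= self.MAX_BUCKET:
            # Bail out instead of degrading to O(n) per operation.
            raise RuntimeError("collision limit exceeded; possible hash flooding")
        bucket.append((key, value))

    def __getitem__(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)
```

This is exactly the surprising-breakage trade-off: perfectly legitimate data that happens to cluster would now raise an exception.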
In fact, why not do this at the language level? Python could pick a random character (a one-byte salt) at startup, and use it for the duration of the process.
A longer salt would probably work, but there may well be ways for an attacker to discover the salt in use, and longer salts mean more overhead.
Of course, bump the salt up to 2-4 bytes and brute-forcing it is no longer feasible.
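One concrete way a salt can fail, independent of its length: for a purely multiplicative hash like DJBX33A, equal-length colliding strings collide under *every* salt, because the salt contributes the same `salt * 33**len(s)` term to both. A hypothetical sketch (not any language's actual implementation):

```python
import os

SALT = int.from_bytes(os.urandom(8), "big")  # per-process salt

def salted_djb33(s, salt=SALT):
    """DJBX33A-style hash with the salt as the initial state."""
    h = salt & 0xFFFFFFFF
    for c in s.encode():
        h = (h * 33 + c) & 0xFFFFFFFF
    return h

# "Ez" and "FY" satisfy 69*33 + 122 == 70*33 + 89 == 2399, so they
# collide for any salt value whatsoever.
assert salted_djb33("Ez") == salted_djb33("FY")
```

A keyed hash that mixes the secret into every round, rather than only into the initial state of a multiplicative scheme, doesn't have this structural weakness.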
The interesting part is that this is a precomputed attack, which relies on hash algorithms being well known and invariant.
Ruby has fixed this by initializing its hash function with a random seed each time an app runs, so an attacker can't do the precompute step: a collision calculated on their system won't necessarily be a collision on the server.
Other workarounds include limiting the size of post requests so the dictionary can't get too big, limiting the CPU time available to a single web request, or using a balanced tree instead of a hash table.
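A minimal illustration of the seeding idea (a toy FNV-1a variant with the seed folded into the initial state — not Ruby's actual function): the value a key hashes to depends on the per-process seed, so a collision set computed offline against one seed is useless against another.

```python
def seeded_fnv1a(s, seed):
    """Toy 32-bit FNV-1a with a seed mixed into the initial state."""
    h = (0x811C9DC5 ^ seed) & 0xFFFFFFFF
    for byte in s.encode():
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h

# The same key maps to different values under different seeds, so
# precomputed collisions don't transfer between processes.
```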
This is a great link on the problem: http://www.nruns.com/_downloads/advisory28122011.pdf
Ruby's fix: http://osdir.com/ml/ruby-talk/2009-05/msg00167.html
hash = length of string
for each char c in string:
    hash = hash ^ ((hash << 5) + (hash >> 2) + (unsigned char) c)
This is a well-known hash function, and it's very different from the one used by all the other vulnerable languages, but I don't know how hard it is to find collisions for this one.
From what I've been reading, this sounds like something that's not new at all and in fact has been an open vulnerability for years.
Sorry, I was skimming through it at that point (2/3 of the way through?), so I don't know what their conclusion was. I think they said that it was partially different but still likely attackable, but I didn't catch it all.
Does anybody closer to the matter know if Rack's actually vulnerable in 1.9?
Rack introduced a workaround in https://github.com/rack/rack/commit/5b9d09a81a9fdc9475f0ab00...
This may help in some cases when an interpreter fix is not available or installable, but it doesn't fix the hashing problem in general, especially outside of POST params.