
I don't know if it has a particular name, but it is a common problem: fast average case, slow worst case.

A common example is the quicksort algorithm. It is often the fastest sorting algorithm in practice, but it is still O(n²) in the worst case. On random data that worst case is vanishingly unlikely, but with adversarially ordered input it can be triggered deliberately. That's why some libraries prefer a slightly slower algorithm (e.g. heapsort) with an O(n log n) guarantee, so that it can't be DoSed this way.
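A quick way to see the blow-up: a toy quicksort that always pivots on the first element (a sketch for illustration, not how real libraries implement it) does quadratic work on already-sorted input:

```python
import random

def quicksort(a, lo=0, hi=None):
    # Toy quicksort, first element as pivot, Lomuto-style partition.
    # Returns the number of comparisons so the blow-up is measurable.
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    pivot, i, comparisons = a[lo], lo, 0
    for j in range(lo + 1, hi + 1):
        comparisons += 1
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]
    return (comparisons
            + quicksort(a, lo, i - 1)
            + quicksort(a, i + 1, hi))

random.seed(0)
n = 300
shuffled = random.sample(range(n), n)
c_random = quicksort(shuffled)        # roughly n * log2(n) comparisons
c_sorted = quicksort(list(range(n)))  # exactly n*(n-1)//2 comparisons
```

Sorted input is the "cleverly ordered data" here: every partition is maximally lopsided, so the comparison count hits the full n(n-1)/2.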

Unbalanced binary trees can have the same problem: on random data they work fine, but fed sorted input they degenerate into much slower linked lists. Hash tables can also be exploited if they don't use a cryptographically secure hash function, and such functions are rarely used because they can be slow enough to negate the advantage of using a hash table in the first place.
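The degeneration is easy to demonstrate: a plain unbalanced BST built from random keys stays shallow, while the same keys inserted in sorted order produce a "tree" that is really a linked list (a minimal sketch, no balancing):

```python
import random

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    # Plain unbalanced BST insert, iterative so degenerate trees
    # don't blow the recursion limit.
    if root is None:
        return Node(key)
    node = root
    while True:
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child

def height(root):
    # Iterative max depth, again safe for linked-list-shaped trees.
    best, stack = 0, [(root, 1)]
    while stack:
        node, d = stack.pop()
        if node is not None:
            best = max(best, d)
            stack.extend([(node.left, d + 1), (node.right, d + 1)])
    return best

random.seed(0)
n = 1000
keys = random.sample(range(n), n)

random_tree = None
for k in keys:
    random_tree = insert(random_tree, k)

sorted_tree = None
for k in sorted(keys):
    sorted_tree = insert(sorted_tree, k)
```

With random insertion order the height stays around a small multiple of log n; with sorted insertion every key goes right and the height equals n, so lookups become O(n) scans.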

>cryptographically secure hash function

I doubt that would help you. Hash tables take the hash and mod it by the size of the array. That will lead to collisions after the mod even if it's essentially impossible to generate collisions before the mod.
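Concretely (a sketch using SHA-256 as the "secure" hash): with a table of m buckets, any m+1 keys must collide somewhere after the mod, by pigeonhole, no matter how strong the hash is:

```python
import hashlib

def bucket(key, nbuckets):
    # Even a cryptographically secure hash gets reduced mod the
    # table size, and collisions after the mod are unavoidable.
    h = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    return h % nbuckets

nbuckets = 8
keys = [f"key{i}" for i in range(nbuckets + 1)]  # 9 keys, 8 buckets
buckets = [bucket(k, nbuckets) for k in keys]
```

Nine keys into eight buckets guarantees at least two share a bucket, even though finding two inputs with the same full SHA-256 digest is infeasible.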

What you actually need is a secret hash function, so attackers don't know what the values hash to. This is done using a keyed hash function with a secret key.
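In Python's standard library, for example, BLAKE2 accepts a key directly, which gives a bucket function whose outputs an attacker can't predict without the key (a sketch; CPython itself does something similar by seeding its built-in string hash, SipHash, at interpreter startup):

```python
import hashlib
import os

SECRET_KEY = os.urandom(16)  # generated per process, never revealed

def keyed_bucket(key: str, nbuckets: int) -> int:
    # BLAKE2b with a secret key: without SECRET_KEY, an attacker
    # can't precompute which keys land in the same bucket.
    digest = hashlib.blake2b(key.encode(), key=SECRET_KEY, digest_size=8)
    return int.from_bytes(digest.digest(), "big") % nbuckets
```

Within one process the mapping is stable (so the table still works), but it differs between processes, so a set of colliding keys found offline won't collide against a live server.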

As to sorting, a lot of libraries use timsort these days, since it's faster than quicksort on real-world (often partially sorted) data and has an O(n log n) worst case. Or B-trees for big storage and sorting, but that's generally a different use case (databases et al).
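Python's built-in sort is timsort, and you can watch it exploit existing order by counting comparisons (a sketch; the counting wrapper is mine, not part of any library):

```python
import random

class Counted:
    # Wrapper that counts comparisons; sorted() only needs __lt__.
    __slots__ = ("v",)
    count = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        Counted.count += 1
        return self.v < other.v

n = 1024

Counted.count = 0
sorted(Counted(i) for i in range(n))
on_sorted = Counted.count    # timsort sees one long run: about n comparisons

random.seed(0)
vals = random.sample(range(n), n)
Counted.count = 0
sorted(Counted(v) for v in vals)
on_shuffled = Counted.count  # closer to n * log2(n)
```

On already-sorted input timsort detects a single run and does roughly n comparisons; on shuffled input the count is several times higher, and n log n is also its guaranteed worst case.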

If your threat model is "users who create hash collisions to slowly encumber data operations", you've either royally pissed someone off or you have a very determined competitor. No one who cares about performance would deploy a crypto hash function for a dictionary unless they absolutely had to. You can always add more buckets to reduce collision probability. There's an entire class of hash functions dedicated to non-crypto purposes (xxHash, MurmurHash, etc.).
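The "add more buckets" point is easy to quantify: with n random keys and m buckets, collisions scale roughly like n²/2m, so growing the table makes them rare (a sketch with a simple mod-based scheme and non-adversarial keys):

```python
import random

def collisions(keys, nbuckets):
    # Count keys that land in an already-occupied bucket.
    seen, clashes = set(), 0
    for k in keys:
        b = k % nbuckets
        if b in seen:
            clashes += 1
        else:
            seen.add(b)
    return clashes

random.seed(0)
keys = [random.getrandbits(64) for _ in range(1000)]

crowded = collisions(keys, 1 << 10)  # 1000 keys, 1024 buckets: many clashes
roomy = collisions(keys, 1 << 20)    # 1024x more buckets: almost none
```

Of course this only helps against honest random keys; an attacker who knows the (unkeyed) hash can still aim every key at one bucket regardless of table size, which is the whole point of the keyed-hash argument above.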

The name for this in the security literature is "algorithmic complexity attacks". They were first proposed by Crosby & Wallach in 2003, with a running example of degraded hash tables.
