I heard a story recently about that being exploited as an attack.
Attackers knew some site used some DB that used a specific hash and items that hash to the same value end up in a linked list. So, the submitted a ton of user data that all hashed to the same value then DOS’d with requests that ran through the whole linked list every time.
It's called an algorithmic complexity attack, and yeah it's a known issue with hashmaps. One solution is to us a parametrized hash function with a secret key. And a second is to use another data structure entirely.
It’s actually called a hashmap collision attack, fixed by randomizing your hash function at the program start. For example generating a key at program start that will be used with siphash.
I like how you’ve been saying for 3+ years that it’s possible to recover a SipHash key based on its output, something it’s specifically designed not to allow, but never given any evidence for it.
It doesn’t even have to be that complicated. Query parameters for HTTP requests are almost always stuffed into a hash table. You can get up to mischief a couple of packets at a time.
If you’ve been watching, there have been a series of articles over the last five or so years as each language upgrades their hash table implementation to thwart this class of attack.
Attackers knew some site used some DB that used a specific hash and items that hash to the same value end up in a linked list. So, the submitted a ton of user data that all hashed to the same value then DOS’d with requests that ran through the whole linked list every time.