
Maps and sets can have quadratic-time performance - ingve
http://lemire.me/blog/2017/01/30/maps-and-sets-can-have-quadratic-time-performance/
======
corysama
A paper came up on HN recently that showed that C arrays could have log(N)
behavior for individual accesses if you antagonized the TLB cache hard enough.
It got more discussion on Reddit than HN though.

[https://arxiv.org/abs/1212.0703](https://arxiv.org/abs/1212.0703)

[https://news.ycombinator.com/item?id=12385458](https://news.ycombinator.com/item?id=12385458)

[https://www.reddit.com/r/programming/comments/509qhd/randoml...](https://www.reddit.com/r/programming/comments/509qhd/randomly_accessing_n_items_in_dram_is_on_in/)

------
panic
Assuming your hash table is power-of-two sized, could your iterator step
forward by (say) 19 elements at a time instead of one? Since 19 is relatively
prime to any power of two, you'll eventually iterate over every element if you
take the remainder each time you hit the end. Is there a way to pick the step
size to ensure quadratic behavior never happens?

EDIT: The bug description at
[https://bugs.swift.org/browse/SR-3268](https://bugs.swift.org/browse/SR-3268)
mentions this solution but dimisses it due to poor cache locality.

------
raphlinus
If I'm reading right, this is the same problem as
[http://accidentallyquadratic.tumblr.com/post/153545455987/ru...](http://accidentallyquadratic.tumblr.com/post/153545455987/rust-
hash-iteration-reinsertion) , only it hasn't been fixed yet in Swift?

~~~
Gankro
Yes. However Rust's precise mitigation only works because it doesn't use
hashcoding (Swift does). A possible mitigation that's similar to Rust's would
be for each Dictionary to post-process hashes with some per-Dictionary
key/salt.

------
captainmuon
I wonder if there is a way to provoke hash collisions. A while back there was
a bug in Python dicts that let you do that. Construct a bunch of strings that
hash to the same bucket, and feed that to an application that tries to store
them in the dict, and you could cause extreme resource consumption IIRC. The
remedy was to add a random (per-process) part to the key function. It would be
interesting to see if Swift/ObjC has this vulnerability, too.

------
dgfgfdagasdfgfa
Could we get "swift" added to the title?

Thanks!

------
mywittyname
Alternate proposal: in-order iterating over a set is what's causing the
slowdown.

    
    
        let nanobuild = Swimsuit.nanotime() {
            for i in 1...size {
                d.insert(i)
            }
        }
    

vs.

    
    
        let nanocopy = Swimsuit.nanotime() {
            for i in d {
                dcopy.insert(i)
            }
        }
    

Clearly the biggest difference between the code is how he's getting the values
to insert. If he changed the later loop from for i in d, to the original for i
in 1...size, I suspect he'd see equivalent performance.

This code does not demonstrate the issues with HashMaps outlined in the
article. I'm not familiar with Swift's libraries, but I bet it has a load
factor parameter, the author could just set that to >1.0 and load a bunch of
data into it, then the code would show what the article claims.

~~~
Gankro
There's a load factor in Swift's Dictionary but it's hardcoded. Same for Rust.
(I've maintained both implementations)

I assure you the bug is as described. I wrote it up the initial bug report and
what needed to be done to mitigate it here [1].

Most critically, my analysis on different sizes demonstrates that it's
ultimately a consequence of how hashes are modulo truncated. Increasing the
size without increasing capacity (higher load) can in fact _decrease execution
time_.

[1]
[https://bugs.swift.org/plugins/servlet/mobile#issue/SR-3268](https://bugs.swift.org/plugins/servlet/mobile#issue/SR-3268)

(sorry this comment is poorly written, my phone was about to die; read the
issue for a proper discussion)

------
alistproducer2
I hope this isn't too off topic, but this is the first time I've read swift
and, to my eyes, it looks gorgeous. I had no idea iOS apps could be written in
such a nice looking language. I could never stomach Objective-C.

