C++11 std::unordered_set and map are slower than a naive implementation (clifford.at)
18 points by julian37 on Dec 28, 2014 | 4 comments



Just posted this on the OP:

There's a big issue with the unordered_map.erase(const_iterator) method -- the standard mandates that .erase() returns an iterator to the next item in the container. There's no simple way to implement that for a hash table other than walking the table to the next occupied bucket, which is quite slow.

This was once a serious performance issue with unordered_map (discussion here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41975). I haven't followed up in years -- is this still an issue?
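
To illustrate the requirement in question: the C++11 erase-while-iterating idiom depends on erase() returning an iterator to the element after the erased one. A minimal example (mine, not from the bug report):

    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, std::string> m = {
            {1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}};

        // Erase all even keys while iterating. The idiom only works because
        // erase(it) hands back the iterator to the following element, which
        // is exactly what forces the implementation to find "the next item".
        for (auto it = m.begin(); it != m.end(); ) {
            if (it->first % 2 == 0)
                it = m.erase(it);
            else
                ++it;
        }

        for (const auto &kv : m)
            std::cout << kv.first << " -> " << kv.second << '\n';
    }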


These containers are problematically slow for many use cases, but it is not really their fault. An internal design that gives good performance depends hugely on the use case, and the C++ STL offers little ability to specify the nature of that use case so that appropriate internals can be chosen. Consequently, the default implementation will always be pathologically poor for some significant subset of use cases.

A similar problem exists for other types of STL containers, but it is particularly acute for the map and set containers. Some programmers end up building their own library of map implementations, each optimized for a different use case, to get around this problem. The number of template parameters you would need to generalize std::map over all of these cases would be ridiculous, so it is easier to just have a lot of specialized implementations.

For high-performance code, std::map and friends are usually the wrong choice. Personally, I always write usage-optimized maps as needed these days; the first couple of times feel tedious, but it quickly becomes a simple idiom you can implement in your sleep.
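
For concreteness, here is a rough sketch of the kind of use-case-specific map the parent is describing -- open addressing with linear probing, a power-of-two capacity, integer keys only, no deletion. This is my own illustration of the idiom, not the commenter's code; the point is that once you fix the key type, the expected size, and the operations you actually need, the whole thing fits on a page:

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct IntIntMap {
        struct Slot { int64_t key; int64_t value; bool used; };
        std::vector<Slot> slots;   // capacity is always a power of two
        std::size_t count = 0;

        explicit IntIntMap(std::size_t capacity_pow2 = 16)
            : slots(capacity_pow2, Slot{0, 0, false}) {}

        static std::size_t hash(int64_t k) {
            // Cheap integer mix; good enough for an illustration.
            uint64_t x = static_cast<uint64_t>(k) * 0x9E3779B97F4A7C15ull;
            return static_cast<std::size_t>(x ^ (x >> 32));
        }

        void insert(int64_t key, int64_t value) {
            if (2 * (count + 1) > slots.size()) grow();   // keep load factor <= 0.5
            std::size_t mask = slots.size() - 1;
            std::size_t i = hash(key) & mask;
            while (slots[i].used && slots[i].key != key)
                i = (i + 1) & mask;                       // linear probing
            if (!slots[i].used) ++count;
            slots[i] = Slot{key, value, true};
        }

        bool find(int64_t key, int64_t &value_out) const {
            std::size_t mask = slots.size() - 1;
            std::size_t i = hash(key) & mask;
            while (slots[i].used) {
                if (slots[i].key == key) { value_out = slots[i].value; return true; }
                i = (i + 1) & mask;
            }
            return false;
        }

        void grow() {
            // Double the table and re-insert everything.
            std::vector<Slot> old = std::move(slots);
            slots.assign(old.size() * 2, Slot{0, 0, false});
            count = 0;
            for (const Slot &s : old)
                if (s.used) insert(s.key, s.value);
        }
    };

    int main() {
        IntIntMap m;
        m.insert(42, 1);
        m.insert(7, 2);
        int64_t v = 0;
        if (m.find(42, v))
            std::cout << "found 42 -> " << v << '\n';
    }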


Also interesting is the apparent fact that map and set (the tree-based ones) grow faster than logarithmically -- if they were truly O(lg n), then (given the log scale on the x axis) their curves should appear linear on these graphs.

It would be nice to see the specific code used to run the benchmarks, though.


The benchmark code is linked to from the article: http://svn.clifford.at/handicraft/2014/hashlib/
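
If you want to sanity-check the growth-rate observation above without digging into the linked code, a quick timing sketch (my own, not the article's benchmark) looks something like this -- if per-insert cost really is O(lg n), the ns/insert column should grow roughly in proportion to lg n:

    #include <chrono>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <random>

    int main() {
        std::mt19937_64 rng(12345);
        // Quadruple n each round and report the average cost per insert.
        for (std::size_t n = 1 << 10; n <= (1u << 22); n <<= 2) {
            std::map<uint64_t, uint64_t> m;
            auto t0 = std::chrono::steady_clock::now();
            for (std::size_t i = 0; i < n; ++i)
                m[rng()] = i;
            auto t1 = std::chrono::steady_clock::now();
            double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
            std::printf("n=%8zu  ns/insert=%.1f\n", n, ns / n);
        }
    }

Keep in mind that per-element cost also rises from cache misses as the tree outgrows the caches, so a super-logarithmic-looking curve doesn't necessarily mean the implementation is worse than O(lg n).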



