

C++11 std::unordered_set and map are slower than a naive implementation - julian37
http://www.clifford.at/blog/index.php?/archives/132-C++11-stdunordered_set-and-stdunordered_map-are-slower-than-a-naive-implementation.html

======
phunge
Just posted this on the OP:

There's a big issue with the unordered_map.erase(const_iterator) method -- STL
mandates that .erase() returns an iterator to the next item in the container.
There's no simple implementation of that for hashtables other than to walk the
hashtable, which is quite slow.

This was once a serious performance issue with unordered_map (discussion here:
[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41975](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41975)).
I haven't followed up in years, but is the issue?

------
jandrewrogers
These containers are problematically slow for most use cases but it is not
really their fault. An internal design that will give good performance is
_hugely_ dependent on the use case and the C++ STL offers little ability to
specify the nature of the use case so that appropriate internals can be
constructed. Consequently, the default implementation will always be
pathologically poor for some significant subset of use cases.

A similar problem exists for other types of STL containers but it is
particularly acute for the map and set containers. Some programmers end up
building their own library of map implementations, each implementation
optimized for a different use case, to get around this problem. The number of
template parameters you would need to generalize std::map would be ridiculous,
so it is easier to just have a lot of specialized implementations.

For high-performance code, std::map and friends are usually the wrong choice.
Personally, I always write usage optimized maps as needed these days; the
first couple times seem tedious but it quickly turns into a simple idiom you
can implement in your sleep.

------
blahedo
Also interesting is the apparent fact that map and set (the tree-based ones)
grow at faster than logarithmic time---if they were true O(lg n) then (given
the log scale on the x axis) they should appear linear on these graphs.

It would be nice to see the specific code used to run the benchmarks, though.

~~~
julian37
The benchmark code is linked to from the article:
[http://svn.clifford.at/handicraft/2014/hashlib/](http://svn.clifford.at/handicraft/2014/hashlib/)

