
I have integer keys that must be associated with a std::vector<T>, where T is an iterator-like object (you can safely assume each element of the vector is a pointer).

The obvious candidate from the STL is std::unordered_map<int, std::vector<T>>, mainly because I should not get hash collisions since my keys are just integers. In my particular case each key maps to a vector with potentially "many" elements, which are pushed back. To give an example with a std::unordered_map<int, std::vector<double>>:

#include <unordered_map>
#include <vector>

constexpr size_t map_size = 360;
constexpr size_t vec_size = 1000;

std::unordered_map<int, std::vector<double>> grid_unmap_int;
for (size_t i = 0; i < map_size; ++i)
  {
    for (size_t j = 0; j < vec_size; ++j)
      {
        grid_unmap_int[i].push_back(j);
      }
  }

If I compare the performance of the insertion using the code above with a std::map<int, std::vector<double>>, I get quite different results, which seem to depend on the chosen compiler. In particular, the std::unordered_map turns out to be much faster with clang. Why does the unordered_map behave so poorly with GCC? I'm afraid it's related to the fact that my values are vectors with no known size...

Here's a link to quick_bench: https://quick-bench.com/q/fgQ9XuZ9tmeXcKYEGuJOW165FtE

Here are the associated results (-std=c++17 and -O3):

Output with GCC 9.4: [benchmark screenshot]

Output with CLANG 14.0: [benchmark screenshot]

  • @Someprogrammerdude: Yes, that's the question. Both compilers are using libstdc++ (their screenshot shows the clang version wasn't set to use libc++), so it's the same C++ implementation of those classes, compiled by different compilers, unless header versions differ. And they're asking why those compilers make differently-performing asm for the x86-64 cloud instances Quick-Bench runs them on. At least I hope they realize that's what the question boils down to. Commented Dec 3, 2023 at 12:16
  • Yes, it's implementation-dependent, but what I'd like to understand is indeed the reason for that slowdown. Commented Dec 3, 2023 at 12:16
  • The reallocation was actually intentional, because in my case I don't even have an estimate for the capacity. Commented Dec 3, 2023 at 12:16
  • I'm curious whether the quick-bench empty-loop baseline ran at the same speed on both runs; if so, GCC's std::map was over 3x faster than clang's. It would be worth trying on an idle desktop, where we don't have to worry about different runs being on different cloud hardware or under different levels of competing load (which quick-bench tries to factor out by only showing performance relative to an empty loop tested on the same instance). Commented Dec 3, 2023 at 12:23
  • Is it intentional that your benchmark appends to the same bucket inside the inner loop, giving the compiler a chance to hoist the map/unordered_map lookup and just grow a std::vector? Or at least reuse the hash, if it's not fully hoisting. Commented Dec 3, 2023 at 23:59

1 Answer


If you lower vec_size to, say, 1, you'll see that the difference is much less significant. So the actual cause should be related to the std::vector::push_back() code.

I assume clang effectively caches the result of grid_unmap_int[i], hoisting the lookup out of the inner loop. operator[] is non-const, so it may modify the map, and whether the compiler can prove such a hoist is safe can make a big difference.
