8

I wanted to map data with pointer as the key. What container should I have chosen, map or unordered_map? There are multiple questions on stackoverflow for this topic but none of them covers the performance aspect when we need to iterate over all the key-value pairs.

std::map<classKey* , classData*> myMap;
std::unordered_map<classKey* , classData*> myUnorderedMap;

for (auto & iter : myMap) { //loop1
    display(iter.second);
}

for (auto & iter : myUnorderedMap) { //loop2
    display(iter.second);
}

loop1 vs loop2 which one gives better performance. Bench Mark Provided by @RetiredNinja

For size = 10,000,000 We get following benchmark results:

enter image description here

19
  • 4
    Interesting. Off had I have no idea. I'd put my money on std::map though. Are you able to set up some test cases for profiling? Commented Jun 30, 2019 at 14:28
  • 2
    Structurally, I would expect std::map to be slightly faster, but that the difference would be small enough to be swamped by other concerns (like whether the elements of either one were in your cache) Commented Jun 30, 2019 at 14:31
  • 5
    Show how you benchmarked this please. Commented Jun 30, 2019 at 14:33
  • 4
    @MichaelChourdakis again, OP asks about performance, not efficiency regarding time complexities. Commented Jun 30, 2019 at 14:35
  • 3
    Quick Bench is outstanding; have used it to great effect (if I do say so myself) Commented Jun 30, 2019 at 16:17

1 Answer 1

7

As you might expect, this depends heavily on the actual implementation of the standard library data structures. Therefore, this answer will be a bit more theoretical and less tied to any one implementation.

A std::map uses a balanced binary tree under the covers. This is why it has O(log(n)) insertion, deletion, and lookup. Iterating over it should be linear because you just have to do a depth-first traversal (which will require O(log(n)) memory in the form of stack space). The benefit of using a std::map for iteration is that you will iterate over the keys in sorted order and you will get that benefit "for free".

A std::unordered_map uses a hash table under the covers. This allows you to have an amortized constant time insertion, deletion, and lookup. If the implementation is not optimized for iterating, a naive approach would be to iterate over every bucket in the hash table. Since a good hash table (in theory) has exactly one element in 50% of the buckets and zero in the rest, this operation will also be linear. However, it will take more "wall clock time" than the same linear operation for a std::map. To get around this, some hash table implementations keep a side list of all of the elements for fast iterations. If this is the case, iterating on a std::unordered_map will be faster because you can't get much better than iterating over contiguous memory (still linear time though, obviously).

In the extremely unlikely case that you actually need to optimize to this level (instead of just being curious about the performance in theory), you likely have much bigger performance bottlenecks elsewhere in your code.

All of this ignores the oddity of keying off of a pointer value, but that's neither here nor there.

Sources for further reading:

GCC std::map implementation

GCC std::unordered_map implementation

How GCC std::unordered_map achieves fast iteration

Sign up to request clarification or add additional context in comments.

9 Comments

"A std::map uses a balanced binary tree under the covers", "A std::unordered_map uses a hash table under the covers." Any reliable source for these claims?
Admittedly, it is a broad statement since there is nothing explicitly in the standard to require those design choices. However, both the GCC and MS VS compilers implement them this way (as do most well known compilers).
@πάνταῥεῖ humanity's inability to (so far) come up with better implementations satisfying standard's requirements combined with no contraindications to implement the structures in such a way is, for me, enough to assume this (or close to this) implementation. Yes, the standard says nothing about the concrete implementations, but what else do we got?
Your last source says that the hash map keeps a linked list of iterators(= pointers ) to buckets. That is not contiguous at all(2 indirections). Only buckets will be. And there's no reason why the map cannot do the same with nodes. I would really like to see some measurements as an answer.
"Iterating over it should be linear because you just have to do a depth-first traversal (which will require O(log(n)) memory in the form of stack space)." - That's not how std::map iterators work. They are external iterators and cannot allocate any stack space. Incrementing a tree iterator is only amortized constant time. (Iterating over the entire tree is still linear.) See here: github.com/gcc-mirror/gcc/blob/…
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.