12

So on SO, and the Internets in general, there is much confusion and frustration about how to make OpenMP's easy-to-use #pragma directives cooperate with C++'s equally easy-to-use STL containers.

Everyone talks about work-arounds for STL vector, but what about non-random access / bi-directional containers, like map, list, set, etc. ?

I encountered this problem and devised a very simple, obvious workaround. I present it here for STL map, but it is clearly generalizable.

Serial version:

for (std::map<A,B>::iterator it = my_map.begin();
        it != my_map.end();
        ++it)       
    { /* do work with  it   */  }

My proposed solution to use OpenMP with STL map:

    //make an array of iterators.
    int loop_length = my_map.size();
    std::map<A,B>::iterator loop_array[ loop_length ];

    std::map<A,B>::iterator allocate_it = my_map.begin();
    for (int j=0; j<loop_length; ++j)
        loop_array[j] = allocate_it++;

    // now you can use OpenMP as usual:
    #pragma omp parallel for
    for (uint j=0; j<loop_length; ++j) 
       { /* do work with    loop_array[j]    */  }

I am far from an expert on OpenMP, however, so I would like to know if my proposed work-around is efficient and good practice.

Please assume that the programmer is responsible for thread-safe handling of the STL container within the for loop.

Finally, is my proposed solution more efficient than the following commonly-proposed solution (see answer to this SO Question), because, in my solution,each thread does not iterate over the whole container?

#pragma omp parallel
{
    for (std::map<A,B>::iterator it = my_map.begin();
            it != my_map.end();
            ++it) 
    #pragma single nowait
       {   /*  do work  */   }

}
5
  • I would suspect this to be faster (at least when A and B are large, otherwise you could just copy them into a vector). But have you tried it on your problem? Was it faster? Commented May 3, 2012 at 15:04
  • @larsmans I have not done any performance tests, and don't really plan too (sorry). I already have a large, sophisticated program written in serial, with STL containers everywhere, and am trying to multithread certain STL-for-loops. As such, I can't easily isolate and time it... Commented May 3, 2012 at 15:12
  • Not even if you copy the contents of the containers to a vector for testing purposes? Commented May 3, 2012 at 15:13
  • Sorry if my question is stupid but could you not simply iterate over j and then access the elements via allocate_it+j where allocate it is set as in your post. Commented May 4, 2012 at 14:00
  • 2
    @Azrael3000 But isn't iterator arithmetic only valid for random-access iterators (like for vector) ? map , list, set, only use bi-directional iterators. Commented May 4, 2012 at 14:19

1 Answer 1

5

OpenMP provides the task construct starting with version 3.0 which is quite useful for use with STL:

for (std::map<A,B>::iterator it = my_map.begin();
        it != my_map.end();
        ++it)       
{
   #pragma omp task
   { /* do work with  it   */  }
}

Of course, data dependencies between iterations should not exist for this to work.

Sign up to request clarification or add additional context in comments.

4 Comments

Is there extra overhead due to have the OpenMP command inside the loop? In general, I thought that performance is best improved by parallelizing the largest section of code possible (e.g. the outermost loop in nested loops).
That's generally true but using omp for on the for loop requires fast random access iterators (e.g. that run in constant time). As far as I know std::map does not provide a random access iterator at all and you should simulate one.
Yeah, that's what I was trying to do in my "proposed solution", by creating the conventional (random-access) array and then looping for that instead of the map directly. I was just concerned that your method created too much overhead (thread management at every iteration).
OpenMP does not create separate thread for each task. On the contrary, tasks are put in a "bag" and then each thread steals a task. My colleagues' research shows that the overhead is almost negligible while coding is greatly simplified. I am not trying to persuade you to use tasks - just try and see which one is faster and which one requires more programming.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.