0

I'm learning OpenMP. To do so, I'm trying to make an existing code parallel. But I seems to get an worse time when using OpenMP than when I don't.

My inner loop:

    #pragma omp parallel for
    for(unsigned long j = 0; j < c_numberOfElements; ++j)
    {
        //int th_id = omp_get_thread_num();
        //printf("thread %d, j = %d\n", th_id, (int)j);

        Point3D current;
        #pragma omp critical
        {
            current = _points[j];
        }

        Point3D next = getNext(current);

        if (!hasConstraint(next))
        {
            continue;
        }

        #pragma omp critical
        {
            _points[j] = next;
        }
    }

_points is a pointMap_t, defined as:

typedef boost::unordered_map<unsigned long, Point3D> pointMap_t;

Without OpenMP my running time is 44.904s. With OpenMP enabled, on a computer with two cores, it is 64.224s. What I am doing wrong?

4 Answers 4

1

Why have you wrapped your reads and writes to _points[j] in critical sections ? I'm not much of a C++ programmer, but it doesn't look to me as if you need those sections at all. As you've written it (uunamed critical sections) each thread is going to wait while the other goes through each of the sections. This could easily make the program slower.

Sign up to request clarification or add additional context in comments.

1 Comment

It segfaults without it. I don't think I can alter the map from two different threads concurrently.
0

It seems possible that the lookup and write to _points in critical sections is dragging down the performance when you use OpenMP. Single-threaded, this will not result in any contention.

Sharing seed data like this seems counterproductive in a parallel programming context. Can you restructure to avoid these contention points?

8 Comments

There is another way to do it? I'm not used to parallel programming.
@Vitor - I cannot see why it would fault if you just remove the critical sections. Something in the code we cannot see maybe? _points is not being added to or removed from while the loop executes, right? Do the function calls have side effects outside the current entry? Where is the exception?
Even if you remove the critical sections, it may still cause massive slow-down. See e.g. drdobbs.com/go-parallel/article/….
@Steve Townsend From gdb: [New Thread 0x7ffff0ccd710 (LWP 13329)] *** glibc detected *** /home/vitorpy/openmp/test: double free or corruption (fasttop): 0x00000000027f8be0 ***. bt shows it's occurs on the unordered_map operator[]. Function calls have no side effects.
@Oli Charlesworth I'll take a look at it. Thank you.
|
0

You need to show the rest of the code. From a comment to another answer, it seems you are using a map. That is really a bad idea, especially if you are mapping 0..n numbers to values: why don't you use an array?

If you really need to use containers, consider using the ones from the Intel's Thread Building Blocks library.

2 Comments

You're right. The map is mostly a leftover from an older design where ID were not always sequential. I've made it into an array. It's still slow but I think I'm probably bumping into the false sharing issue or something.
@Vitto Py: please try to create a small compilable code sample that shows the problem. You have simplified the code so much that you have abstracted the problem away. One thing to try would be to make "hasContraint" as static inline, to avoid paying the penalty of a function call.
0

I agree that it would be best to see some working code.

The ultimate issue here is that there are criticals within a parallel region, and criticals are (a) enormously expensive in and of themselves, and (b) by definition, kill parallelism. The assignment to current certainl doesn't need to be inside a critical, as it is private; I wouldn't have thought the _points[j] assignment would be, either, but I don't know what the map stuff does, so there you go.

But you have a loop in which you have a huge amount of overhead, which grows linearly in the number of threads (the two critical regions) in order to do a tiny amount of actual work (walk along a linked list, it looks like). That's never going to be a good trade...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.