I've got the following loop:

    while (a != b) {
#pragma omp parallel
        {
#pragma omp for
            // first for 
#pragma omp for
            // second for
        }
    }

Written this way, a new thread team is created on every iteration of the while loop. Is it possible to rearrange the code so that a single team is used throughout? The variable "a" is updated with omp atomic inside the loop and "b" is a constant.

  • Does the condition have any side-effects? Can't you post more complete code? Commented Oct 5, 2014 at 15:07
  • @VladimirF no, it's a simple comparison of two integer values, nothing more. The variable "a" is incremented/decremented in one of the for loops via some method, and both the increment and the decrement are done via pragma atomic. Commented Oct 5, 2014 at 15:22
  • What's wrong with what you have now? And why not just use parallel for on each loop? Commented Oct 5, 2014 at 21:22
  • @Zboson Because I want to remove the cost of fork/join if possible. I know it can be low compared to the overall processing, but if I can get a small improvement, all the better. Note that the ranges of the inner for loops don't change over the while loop. Commented Oct 7, 2014 at 16:57
  • @greywolf82, my gut tells me it makes no difference due to the thread pool but I can't say for sure in general. But it's easy to test. Does it make a difference in your case? Commented Oct 7, 2014 at 18:19

1 Answer

The only thing that comes to my mind is something like this:

    #pragma omp parallel
    {
      while (a != b) {
        #pragma omp barrier
        // This barrier ensures that the threads
        // wait for each other after evaluating the
        // condition of the while loop
        #pragma omp for
        // first for (implicit barrier)
        #pragma omp for
        // second for (implicit barrier)
        // The second implicit barrier ensures that every
        // thread has the same view of a
      } // while
    } // omp parallel

In this way each thread will evaluate the condition, but every evaluation will be consistent with the others. If you really want a single thread to evaluate the condition, then you should think of transforming your worksharing constructs into task constructs.
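To make both options concrete, here is a minimal sketch rather than a definitive implementation. The function names, the parameters `n_first` and `n_second`, and the loop bodies are all hypothetical stand-ins for the asker's real loops: the first loop just counts items, and the second moves `a` toward `b` with atomic increments. `run_single_team` is the barrier pattern from above; `run_single_evaluator` is one possible reading of the task-based suggestion, and it assumes `taskloop` is available (OpenMP 4.5 or later).

```c
#include <assert.h>

/* Hypothetical stand-in for the asker's loops: the first loop counts
   processed items, the second moves `a` toward `b` atomically.
   Returns the total number of items processed. */
long run_single_team(int a, int b, int n_first, int n_second)
{
    /* Precondition so that `a` lands exactly on `b` and the loop ends. */
    assert(n_second > 0 && b >= a && (b - a) % n_second == 0);
    long processed = 0;

    #pragma omp parallel shared(a, processed)
    {
        while (a != b) {
            /* No thread touches shared state until every thread has
               evaluated the condition. */
            #pragma omp barrier

            #pragma omp for
            for (int i = 0; i < n_first; ++i) {
                #pragma omp atomic
                processed += 1;
            }   /* implicit barrier */

            #pragma omp for
            for (int i = 0; i < n_second; ++i) {
                #pragma omp atomic
                a += 1;
            }   /* implicit barrier: all threads now agree on `a` */
        }
    }
    return processed;
}

/* Assumed task-based variant: one thread evaluates the condition and
   creates the work, the rest of the team executes the tasks. */
long run_single_evaluator(int a, int b, int n_first, int n_second)
{
    assert(n_second > 0 && b >= a && (b - a) % n_second == 0);
    long processed = 0;

    #pragma omp parallel shared(a, processed)
    #pragma omp single
    {
        while (a != b) {
            /* Without `nogroup`, each taskloop waits for its tasks. */
            #pragma omp taskloop shared(processed)
            for (int i = 0; i < n_first; ++i) {
                #pragma omp atomic
                processed += 1;
            }

            #pragma omp taskloop shared(a)
            for (int i = 0; i < n_second; ++i) {
                #pragma omp atomic
                a += 1;
            }
        }
    }
    return processed;
}
```

For example, `run_single_team(0, 12, 5, 4)` runs three passes of the while loop and returns 15. Build with OpenMP enabled (for instance `-fopenmp` with GCC or Clang); without it the pragmas are ignored and the code runs serially with the same result.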

5 Comments

I think if I put a pragma flush(a) at the end of the loop, it should be ok. That way I'm sure that "a" is up to date in every thread before the next read in the while condition. In addition, "a" is modified only in the second for, so maybe your first barrier is not needed. What do you think?
@greywolf82 1. A flush region without a list is implied during the barrier at the end of the second loop, so there's no need to write another one. 2. I would keep the barrier at the beginning of the while loop, to avoid future problems if the code inside the for loops is modified later.
Ah ok your solution seems to work. Thank you very much.
@Zboson Better?!? To me it's just a while loop inside a parallel region instead of a parallel region inside a while loop :-)
What I mean is that due to thread pools the threads are not created/destroyed repeatedly so the only thing it seems to me your solution does is remove the cost due to fork/join. But the fork/join cost should be negligible for parallel for loops because the range should be large normally. I guess if the parallel for loops have variable ranges for each while loop iteration, some of which are small, then your solution would make sense. But in that case the OP should probably rethink how best to parallelize his code.
