OpenMP with nested loops and function call

Question

I have several nested loops and have parallelized the outermost one. apar and mpar are structs whose members are modified inside the loops; breakLogic is then called and returns a struct, which I store in a pre-allocated vector of those structs. one, two, ... were declared earlier in the function.

I have tried adding ordered and critical to ensure correctness, but I am still getting incorrect results.

#pragma omp parallel for ordered private(appFlip, atur, apar, mpar, i, j, k, l, m, n) shared(rawFlip)
for(i=0; i<oneL; i++)
    {
         initialize mpar
         #pragma omp critical
         apar.one = one[i];
         for(j=0; j<twoL; j++)
         {
             apar.two = two[j];
             for(k=0; k<threeL; k++)
             {
                  apar.three = floor(three[k]*apar.two);
                  appFlip = applyParamSin(rawFlip, apar);
                  for(l=0; l< fourL; l++)
                  {
                      mpar.four = four[l];
                      for(m=0; m<fiveL; m++)
                      {
                          mpar.five = five[m];
                          for(n=0; n<sixL; n++)
                          {
                              mpar.six = add[n];
                              atur = breakLogic(appFlip,  mpar, dt);
                              #pragma omp ordered
                              {
                                  sinResVec[itr] = atur;
                                  itr++;
                              }
                          }
                      }
                  }
                  r0(appFlip);
              }
         }
    }

Or is this code not suited to parallelism? Are there any tools for g++ that can profile code for parallel processing and indicate potential issues?

This modified code works but gives no performance improvement.

Note that there are no conditionals in your code, so the correct itr values can be computed directly instead of being incremented in the innermost loop, which lets you get rid of ordered. You also need to make apar and mpar private, unless some members of those structures are shared between threads. With private variables you can also get rid of the critical constructs. Note that the outermost critical protects the entire loop, so the inner critical constructs are superfluous.
– Hristo Iliev, Commented Oct 24, 2013 at 12:14
If you're using g++, define your variables where you use them (e.g. for(int i=0; ...)) and don't worry about explicitly declaring everything shared or private. That's only for people still using ANSI/gnu89 C. Just remember that everything defined inside the parallel construct is private and everything defined outside is shared. It will make your code a lot cleaner and, personally, I think easier to understand. And don't compare performance without optimization on.
– Z boson, Commented Oct 24, 2013 at 14:46
What do you mean by "don't compare performance without optimization on"? Isn't the whole point of parallelism performance improvement?
– nashar1, Commented Oct 24, 2013 at 14:53
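
The scoping rule described in the comments above (variables defined inside the parallel construct are private, those defined outside are shared) can be illustrated with a minimal sketch. The function rowSums and its data are hypothetical and are not part of the question's code; the point is only where each variable is declared.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sums each row of a jagged matrix in parallel.
std::vector<double> rowSums(const std::vector<std::vector<double>>& rows) {
    // Declared outside the parallel construct: shared by all threads.
    // This is safe here because each iteration writes a different element.
    std::vector<double> sums(rows.size());
    const long n = static_cast<long>(rows.size());

    #pragma omp parallel for
    for (long i = 0; i < n; i++) {   // loop variable: private to each thread
        double acc = 0.0;            // declared inside: private to each thread
        for (std::size_t j = 0; j < rows[i].size(); j++) {
            acc += rows[i][j];
        }
        sums[i] = acc;
    }
    return sums;
}
```

With g++ this would typically be built with something like g++ -O2 -fopenmp, so that any timing comparison is made against optimized serial code. Without -fopenmp the pragma is ignored and the function runs serially with the same result.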

kangshiyin · Accepted Answer · 2013-10-24 14:48:40Z

1

Your original code can be parallelized with a few modifications.

Make apar and mpar firstprivate. They should be thread-local variables, initialized from their current values when entering the parallel for region.
Remove all critical and ordered constructs, including the ordered clause on the parallel for directive. They do not work the way you expect.
Compute itr directly from i, j, k, l, m, n to remove the dependency between iterations.

// itr must be private to each thread, e.g. declared inside the loop body
itr = ((((i*twoL + j)*threeL + k)*fourL + l)*fiveL + m)*sixL + n;
sinResVec[itr] = atur;
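
To show how these modifications fit together, here is a minimal, self-contained sketch rather than the asker's actual code. MPar, Result, breakLogicStub and sweep are hypothetical stand-ins, and the appFlip arithmetic only takes the place of applyParamSin. The index uses all six loop variables, so every thread writes to its own slots and no ordered or critical construct is needed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the question's structs (only the fields used here).
struct MPar { double four, five, six; };
struct Result { double value; };

// Hypothetical stand-in for breakLogic(): a pure function of its inputs.
static Result breakLogicStub(double appFlip, const MPar& mpar) {
    Result r;
    r.value = appFlip + 100.0 * mpar.four + 10.0 * mpar.five + mpar.six;
    return r;
}

std::vector<Result> sweep(const std::vector<double>& one,
                          const std::vector<double>& two,
                          const std::vector<double>& three,
                          const std::vector<double>& four,
                          const std::vector<double>& five,
                          const std::vector<double>& six) {
    const long oneL = static_cast<long>(one.size());
    const long twoL = static_cast<long>(two.size());
    const long threeL = static_cast<long>(three.size());
    const long fourL = static_cast<long>(four.size());
    const long fiveL = static_cast<long>(five.size());
    const long sixL = static_cast<long>(six.size());

    // Pre-sized result vector: shared, but each slot is written exactly once.
    std::vector<Result> out(
        static_cast<std::size_t>(oneL * twoL * threeL * fourL * fiveL * sixL));

    #pragma omp parallel for
    for (long i = 0; i < oneL; i++) {
        MPar mpar = {0.0, 0.0, 0.0};  // declared inside the region: private
        for (long j = 0; j < twoL; j++) {
            for (long k = 0; k < threeL; k++) {
                // Placeholder for applyParamSin(rawFlip, apar); not the real computation.
                const double appFlip = one[i] + two[j] + std::floor(three[k] * two[j]);
                for (long l = 0; l < fourL; l++) {
                    mpar.four = four[l];
                    for (long m = 0; m < fiveL; m++) {
                        mpar.five = five[m];
                        for (long n = 0; n < sixL; n++) {
                            mpar.six = six[n];
                            // Flattened index: equals the value a serial itr++ would have here.
                            const long itr =
                                ((((i * twoL + j) * threeL + k) * fourL + l) * fiveL + m) * sixL + n;
                            assert(itr >= 0 && static_cast<std::size_t>(itr) < out.size());
                            out[static_cast<std::size_t>(itr)] = breakLogicStub(appFlip, mpar);
                        }
                    }
                }
            }
        }
    }
    return out;
}
```

Because itr depends only on the loop counters, the result is the same for any number of threads, and also when the pragma is ignored. Whether this gives a speedup in the real code depends on how expensive breakLogic is relative to the thread overhead.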

update

See the following page for more details on OpenMP, especially the differences between private and firstprivate:

http://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
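
As a rough illustration of that difference, the sketch below uses a hypothetical Params struct and function name, neither of which comes from the question. With firstprivate(p), each thread's copy of p starts out as a copy of the value p had before the loop. With private(p) instead, each thread's copy of this plain struct would start with indeterminate contents, so p.scale could not be relied on.

```cpp
#include <vector>

// Hypothetical parameter struct: scale is set once before the loop,
// offset is overwritten in every iteration.
struct Params { double scale; double offset; };

std::vector<double> scaleWithFirstprivate(const std::vector<double>& in, Params p) {
    std::vector<double> out(in.size());
    const long n = static_cast<long>(in.size());

    // firstprivate: every thread gets its own copy of p, initialized from
    // the value p had on entry to the region.
    #pragma omp parallel for firstprivate(p)
    for (long i = 0; i < n; i++) {
        p.offset = in[i];             // modifies only this thread's copy
        out[i] = p.scale * p.offset;  // p.scale still holds the value set before the loop
    }
    return out;
}
```

The same reasoning applies to apar and mpar in the question if any of their members are set before the parallel region and only read inside it.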

edited Oct 24, 2013 at 14:48

answered Oct 24, 2013 at 13:13


4 Comments

nashar1 Over a year ago

Thanks Eric! Slight error/typo on my part: apar.three is dependent on apar.two. Does this change anything?

nashar1 Over a year ago

And should all of i...n stay private in the parallel construct?

nashar1 Over a year ago

Another question: you haven't mentioned atur, appFlip, etc. Does making them private or firstprivate make any difference?

kangshiyin Over a year ago

Keeping them private, as in your existing code, is fine. You would make them firstprivate only if you want them initialized on entry to the parallel for region. Variables like j are explicitly initialized in the code, so firstprivate is useless for them.
