I have an outer for loop that I have parallelized using OpenMP. However, within this for loop there are sections of code that could also be executed in parallel.
Can I use OpenMP's sections construct to parallelize these? Is this even possible? Since each iteration of the for loop is run by just one thread, can I (within each iteration) ask for certain sections of code to be run by multiple threads in parallel? The rest of the code should be run by one thread only, i.e. the thread to which that loop iteration has been assigned.
For example, I have the following piece of code:
    omp_p = omp_get_max_threads();
    omp_set_nested(1);
    #pragma omp parallel for num_threads(omp_p/2)
    for(int p=0;p<omp_p/2;p++){
        size_t a = (p*N)/(omp_p/2);
        size_t b = ((p+1)*N)/(omp_p/2);
        for(int i=a;i<b;i++){
            /* Work on A[a] -> A[b] */
            for(int j=0;j<n;j++){
                for(int k=0;k<N;k++){
                    /* Serial code */
                    #pragma omp parallel sections
                    {
                        #pragma omp section
                        {
                        }
                        #pragma omp section
                        {
                        }
                    }
                    /* Serial work */
                    #pragma omp parallel sections
                    {
                        #pragma omp section
                        {
                        }
                        #pragma omp section
                        {
                        }
                    }
                    /* Serial code */
                }
            }
        }
    }
This causes the program to run much slower than if I hadn't used the parallel sections at all.
The i, j and k loop counters get the default sharing class of shared and should be explicitly declared private.
Is N too low in comparison to the number of threads?
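One way to narrow down where the time goes is to strip the pattern down to a tiny test case and switch the inner regions on and off. The following is only a minimal sketch, not the original program: run_sections and outer are hypothetical helper names, the two sections just write a number each, and the if(nested) clause is an assumption added here so that the same code can run with or without the inner teams. The #ifdef fallback is there only so the file still builds when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stub so the sketch still builds without OpenMP enabled. */
static void omp_set_max_active_levels(int levels) { (void)levels; }
#endif

/* Hypothetical helper: opens one "parallel sections" region per repetition,
   mimicking a region placed in the innermost loop. With nested == 0 the
   if() clause makes each region run on a team of one thread. */
static long run_sections(int reps, int nested)
{
    long total = 0;
    for (int r = 0; r < reps; r++) {
        long x = 0, y = 0;   /* shared inside the region; one writer each */
        #pragma omp parallel sections if(nested)
        {
            #pragma omp section
            { x = r; }
            #pragma omp section
            { y = 2L * r; }
        }
        total += x + y;      /* serial part, done by the encountering thread */
    }
    return total;
}

/* Hypothetical helper: outer parallel loop, one iteration per outer thread,
   each iteration repeatedly entering the (possibly nested) inner regions. */
static long outer(int outer_threads, int reps, int nested)
{
    long sum = 0;
    omp_set_max_active_levels(2);   /* allow two levels of active parallelism */
    #pragma omp parallel for num_threads(outer_threads) reduction(+:sum)
    for (int p = 0; p < outer_threads; p++)
        sum += run_sections(reps, nested);
    return sum;
}
```

Calling outer(omp_p/2, reps, 1) and outer(omp_p/2, reps, 0) from main and timing each call with omp_get_wtime() should show how much of the slowdown comes purely from entering a parallel region on every innermost iteration, since the sections here do almost no work. Both calls return the same checksum, so the two variants can be compared safely.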