I have this OpenMP code that performs a simple reduction:
for (k = 0; k < m; k++)
{
    #pragma omp parallel for private(i) reduction(+:mysum) schedule(static)
    for (i = 0; i < m; i++)
    {
        mysum += a[i][k] * a[i][k];
    }
}
I want to write equivalent code using OpenMP tasks. Here is what I tried, following this article:
for (k = 0; k < m; k++)
{
    #pragma omp parallel reduction(+:mysum)
    {
        #pragma omp single
        {
            for (i = 0; i < m; i++)
            {
                #pragma omp task private(i) shared(k)
                {
                    partialSum += a[i][k] * a[i][k];
                }
            }
        }
        #pragma omp taskwait
        mysum += partialSum;
    }
}
The variable partialSum is declared as threadprivate and it's also a global variable:
int partialSum = 0;
#pragma omp threadprivate(partialSum)
a is a simple array of ints (m x m).
The problem is that when I run the code above (the one with tasks) multiple times, I get different results.
Do you have any idea what I should change to make this work?
Thank you respectfully
partialSum is shared among all your threads. The reduction handles making private copies of mysum and combining them at the end, but the same treatment is not extended to partialSum, which is therefore subject to a data race. The slide deck you linked uses a threadprivate() directive to address that problem. I'm not certain that would be sufficient for you, but it would at least resolve the data race.
partialSum is not shared among all threads, because I also declare it as threadprivate, exactly as in that article.
partialSum is threadprivate from the beginning. I think that you should have read the entire question from the beginning.
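For what it's worth, two things in the task version look suspicious independently of the threadprivate discussion. The clause private(i) on the task gives each task a new, uninitialized i, so a[i][k] reads an indeterminate index; firstprivate would copy the loop value in. Also, partialSum is set to 0 only once and is never reset between iterations of the k loop, so whatever a thread accumulated in an earlier iteration may be added to mysum again later. Below is a minimal sketch of one way to sidestep both issues. It is not the approach from the linked article: it drops the threadprivate variable, creates one task per column, keeps the partial sum local to the task, and combines results with an atomic update. The function name sum_squares_tasks and the fixed dimension M are hypothetical, chosen only for the illustration.

```c
#include <assert.h>

#define M 4   /* hypothetical fixed dimension, chosen only for this sketch */

/* Returns the sum of a[i][k]^2 over the leading m x m block of a. */
long sum_squares_tasks(int m, int a[M][M])
{
    long mysum = 0;              /* shared accumulator */

    assert(m >= 0 && m <= M);

    #pragma omp parallel
    #pragma omp single
    {
        /* One thread creates the tasks; any thread in the team may run them. */
        for (int k = 0; k < m; k++) {
            /* firstprivate(k) gives each task its own initialized copy of k */
            #pragma omp task firstprivate(k) shared(mysum, a)
            {
                long colsum = 0; /* local to this task, so no race */
                for (int i = 0; i < m; i++)
                    colsum += (long)a[i][k] * a[i][k];

                #pragma omp atomic
                mysum += colsum; /* one protected update per task */
            }
        }
    }   /* implicit barrier of single: all tasks have completed here */

    return mysum;
}
```

Because the sum is over integers, the order in which tasks finish should not change the result. One task per column rather than one per element keeps the number of tasks and atomic updates at m instead of m * m; whether that granularity suits your real problem is something you would need to measure.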