I am new to OpenMP. I have an Intel i7-2670QM CPU with 8 cores on an Ubuntu 13.10 Linux system.
My program, written in C, uses nested parallelism to create a total of 8 threads. As I understand it, every thread should run on its own processor, but when I run top in the terminal I see that my program uses only 100% CPU (I expected 800%), and in the per-processor view only CPU[X] is at 100% (X is random between 0 and 7) while the other CPUs are at 0.1%.
When I profile my program with Intel VTune Amplifier, it shows that 7 threads were running, but 6 of them did not use the CPU at all; they were completely idle.
When I try another example parallel program, the threads are spread across the cores just fine, so I think the problem is in my code:
    #include <omp.h>

    void recursive_function(int k)
    {
        ........
        recursive_function(...);
    }

    int main()
    {
        omp_set_nested(1);
        #pragma omp parallel for num_threads(4)
        for (i = 0; i < width * height; i++)
        {
            #pragma omp critical
            {
                ......
                // 3 simple instructions
            }
            if (i != 0)
            {
                recursive_function(i);
            }
            else
            {
                int j;
                #pragma omp parallel for num_threads(4)
                for (j = i; j < width * height; j++)
                {
                    recursive_function(j);
                }
            }
        }
    }
The program is compiled with gcc using the -fopenmp option.