One of the files in my project has a for loop that I tried to parallelize with OpenMP's parallel for construct. When I ran it, I got a floating point exception. I couldn't reproduce the error in a separate test program, but I could reproduce it in the same file by replacing the loop with a dummy parallel region (the original for loop contained detailed array computations, hence the dummy code):

#pragma omp parallel for
for(i=0; i<8; i++)
{
  puts("hello world");
}

I still got the same error. Here's the gdb output:

Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff4c44710 (LWP 18912)]
0x0000000000402fd4 in allocate_2D_matrix.omp_fn.0 (.omp_data_i=0x0) at main.c:119
119     #pragma omp parallel for

By trial and error, I solved the problem by adding a schedule clause to the OpenMP construct:

#pragma omp parallel for schedule(dynamic)
for(i=0; i<8; i++)
{
  puts("hello world");
}

With that change it worked just fine. I could replicate this entire behaviour on two different systems (gcc 4.4.5 on 64-bit Linux Mint and gcc 4.5.0 on 64-bit openSUSE). Does anyone have an idea what might have caused it? I strongly suspect it is related to my program, since I couldn't reproduce the error separately, but I don't know where to look. The problem is solved, of course, but I am curious. If need be, I can post the entire original function where I see this behaviour.

  • If I am reading this correctly, it seems like the value of "i" is causing the problem. It seems like it is far exceeding the upper bound of 1024. How is "i" declared? My guess would be that something in your loop is stepping outside of its bounds and stepping on the storage for the loop iteration variable. That wouldn't explain the same problem happening when you replace it with a dummy parallel region though. What exactly is reported when you use a dummy parallel region? Commented May 17, 2011 at 18:49
  • i is declared just before the loop as int i; (it is not a global variable). The gdb output that I have shown is the exact output for the dummy loop. Commented May 17, 2011 at 19:02
  • The gdb output 'omp_data_i=0x7fffffffe730' is specifically an indication that something went wrong with the location of i Commented May 17, 2011 at 19:07
  • The gdb output is showing "for(i=0; i<1024; i++)" while I thought your dummy loop was the "for(i=0; i<8; i++) {puts("hello world");}". What am I missing? Commented May 17, 2011 at 20:09
  • I made a mistake, posted the wrong dummy loop. The loop where I see the error does have 1024 iterations, and not 8. I just ran the program again and I get a slightly different gdb output. I have edited the original question. Sorry. Commented May 17, 2011 at 21:03

2 Answers

Most likely puts isn't thread-safe. Put it in a critical section and see what happens.
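The experiment suggested here could look roughly like the sketch below. The function name print_serialized and the printed counter are hypothetical and only serve the illustration. Note that on POSIX systems stdio calls such as puts already lock the stream, so this is a diagnostic step rather than a likely fix.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): prints one line per
   iteration and returns how many lines were printed. */
int print_serialized(int n)
{
    int printed = 0;   /* shared counter, only touched inside the critical section */
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        /* Only one thread at a time may execute this block. */
        #pragma omp critical
        {
            puts("hello world");
            printed++;
        }
    }
    return printed;
}
```

Calling print_serialized(8) mirrors the dummy loop from the question. If the exception still occurs with the critical section in place, the call inside the loop body is probably not the cause.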

3 Comments

Same error. It fails with the puts example, but also fails with the original code that had an array computation inside the loop.
@hor is i shared or private?
i is shared by default. I also tried making it shared or private explicitly in the OpenMP construct, but I still see the error.

I had the same issue. It seems to happen when unsigned ints are used as loop iteration variables. Here is an example that shows the problem and the fix:

/* the following code was generating a FPE: */

unsigned int m = A->m ;
unsigned int i,ij ;
NLCoeff* c = NULL ;
NLRowColumn* Ri = NULL;

#pragma omp parallel for private(i,ij,c,Ri) 
for(i=0; i<m; i++) {
    Ri = &(A->row[i]) ;       
    y[i] = 0 ;
    for(ij=0; ij<Ri->size; ij++) {
        c = &(Ri->coeff[ij]) ;
        y[i] += c->value * x[c->index] ;
    }
}

/* and this one does not: */

int m = (int)(A->m) ;
int i,ij ;
NLCoeff* c = NULL ;
NLRowColumn* Ri = NULL;

#pragma omp parallel for private(i,ij,c,Ri) 
for(i=0; i<m; i++) {
    Ri = &(A->row[i]) ;       
    y[i] = 0 ;
    for(ij=0; ij<(int)(Ri->size); ij++) {
        c = &(Ri->coeff[ij]) ;
        y[i] += c->value * x[c->index] ;
    }
}
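One caveat with the working version: a plain cast such as (int)(A->m) gives a wrong bound if the count exceeds INT_MAX. A minimal sketch of a checked conversion follows, assuming hypothetical helper names to_int_bound and sum_first that are not part of the original code:

```c
#include <limits.h>

/* Hypothetical helper: converts an unsigned count to an int loop bound.
   Returns 0 on success, -1 if the value does not fit in an int. */
int to_int_bound(unsigned int count, int *out)
{
    if (count > (unsigned int)INT_MAX) {
        return -1;
    }
    *out = (int)count;
    return 0;
}

/* Hypothetical example: sums the first n entries of x using a signed
   loop index. Returns 0.0 if n cannot be represented as an int. */
double sum_first(const double *x, unsigned int n)
{
    int bound = 0;
    int i;
    double total = 0.0;

    if (to_int_bound(n, &bound) != 0) {
        return 0.0;
    }

    /* Each thread accumulates a private partial sum; OpenMP combines them. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < bound; i++) {
        total += x[i];
    }
    return total;
}
```

The same pattern would apply to m and Ri->size above: convert once before the loop, then iterate with an int index.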
