8

This minimal OpenMP program

#include <omp.h>
int main() 
{
  #pragma omp parallel sections
  {
    #pragma omp section
    {
      while(1) {}
    }

    #pragma omp section
    {  
      while(1) {}
    }
  }
}

will produce this error when compiled and run with gcc test.c -fopenmp:

Illegal instruction (core dumped)

When I change either one of the loops with

  int i=1;
  while(i++) {}

or any other condition it compiles and runs without error. It seems, that 1 as a loop condition in different threads cause some strange behaviour. Why?

edit: I am using gcc 4.6.3

edit: This is a bug in gcc and was submitted as Bug 54017 to the gcc developers.

7
  • 3
    I can confirm that I get the same behaviour when I compile and run this code. Commented Jul 18, 2012 at 13:43
  • Works OK with my gcc-4.6, opencc-5.0 and suncc 12.3. I would guess a compiler bug in your older gcc. What version? What platform? Commented Jul 18, 2012 at 14:01
  • 1
    What does the assembly output look like for both examples (gcc -S)? Commented Jul 18, 2012 at 14:35
  • 1
    This is a bug in GCC. It is even present in 4.7.0. Please, do good to the society and report the error to the GCC people. Commented Jul 18, 2012 at 15:29
  • 1
    Rejoice as the upcoming GCC 4.7.2 would be able to compile this correctly :) Commented Jul 20, 2012 at 9:21

2 Answers 2

8

This is apparently a bug in GCC. GCC implements OpenMP sections using the GOMP_sections_start() routine from libgomp that returns a 1-based section ID that the calling thread should execute or 0 if all work items have been distributed. Basically the transformed code should look like:

main._omp_fn.0 (void * .omp_data_i)
{
   unsigned int .section.1;

   .section.1 = GOMP_sections_start(2);
L0:
   switch (.section.1)
   {
      case 0:
         // No more sections to run, exit
         goto L2;
      case 1:
         // Do section 1
         while (1) {}
         goto L1;
      case 2:
         // Do section 2
         while (1) {}
         goto L1;
      default:
         // Impossible section value, possible error in libgomp
         __builtin_trap();
   }
L1:
   .section.1 = GOMP_sections_next();
   goto L0;
L2:
   GOMP_sections_end_nowait();
   return;
}

What happens is that in your case the both the default and the 0 case lead to __builtin_trap(). __builtin_trap() is a GCC built-in that is supposed to terminate your program abnormally and on x86 it emits the ud2 instruction that makes the CPU to bark with an illegal opcode exception. It is usually put in places where code should never execute, e.g. all possible correct return values from GOMP_sections_start() and GOMP_sections_next() should be covered by the cases in the switch and if the default is reached (signalling a possible bug in libgomp) it should fail and you will complain to the developers :)

Edit: This is definitely not expected OpenMP behaviour and it does not happen with icc or suncc. I have submitted Bug 54017 to the GCC Bugzilla.

Edit 2: I updated the text to more closely reflect what GCC should produce. It looks like GCC is getting wrong impression of the control flow in the parallel region and does some "optimisations" that further spoil code generation.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, I was in the train without Internet.
4

SIGILL generated, because there is an illegal instruction, ud2/ud2a. According to http://asm.inightmare.org/opcodelst/index.php?op=UD2:

This instruction caused #UD. Intel guaranteed that in future Intel's CPUs this instruction will caused #UD. Of course all previous CPUs (186+) caused #UD on this opcode. This instruction used by software writers for testing #UD exception servise routine.

Let's look inside:

$ gcc-4.6.2 -fopenmp omp.c -o omp
$ gdb ./omp
...

(gdb) r
Program received signal SIGILL, Illegal instruction.
...
0x08048544 in main._omp_fn.0 ()
(gdb) x/i $pc
0x8048544 <main._omp_fn.0+28>:  ud2a

(gdb) disassemble
Dump of assembler code for function main._omp_fn.0:
0x08048528 <main._omp_fn.0+0>:  push   %ebp
0x08048529 <main._omp_fn.0+1>:  mov    %esp,%ebp
0x0804852b <main._omp_fn.0+3>:  sub    $0x18,%esp
0x0804852e <main._omp_fn.0+6>:  movl   $0x2,(%esp)
0x08048535 <main._omp_fn.0+13>: call   0x80483f0 <GOMP_sections_start@plt>
0x0804853a <main._omp_fn.0+18>: cmp    $0x1,%eax
0x0804853d <main._omp_fn.0+21>: je     0x8048548 <main._omp_fn.0+32>
0x0804853f <main._omp_fn.0+23>: cmp    $0x2,%eax
0x08048542 <main._omp_fn.0+26>: je     0x8048546 <main._omp_fn.0+30>
0x08048544 <main._omp_fn.0+28>: ud2a
0x08048546 <main._omp_fn.0+30>: jmp    0x8048546 <main._omp_fn.0+30>
0x08048548 <main._omp_fn.0+32>: jmp    0x8048548 <main._omp_fn.0+32>
End of assembler dump.

There is ud2a in assembler file already:

$ gcc-4.6.2 -fopenmp omp.c -o omp.S -S; cat omp.S

main._omp_fn.0:
.LFB1:
        pushl   %ebp
.LCFI4:
        movl    %esp, %ebp
.LCFI5:
        subl    $24, %esp
.LCFI6:
        movl    $2, (%esp)
        call    GOMP_sections_start
        cmpl    $1, %eax
        je      .L4
        cmpl    $2, %eax
        je      .L5
                .value  0x0b0f

.value 0xb0f is code of ud2a

After verifying that ud2a was inserted by intention of gcc (at early openmp phases), I tried to understand the code. The function main._omp_fn.0 is the body of parallel code; it will call _GOMP_sections_start and parse its return code. If code equal to 1 then we will jump to one infinite loop; if it is 2, jump to second infinite loop. But in other case ud2a will be executed. (Don't know why, but according to Hristo Iliev this is a GCC Bug 54017.)

I think, this test is good to check how much CPU cores there are. By default GCC's openmp library (libgomp) will start a thread for every CPU core in your system (in my case there were 4 threads). And sections will be selected in order: first section for first thread, second section - 2nd thread and so on.

There is no SIGILL, if I run the program on 1 or 2 CPUs (option of taskset is the cpu mask in hex):

 $ taskset 3 ./omp
 ... running on cpu0 and cpu1 ...
 $ taskset 1 ./omp
 ... running first loop on cpu0; then run second loop on cpu0...

9 Comments

I am not too strong in asm ... what does that tell us?
ud2 is not an illegal instruction. It is perfectly legal x86 instruction that artificually raises the #UD (invalid opcode) exception. It is emitted by the __builtin_trap built-in.
@steffen, please update the page and reread the answer; it was updated
I have an i7, which is a quadcore. I get sigill for OMP_NUM_THREADS greater than the number of sections/infinite loops. So the signal is raised when a thread is left without work. Is that it? Why would gcc do that?
"(Don't know why, but according to Hristo Iliev this is a GCC Bug 54017.)" - read the bug report and you'll know why.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.