
I'm having trouble using the #pragma omp parallel for directive.

Basically I have several hundred DNA sequences that I want to run against an algorithm called NNLS.

I figured that doing it in parallel would give me a pretty good speedup, so I applied the #pragma directives.

When I run it sequentially there is no issue and the results are fine, but when I run it with #pragma omp parallel for I get a segfault within the algorithm (sometimes at different points).

#pragma omp parallel for
for(int i = 0; i < dir_count; i++ ) {

  int z = 0;
  int w = 0;
  struct dirent *directory_entry;
  char filename[256];

  directory_entry = readdir(input_directory_dh);

  if(strcmp(directory_entry->d_name, "..") == 0 || strcmp(directory_entry->d_name, ".") == 0) {
    continue;
  }

  sprintf(filename, "%s/%s", input_fasta_directory, directory_entry->d_name);

  double *count_matrix = load_count_matrix(filename, width, kmer);

  //normalize_matrix(count_matrix, 1, width)
  for(z = 0; z < width; z++) 
    count_matrix[z] = count_matrix[z] * lambda;

  // output our matrices if we are in debug mode
  printf("running NNLS on %s, %d, %d\n", filename, i, z);
  double *trained_matrix_copy = malloc(sizeof(double) * sequences * width);
  for(w = 0; w < sequences; w++) {
    for(z = 0; z < width; z++) {
      trained_matrix_copy[w*width + z] = trained_matrix[w*width + z];
    }
  } 

  double *solution = nnls(trained_matrix_copy, count_matrix, sequences, width, i);


  normalize_matrix(solution, 1, sequences);
  for(z = 0; z < sequences; z++ )  {
    solutions(i, z) = solution[z]; 
  }

  printf("finished NNLS on %s\n", filename);

  free(solution);
  free(trained_matrix_copy);
}

gdb always exits at a different point in my thread, so I can't figure out what is going wrong.

What I have tried:

  • allocating a copy of each matrix, so that they would not be writing on top of each other
  • using a mixture of private/shared clauses on the #pragma directive
  • using different input sequences
  • writing out my trained_matrix and count_matrix prior to calling NNLS, ensuring that they look OK. (they do!)

I'm sort of out of ideas. Does anyone have some advice?

4 Answers


Solution: make sure not to use static variables in your functions when multithreading (damned f2c translator).
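To illustrate why this matters, here is a minimal sketch with a hypothetical helper (not the actual NNLS code): a function-scope static is a single object shared by every thread, so concurrent calls race on it, while an automatic variable gives each call its own copy. f2c-translated routines are full of statics like this.

/* Hypothetical helper: the static buffer is one object shared by
   every thread, so concurrent calls race on it. */
double unsafe_scale(double x) {
    static double workspace;  /* one copy for the whole program */
    workspace = x * 2.0;      /* thread A writes here ... */
    return workspace;         /* ... thread B may have overwritten it */
}

/* Thread-safe version: automatic storage gives each call, and
   therefore each thread, its own copy. */
double safe_scale(double x) {
    double workspace = x * 2.0;
    return workspace;
}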




Defining "#pragma omp parallel for" is not going to give you what you want. Based on the algorithm you have, you must have a solid plan on which variables are going to shared and which ones going to private among the processors.

Looking at this link should give you a quick start on how to correctly share the work among the threads.

Based on your statement "I get a segfault within the algorithm (sometimes at different points)", I would think there is a race condition between the threads or improper initialization of variables.
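As an example, here is a minimal sketch of explicit data-sharing clauses (the variable names are illustrative, not from the asker's code). Using default(none) forces every outside variable to be listed explicitly, which is a good way to flush out accidental sharing.

#include <stdio.h>

int main(void) {
    int n = 8;
    double scale = 2.0;  /* read-only input: safe to share */
    double out[8];       /* each iteration writes only its own slot */

    /* default(none) makes the compiler reject any variable whose
       sharing you haven't declared explicitly. */
    #pragma omp parallel for default(none) shared(n, scale, out)
    for (int i = 0; i < n; i++) {
        double tmp = i * scale;  /* declared inside the loop: private */
        out[i] = tmp;
    }

    for (int i = 0; i < n; i++)
        printf("out[%d] = %f\n", i, out[i]);
    return 0;
}

The loop variable i and anything declared inside the loop body are automatically private, which matches the comment below.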

1 Comment

From what I understand, variables declared locally are automatically private. Even adding shared(trained_matrix) doesn't solve the problem. Thank you for the quick-sheet, it's really awesome!

The function readdir is not thread-safe. To quote the Linux man page for readdir(3):

The data returned by readdir() may be overwritten by subsequent calls to readdir() for the same directory stream.

Consider putting the calls to readdir inside a critical section. Before leaving the critical section, copy the filename returned from readdir() to a local temporary variable, since the next thread to enter the critical section may overwrite it.

Also consider protecting your output operations with a critical section too, otherwise the output from different threads might be jumbled together.
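A sketch of what that could look like, reusing the asker's dir_count and directory-handle names (the surrounding function is hypothetical):

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around the asker's loop: readdir() is called
   by only one thread at a time, and the name is copied into a
   thread-private buffer before the critical section is left. */
void process_entries(DIR *input_directory_dh, int dir_count) {
    #pragma omp parallel for
    for (int i = 0; i < dir_count; i++) {
        char name[256] = "";

        #pragma omp critical(readdir_lock)
        {
            struct dirent *entry = readdir(input_directory_dh);
            if (entry)
                snprintf(name, sizeof(name), "%s", entry->d_name);
        }

        if (name[0] == '\0' || strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
            continue;

        printf("processing %s\n", name);  /* stand-in for the NNLS work */
    }
}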

1 Comment

gdb does not indicate that the error is within readdir, and the files are getting read properly; the error is actually within the algorithm call.

A very possible reason is the stack limit. As MutantTurkey mentioned, if you have a lot of static variables (like a huge array defined in a subroutine), they may use up your stack.

To solve this, first run ulimit -s to check the stack limit for the process. You can use ulimit -s unlimited to remove it. If the program still crashes, try increasing the stack for the OpenMP threads by setting the OMP_STACKSIZE environment variable to a large value, such as 100MB.

Intel has a discussion at https://software.intel.com/en-us/articles/determining-root-cause-of-sigsegv-or-sigbus-errors with more information on stack and heap memory.
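A minimal sketch that reproduces this failure mode (the array size is arbitrary; pick one above your thread stack limit): the array lives on each thread's stack, so the parallel run can segfault under the default OMP_STACKSIZE even when a serial run, whose stack is governed by ulimit -s, is fine.

#include <omp.h>
#include <stdio.h>

#define N (4 * 1024 * 1024)  /* 4M doubles = 32 MB per thread */

int main(void) {
    #pragma omp parallel
    {
        /* Automatic storage: allocated on each thread's own stack.
           If the thread stack is smaller than 32 MB, this segfaults. */
        double scratch[N];
        scratch[0] = scratch[N - 1] = omp_get_thread_num();
        printf("thread %d survived\n", (int)scratch[0]);
    }
    return 0;
}

Running it with OMP_STACKSIZE=100M (and a raised ulimit -s for the master thread) should make it pass.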

