OpenMP parallelisation of pi calculation is either slow or wrong

Question

I'm having trouble parallelising my monte carlo method to calculate pi. Here is the parallelised for-loop:

#pragma omp parallel for private(i,x,y) schedule(static) reduction(+:count)
  for (i = 0; i < points; i++) {
    x = rand()/(RAND_MAX+1.0)*2 - 1.0;
    y = rand()/(RAND_MAX+1.0)*2 - 1.0;

    // Check if point lies in circle
    if(x*x + y*y < 1.0) { count++; }
  }

The problem is, it underestimates pi if I use schedule(static), and its slower than the serial implementation if I use schedule(dynamic). What am I doing wrong? I've tried other ways to fix it (like this: Using OpenMP to calculate the value of PI) but it's still much slower than the serial implementation.

Thanks in advance

@Mystical: looks like it isn't: stackoverflow.com/questions/6161322/… — ev-br
– ev-br, Commented Jan 24, 2012 at 17:25
I suspect @Mysticial has the right idea. rand will normally have an internal "seed", which is basically acting as a shared resource, forcing serialization at every call (or else risking incorrect results). If you have it available, I'd try using rand_r or (preferably) drand48_r instead. Alternatively, consider the random number generation classes introduced in C++11 -- each instance has its own state, which should avoid serialization (but can make initialization tricky -- multiple threads creating identical sequences will do little good). — Jerry Coffin
– Jerry Coffin, Commented Jan 24, 2012 at 17:31

derobert · Accepted Answer · 2012-01-24 17:30:10Z

7

Assuming you're using the C library rand function, that function is not reentrant or thread-safe. POSIX provides a rand_r function, but (to quote the glibc documentation):

POSIX.1 extended the C standard functions to support reproducible random numbers in multi-threaded programs. However, the extension is badly designed and unsuitable for serious work.

In particular, the seed must be an unsigned int, which doesn't have enough bits for a good PRNG. They recommend using the SVID random number functions, of which nrand48_r is probably what you're looking for.

Alternatively, you could use a different library.

answered Jan 24, 2012 at 17:30

derobert

51.4k15 gold badges98 silver badges126 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Eddy Over a year ago

How do I use nrand48_r? It doesn't appear to be in the standard library. I can use nrand48(&seed), where seed is a short unsigned integer. But I still get bad performance if I use a different seed for each thread

Stephan Dollberg · Accepted Answer · 2012-01-24 18:00:29Z

1

One thing you have to take into account when doing such things in parallel, is that there might be different rounding errors, caused due to different ways of doing the calculation.

Example:

((A+B) + (C+D)) where (A+B) and (C+D) will be computed in parallel might vary from the serial approach (((A+B) + C) + D).

edited Jan 24, 2012 at 18:00

answered Jan 24, 2012 at 17:54

Stephan Dollberg

34.9k16 gold badges83 silver badges110 bronze badges

Collectives™ on Stack Overflow

OpenMP parallelisation of pi calculation is either slow or wrong

2 Answers 2

1 Comment

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related