
If multiple threads simultaneously write to a single memory location, there will be a race condition, right? That is exactly what is happening in my case.

Consider this snippet from 'reduce.cl':

int i = get_global_id(0);
int n, j;

n = keyMobj[i];                               // n is the key; it is either 0 or 1
for (j = 0; j < 2; j++)
    sumMobj[n*2 + j] += dataMobj[i].dattr[j]; // summing operation

Here the memory locations sumMobj[0] and sumMobj[1] are written by 4 threads simultaneously, and sumMobj[2] and sumMobj[3] are written by 6 threads simultaneously.

Is there any way to keep this parallel, e.g. using a lock or semaphore? This summation is a very big part of my algorithm.

5 Comments
  • These are the definitions of sumMobj and dataMobj: typedef struct data { double dattr[10]; int d_id; int bestCent; } Data; Data *dataMobj; and double *sumMobj = (double *)malloc(sizeof(double) * 2 * 2); Commented Jan 21, 2013 at 9:08
  • @talonmies It is actually a parallel addition problem in an OpenCL kernel. I just don't know a feasible solution. Commented Jan 21, 2013 at 9:13
  • Why don't you use a barrier if you suspect there is a race condition? For example barrier(CLK_LOCAL_MEM_FENCE); Commented Jan 21, 2013 at 9:27
  • @ocluser I have multiple threads simultaneously writing to a single memory location. Would barrier(CLK_LOCAL_MEM_FENCE); be useful in this case? I have not used it before. Commented Jan 21, 2013 at 11:07
  • Found this page, which explains a method for atomically adding floating-point numbers; however, you would need the cl_khr_int64_base_atomics extension and unions of longs and doubles. Commented Jan 23, 2013 at 16:51

1 Answer


I can give you some hints, as I was facing a similar problem.

I can think of three different methods for achieving this goal.

Consider a simple kernel, assuming you launch 4 threads (IDs 0-3):

__kernel void addition (__global int *p)
{
    int i = get_local_id(0);
    p[4] += p[i];        // race: all four work-items write p[4]
}

You want to add the values p[0], p[1], p[2], p[3], p[4] and store the final sum in p[4], right? That is:

p[4] = p[0] + p[1] + p[2] + p[3] + p[4]

Method 1 (no parallelism)

Assign the whole job to a single thread:

int i = get_local_id(0);
if (i == 0)                      // only work-item 0 does the work
{
    for (int k = 0; k < 4; k++)  // sum p[0..3] into p[4]
        p[4] += p[k];
}

Method 2 (with parallelism)

Express your problem as follows:

p[4] = p[0] + p[1] + p[2] + p[3] + p[4] + 0

This is a reduction problem.

So launch 3 threads, i = 0 to i = 2. In the first iteration:

 i=0 finds p[0] + p[1]
 i=1 finds p[2] + p[3]  
 i=2 finds p[4] + 0

Now you have three numbers; apply the same logic again to add them (with suitable padding of zeros to reach a power of two), until a single sum remains.

Method 3 (atomic operations)

If you still need the concurrent writes, you can use atomic_add():

  int atomic_add (volatile __global int *p, int val)

Description

Reads the 32-bit value (referred to as old) stored at the location pointed to by p, computes (old + val), and stores the result at the location pointed to by p. The function returns old.

This assumes the data is of int type. Otherwise, see the link suggested in the comments above.


2 Comments

So if I am using floating-point operations, I have no option other than the reduction method, since there is no extension for floating-point atomic operations (as far as I know). Am I right?
Apart from reduction, you can also see the link posted by @Slicedpan.
