I am trying to implement a simple parallel reduction using the code from the CUDA SDK. There is a problem in my kernel: the shared array does not receive the values from the global array, and it is all zeroes.

extern __shared__ float4 sdata[];

// each thread loads one element from global to shared mem

unsigned int tid = threadIdx.x;
unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

sdata[tid] = dev_src[i];

__syncthreads();

// do reduction in shared mem

for(unsigned int s = 1; s < blockDim.x; s *= 2) {
    if(tid % (2*s) == 0){
        sdata[tid] += sdata[tid + s];
    }
    __syncthreads();
}

// write result for this block to global mem
if(tid == 0)
    out[blockIdx.x] = sdata[0];

Edit:

OK, I got it working by removing the extern keyword and giving the shared array a constant size such as 512, so I am in good shape now. Can someone explain why it did not work with the extern keyword?

  • So dev_src has the correct values, but sdata is somehow not getting them. Commented Feb 12, 2012 at 0:51
  • Don't add updates to the comments. Update your question. Commented Feb 12, 2012 at 1:14

1 Answer

I think I know why this is happening, as I have run into it before. How are you launching the kernel?

Remember that in the launch kernel<<<blocks,threads,sharedMemory>>>, sharedMemory is the size of the dynamically allocated shared memory in bytes, not the number of elements. So if the kernel needs room for 512 float4 elements, the third parameter should be 512 * sizeof(float4). I suspect you are launching it as below, which reserves only 512 bytes:

kernel<<<blocks,threads,512>>>   // this is wrong: 512 bytes, not 512 float4 elements
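For comparison, here is a minimal sketch of a launch that sizes the shared array in bytes. It assumes a kernel wrapping the code from the question; the names reduceKernel, add4 and sumX are made up for illustration and are not from the SDK sample. It also assumes the block size is a power of two and that the input length is a multiple of it. Since float4 has no built-in +=, the sketch adds the components explicitly.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: float4 has no built-in operator+, so add component-wise.
__device__ float4 add4(float4 a, float4 b) {
    return make_float4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
}

// Kernel body taken from the question, wrapped in an assumed signature.
__global__ void reduceKernel(const float4* dev_src, float4* out) {
    // Unsized array: its size comes from the third launch parameter.
    extern __shared__ float4 sdata[];

    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Each thread loads one element from global to shared memory.
    sdata[tid] = dev_src[i];
    __syncthreads();

    // Interleaved-addressing reduction in shared memory.
    for (unsigned int s = 1; s < blockDim.x; s *= 2) {
        if (tid % (2 * s) == 0) {
            sdata[tid] = add4(sdata[tid], sdata[tid + s]);
        }
        __syncthreads();
    }

    // Thread 0 writes this block's partial result.
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Hypothetical host wrapper: returns the sum of all .x components.
// Assumes threads is a power of two and h_src.size() is a multiple of it.
float sumX(const std::vector<float4>& h_src, int threads) {
    int n = static_cast<int>(h_src.size());
    assert(n > 0 && n % threads == 0);
    int blocks = n / threads;

    float4* dev_src = nullptr;
    float4* dev_out = nullptr;
    cudaMalloc(&dev_src, n * sizeof(float4));
    cudaMalloc(&dev_out, blocks * sizeof(float4));
    cudaMemcpy(dev_src, h_src.data(), n * sizeof(float4), cudaMemcpyHostToDevice);

    // Size in BYTES: one float4 per thread in the block.
    size_t sharedBytes = threads * sizeof(float4);
    reduceKernel<<<blocks, threads, sharedBytes>>>(dev_src, dev_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    std::vector<float4> h_out(blocks);
    cudaMemcpy(h_out.data(), dev_out, blocks * sizeof(float4), cudaMemcpyDeviceToHost);
    cudaFree(dev_src);
    cudaFree(dev_out);

    // Finish the reduction over the per-block partial sums on the host.
    float total = 0.0f;
    for (const float4& v : h_out) total += v.x;
    return total;
}
```

Calling sumX on a vector of 1024 elements with 256 threads launches 4 blocks, each with 256 * sizeof(float4) = 4096 bytes of shared memory. Error checking on the CUDA calls is left out to keep the sketch short.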

1 Comment

To clarify this answer, since it is mostly correct: if you declare a shared variable as `extern __shared__`, then you must specify the amount of shared memory (in bytes) to allocate at kernel launch using the third parameter of the execution configuration (the <<<>>>), as Programmer describes.
