CUDA - Generating a random array on the GPU and modifying it with a kernel

Question

In this code I'm generating a 1D array of floats on the GPU using CUDA. The numbers are between 0 and 1. For my purposes I need them to be between -1 and 1, so I wrote a simple kernel that multiplies each element by 2 and then subtracts 1 from it. However, something is going wrong here. When I write my original array to a .bmp I get this: https://i.sstatic.net/tpdhn.png (a typical noise pattern). But when I try to modify that array with my kernel I get a blank black picture: https://i.sstatic.net/tisIp.jpg. The program runs, but in the debug output I get this:

First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe: Microsoft C++ exception: cudaError_enum at memory location 0x003cfacc..

First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe: Microsoft C++ exception: cudaError_enum at memory location 0x003cfb08..

First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe: Microsoft C++ exception: [rethrow] at memory location 0x00000000..

I would be thankful for any help or even a little hint in this matter. Thanks!

#include <device_functions.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include "stdafx.h"
#include "EasyBMP.h"
#include <curand.h> //curand.lib must be added in project properties > linker > input
#include "device_launch_parameters.h"

float *heightMap_cpu;
float *randomArray_gpu;
int randCount = 0;
int rozmer = 513;

void createRandoms(int size){
    curandGenerator_t generator;
    cudaMalloc((void**)&randomArray_gpu, size*size*sizeof(float));
    curandCreateGenerator(&generator,CURAND_RNG_PSEUDO_XORWOW);
    curandSetPseudoRandomGeneratorSeed(generator,(int)time(NULL));
    curandGenerateUniform(generator,randomArray_gpu,size*size);
}

__global__ void polarizeRandoms(int size, float *randomArray_gpu){
    int index = threadIdx.x + blockDim.x * blockIdx.x;
    if(index<size*size){
        randomArray_gpu[index] = randomArray_gpu[index]*2.0f - 1.0f;
    }
}

//helper function for getting the 1D address from 2D coords
int ad(int x,int y){
    return x*rozmer+y;
}

void printBmp(){
    BMP AnImage;
    AnImage.SetSize(rozmer,rozmer);
    AnImage.SetBitDepth(24);
    int i,j;
    for(i=0;i<=rozmer-1;i++){
        for(j=0;j<=rozmer-1;j++){
            AnImage(i,j)->Red = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Green = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Blue = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Alpha = 0;
        }
    }
    AnImage.WriteToFile("HeightMap.bmp");
}

int main(){
    createRandoms(rozmer);
    polarizeRandoms<<<((rozmer*rozmer)/1024)+1,1024>>>(rozmer,randomArray_gpu);
    heightMap_cpu = (float*)malloc((rozmer*rozmer)*sizeof(float));
    cudaMemcpy(heightMap_cpu,randomArray_gpu,rozmer*rozmer*sizeof(float),cudaMemcpyDeviceToHost);
    printBmp();

    //cleanup
    cudaFree(randomArray_gpu);
    free(heightMap_cpu);
    return 0;
}

You might have some issues with float/integer conversion, try replacing the computation in your kernel with: randomArray_gpu[index] = randomArray_gpu[index]*2.0f - 1.0f;
– Reguj, Commented Sep 10, 2013 at 20:28
Because int is promoted to float before * and +, see Implicit type conversion rules in C++ operators.
– Vitality, Commented Sep 10, 2013 at 20:38
thx Jack, but it is not it. Even when I leave the line written like this: randomArray_gpu[index] = randomArray_gpu[index]; which should just leave the array as it was, the result is the same blank black (all zeroes).
– Tomus, Commented Sep 10, 2013 at 21:26
@TomášTomusJavorský: Please do not "fix" code in your questions. The whole point of Stack Overflow is to leave a question and its answer for the next person who comes along. By "fixing" code, you are effectively destroying the question and making the answers you have received invalid. I have rolled back your edits and corrected the title as you wanted. Please don't edit the code again.
– talonmies, Commented Sep 11, 2013 at 11:43

Community · Accepted Answer · 2017-05-23 11:50:10Z

3

This is wrong:

cudaMalloc((void**)&randomArray_gpu, size*size*sizeof(float));

We don't use cudaMalloc with __device__ variables. If you do proper CUDA error checking, I'm pretty sure that line will throw an error.

If you really want to use a __device__ pointer this way, you need to create a separate, ordinary pointer, cudaMalloc that, and then copy the pointer value to the __device__ pointer using cudaMemcpyToSymbol:

float *my_dev_pointer;
cudaMalloc((void**)&my_dev_pointer, size*size*sizeof(float));
cudaMemcpyToSymbol(randomArray_gpu, &my_dev_pointer, sizeof(float *));

Whenever you are having trouble with your CUDA programs, you should do proper CUDA error checking: test the return value of every runtime API call and check for errors after every kernel launch. It will likely focus your attention on what is wrong.
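As a rough illustration of what error checking can look like, here is a minimal sketch. The macro name CUDA_CHECK, the helper blocksFor, and the kernel polarize are hypothetical names chosen for this example and are not part of the CUDA API; only cudaGetErrorString, cudaGetLastError, cudaDeviceSynchronize and the other cuda* calls come from the runtime.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with file/line info if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__, cudaGetErrorString(err_));    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: number of blocks needed to cover `count` elements.
int blocksFor(int count, int threads) {
    return (count + threads - 1) / threads;
}

// Maps values from [0, 1] to [-1, 1]; the pointer is passed as a parameter.
__global__ void polarize(float *data, int count) {
    int index = threadIdx.x + blockDim.x * blockIdx.x;
    if (index < count) {
        data[index] = data[index] * 2.0f - 1.0f;
    }
}

int main() {
    const int count = 513 * 513;
    const int threads = 256;
    float *data_gpu = NULL;

    CUDA_CHECK(cudaMalloc((void **)&data_gpu, count * sizeof(float)));
    CUDA_CHECK(cudaMemset(data_gpu, 0, count * sizeof(float)));

    polarize<<<blocksFor(count, threads), threads>>>(data_gpu, count);
    CUDA_CHECK(cudaGetLastError());       // reports launch configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    CUDA_CHECK(cudaFree(data_gpu));
    return 0;
}
```

Checking both cudaGetLastError and cudaDeviceSynchronize after the launch is a deliberate choice here: a kernel launch is asynchronous, so a failure inside the kernel only surfaces on a later synchronizing call.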

And, yes, kernels can access __device__ variables without the variable being passed explicitly as a parameter to the kernel.
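To make that concrete, here is a small sketch that combines the cudaMemcpyToSymbol approach with a kernel that reads the __device__ pointer directly. It assumes randomArray_gpu is declared with __device__, as discussed above; the function polarizeOnGpu and the sample input values are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-side global pointer (assumed declaration for this sketch).
__device__ float *randomArray_gpu;

// No pointer parameter: the kernel uses the __device__ variable directly.
__global__ void polarizeRandoms(int count) {
    int index = threadIdx.x + blockDim.x * blockIdx.x;
    if (index < count) {
        randomArray_gpu[index] = randomArray_gpu[index] * 2.0f - 1.0f;
    }
}

// Hypothetical helper: rescales `host` in place on the GPU.
// Returns true only if every CUDA call succeeded.
bool polarizeOnGpu(float *host, int count) {
    float *my_dev_pointer = NULL;  // ordinary pointer that cudaMalloc can fill in
    size_t bytes = count * sizeof(float);

    bool ok = cudaMalloc((void **)&my_dev_pointer, bytes) == cudaSuccess;
    ok = ok && cudaMemcpy(my_dev_pointer, host, bytes,
                          cudaMemcpyHostToDevice) == cudaSuccess;
    // Copy the pointer value itself into the __device__ variable.
    ok = ok && cudaMemcpyToSymbol(randomArray_gpu, &my_dev_pointer,
                                  sizeof(float *)) == cudaSuccess;
    if (ok) {
        int threads = 256;
        int blocks = (count + threads - 1) / threads;
        polarizeRandoms<<<blocks, threads>>>(count);
        ok = cudaGetLastError() == cudaSuccess;
    }
    // This blocking copy also waits for the kernel to finish.
    ok = ok && cudaMemcpy(host, my_dev_pointer, bytes,
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(my_dev_pointer);  // freeing a NULL pointer is a no-op
    return ok;
}

int main() {
    float values[4] = {0.0f, 0.25f, 0.5f, 1.0f};
    if (!polarizeOnGpu(values, 4)) {
        fprintf(stderr, "a CUDA call failed\n");
        return 1;
    }
    for (int i = 0; i < 4; ++i) {
        printf("%.2f\n", values[i]);
    }
    return 0;
}
```

Passing the pointer as a kernel argument, as the kernel in the question does, is usually the simpler route; the __device__ pointer is only worth the extra copy if several kernels need to share it.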

The CUDA programming guide covers the proper usage of __device__ variables and the API functions that should be used to access them from the host.
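For example, a scalar __device__ variable can be written and read back from the host with cudaMemcpyToSymbol and cudaMemcpyFromSymbol. This is only a sketch: counter_gpu, incrementCounter and roundTrip are hypothetical names invented for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical scalar living in device global memory.
__device__ int counter_gpu;

// Single-thread kernel that modifies the __device__ variable in place.
__global__ void incrementCounter() {
    counter_gpu += 1;
}

// Writes `value` to the device variable, runs the kernel, reads the result back.
// Returns -1 if either copy fails.
int roundTrip(int value) {
    int result = -1;
    if (cudaMemcpyToSymbol(counter_gpu, &value, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    incrementCounter<<<1, 1>>>();
    // This blocking copy waits for the kernel to finish before reading.
    if (cudaMemcpyFromSymbol(&result, counter_gpu, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    return result;
}

int main() {
    printf("counter after kernel: %d\n", roundTrip(41));
    return 0;
}
```

The host never dereferences or assigns counter_gpu directly; every access goes through the symbol copy functions.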

edited May 23, 2017 at 11:50

CommunityBot

answered Sep 11, 2013 at 5:03

Robert Crovella

2 Comments

Tomus Over a year ago

thx, again, fixed now. And yes, this was also in the debug output.

Vitality Over a year ago

Definitely right. It was too late yesterday to have enough neurons working to realize that there was a __device__ keyword in front of the pointer declaration :-)
