
Whenever I run my program, which contains these lines:

   char ** gpu_reads;
   HANDLE_ERROR(cudaMalloc((void **)&gpu_reads, inputDim * sizeof(char *)));
   for(i=0; i<inputDim; i++) {
      HANDLE_ERROR(cudaMalloc((void **)&(gpu_reads[i]), (READS_LENGTH + 1) * sizeof(char)));
   }
   for(i=0; i<inputDim; i++) {
      HANDLE_ERROR(cudaMemcpy(gpu_reads[i], reads[i], sizeof(char) * (READS_LENGTH + 1), cudaMemcpyHostToDevice));
   }

The second line returns an "unknown error". I tried different allocations in my program (this was the first one), but none of them worked.

The purpose of those lines is simply to allocate an array of strings. The array length is fixed by the user through the variable inputDim, and every string has the same fixed length.

I tried different versions (i.e. using only 3 pointers, 1 pointer...) but none seems to work.

Any ideas?

The full code is available at my GitHub repository, where many allocations are made.

  • Why do you have the same number of stars on the 4th line as the 2nd in (void **)? Commented Jun 9, 2016 at 18:41
  • CUDA is not C, but C++ based. Commented Jun 9, 2016 at 19:00
  • Allocate (READS_LENGTH * inputDim) bytes in a single chunk and you will never have to struggle with broken loops. Commented Jun 10, 2016 at 6:28
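The single-chunk layout suggested in the last comment can be sketched as follows. This is only an illustration under stated assumptions: `pack_reads` and `upload_flat` are hypothetical helper names, every host string is assumed to hold `length + 1` bytes including its terminator (so the chunk is `(READS_LENGTH + 1) * inputDim` bytes), and read `i` is then found at offset `i * (length + 1)` on the device.

```cuda
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: packs `count` host strings into one contiguous host
// buffer. Read i starts at offset i * (length + 1).
// Assumes every reads[i] holds at least length + 1 bytes.
std::vector<char> pack_reads(char *const *reads, int count, int length)
{
    const size_t stride = (size_t)length + 1;
    std::vector<char> flat(count * stride, '\0');
    for (int i = 0; i < count; i++)
        std::memcpy(&flat[i * stride], reads[i], stride);
    return flat;
}

// Hypothetical helper: uploads the packed buffer with a single cudaMalloc and
// a single cudaMemcpy. Returns the first error encountered, or cudaSuccess.
cudaError_t upload_flat(const std::vector<char> &flat, char **gpu_flat)
{
    cudaError_t err = cudaMalloc((void **)gpu_flat, flat.size());
    if (err != cudaSuccess) return err;
    return cudaMemcpy(*gpu_flat, flat.data(), flat.size(),
                      cudaMemcpyHostToDevice);
}
```

With this layout no pointer table exists on the device at all, so there is nothing for the host to dereference by mistake.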

1 Answer


What you are trying to do cannot work because your code attempts to access, from the host, memory that you have allocated on the device. You cannot access the elements of gpu_reads on the host because it is not a valid host memory allocation.

Do something like this instead:

   char ** gpu_reads;
   // Host-side array that holds the device pointers returned by cudaMalloc
   char ** gpu_reads_h = new char*[inputDim];
   HANDLE_ERROR(cudaMalloc((void **)&gpu_reads, inputDim * sizeof(char *)));
   for(i=0; i<inputDim; i++) {
      HANDLE_ERROR(cudaMalloc((void **)&(gpu_reads_h[i]), (READS_LENGTH + 1) * sizeof(char)));
   }
   for(i=0; i<inputDim; i++) {
      HANDLE_ERROR(cudaMemcpy(gpu_reads_h[i], reads[i], sizeof(char) * (READS_LENGTH + 1), cudaMemcpyHostToDevice));
   }

   // Copy the table of device pointers to the device in one transfer
   HANDLE_ERROR(cudaMemcpy(gpu_reads, gpu_reads_h, inputDim * sizeof(char *), cudaMemcpyHostToDevice));

i.e. build a copy of the eventual device array of pointers in host memory first, then copy it to the device.
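For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch of the same staging pattern. The names `CHECK`, `upload_reads` and `first_chars` are made up for this illustration (`CHECK` merely stands in for the `HANDLE_ERROR` macro, whose definition is not shown in the question), and each host string is assumed to hold `length + 1` bytes.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the question's HANDLE_ERROR macro.
#define CHECK(call)                                                      \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s at %s:%d\n",                        \
                         cudaGetErrorString(err_), __FILE__, __LINE__);  \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Copies `count` host strings of `length` characters (plus terminator) to the
// device and returns a device array of device pointers.
char **upload_reads(char *const *reads, int count, int length)
{
    const size_t bytes = (size_t)(length + 1) * sizeof(char);
    // Host array that holds device pointers; only the host ever indexes it.
    std::vector<char *> staged(count);

    for (int i = 0; i < count; i++) {
        CHECK(cudaMalloc((void **)&staged[i], bytes));
        CHECK(cudaMemcpy(staged[i], reads[i], bytes, cudaMemcpyHostToDevice));
    }

    // Allocate the device-side pointer table and fill it in one transfer.
    char **gpu_reads = NULL;
    CHECK(cudaMalloc((void **)&gpu_reads, count * sizeof(char *)));
    CHECK(cudaMemcpy(gpu_reads, staged.data(), count * sizeof(char *),
                     cudaMemcpyHostToDevice));
    return gpu_reads;
}

// Reads the first character of each string on the device, which only works
// if the pointer table really contains valid device pointers.
__global__ void first_chars(char *const *reads, int count, char *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) out[i] = reads[i][0];
}
```

Launching `first_chars` on the returned table and copying `out` back is a quick way to confirm the pointers are usable in a kernel. Note that the sketch does not keep the host copy of the row pointers, so freeing the rows later would require copying the table back to the host first.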


4 Comments

I tried this code, but I get the same "unknown error". I also tried allocating (READS_LENGTH * inputDim) bytes as @Drop suggested, but that doesn't work either.
@Cordaz: If you are getting that error then you have another problem unrelated to the memory allocation and copying, like a non-functioning CUDA installation, or prior broken code which you haven't shown us. Here is a complete example from your code snippet which works perfectly for me. Try it for yourself
I tried your code and I get the same error. I installed CUDA 7.5 on Ubuntu 16.04. I tried running it with sudo, and it returned this line: modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/4.4.0-22-generic. I already tried to install this package, but after the reboot the system started in low-graphics mode and I had to restore the previous Nvidia driver.
OK so you have a broken CUDA installation. Stack Overflow isn't the correct place to get help with that, I am afraid.
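One way to separate an installation problem from a code problem is to query the runtime before allocating anything. The sketch below is only a suggestion: `check_cuda_install` is a hypothetical helper name, and the exact error a broken setup reports will vary from system to system.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: asks the CUDA runtime for the driver version, the
// runtime version and the number of visible devices, prints them, and
// returns the first error encountered (cudaSuccess if every query worked).
cudaError_t check_cuda_install()
{
    int driver = 0, runtime = 0, devices = 0;
    cudaError_t err = cudaDriverGetVersion(&driver);
    if (err == cudaSuccess) err = cudaRuntimeGetVersion(&runtime);
    if (err == cudaSuccess) err = cudaGetDeviceCount(&devices);
    std::printf("driver %d, runtime %d, devices %d, status: %s\n",
                driver, runtime, devices, cudaGetErrorString(err));
    return err;
}
```

If this already fails, or reports zero devices, the allocation code is unlikely to be the culprit.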
