Passing char array to CUDA Kernel

Question

I am trying to pass a char array containing 10000 words, read from a txt file in the main function, to a CUDA kernel function.

The words are transferred from the host to device like this:

(main function code:)

//.....
     const int text_length = 20;

     char (*wordList)[text_length] = new char[10000][text_length];
     char *dev_wordList;

     for(int i=0; i<number_of_words; i++)
     {
         file>>wordList[i];
         cout<<wordList[i]<<endl;
     }

     cudaMalloc((void**)&dev_wordList, text_length * number_of_words * sizeof(char));
     cudaMemcpy(dev_wordList, &(wordList[0][0]), text_length * number_of_words * sizeof(char), cudaMemcpyHostToDevice);

    //Setup execution parameters
    int n_blocks = (number_of_words + 255)/256;
    int threads_per_block = 256;


    dim3 grid(n_blocks, 1, 1);
    dim3 threads(threads_per_block, 1, 1);


    cudaPrintfInit();
    testKernel<<<grid, threads>>>(dev_wordList);
    cudaDeviceSynchronize();
    cudaPrintfDisplay(stdout,true);
    cudaPrintfEnd();

(kernel function code:)

__global__ void testKernel(char* d_wordList)
{
    //access thread id
    const unsigned int bid = blockIdx.x;
    const unsigned int tid = threadIdx.x;
    const unsigned int index = bid * blockDim.x + tid;

    cuPrintf("!! %c%c%c%c%c%c%c%c%c%c \n" , d_wordList[index * 20 + 0],
                                            d_wordList[index * 20 + 1],
                                            d_wordList[index * 20 + 2],
                                            d_wordList[index * 20 + 3],
                                            d_wordList[index * 20 + 4],
                                            d_wordList[index * 20 + 5],
                                            d_wordList[index * 20 + 6],
                                            d_wordList[index * 20 + 7],
                                            d_wordList[index * 20 + 8],
                                            d_wordList[index * 20 + 9]);
}

Is there an easier way to manipulate them? (I would like to have one word per element/position.) I tried <string>, but I can't use it in CUDA device code.

Ashalynd · Accepted Answer · 2014-09-24 08:39:19Z

cuPrintf("%s\n", d_wordList + (index * 20));

This should work, provided your strings are zero-terminated.
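The pointer arithmetic in that call can be tried out on the host first. The following is a minimal C++ sketch, not code from the original post: `word_at` and `slot_is_terminated` are hypothetical helper names, and the stride of 4 in the usage note is chosen only to keep the example short. It shows how `base + index * stride` selects one word, and how to check that a slot really contains a terminator before printing it with `%s`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: pointer to word number `index` in a flat buffer
// where every word occupies exactly `stride` bytes.
const char* word_at(const char* flat, std::size_t index, std::size_t stride) {
    return flat + index * stride;
}

// Hypothetical helper: true if the slot of word `index` contains a '\0'
// somewhere within its `stride` bytes. If it does not, printing the slot
// with "%s" would run on into the following words.
bool slot_is_terminated(const char* flat, std::size_t index, std::size_t stride) {
    return std::memchr(word_at(flat, index, stride), '\0', stride) != nullptr;
}
```

For a buffer holding "cat\0", "dogs" (no terminator) and "ox\0\0" with a stride of 4, `word_at(flat, 2, 4)` points at "ox", and `slot_is_terminated(flat, 1, 4)` is false, which is the situation that produces trailing garbage when printed.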

Update:

This line:

char (*wordList)[text_length] = new char[10000][text_length];

looks unusual to me, although it is valid C++ and allocates one contiguous block. More commonly, an array of pointers to char would be allocated like this:

char** wordList = new char*[10000];
for (int i=0;i<10000;i++) wordList[i] = new char[20];

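In this case, wordList[i] would be a pointer to string number i. Note that each string is then a separate allocation, so the words can no longer be copied to the device with a single cudaMemcpy.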
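Whether the rows form one contiguous block matters for the copy to the device. As a rough host-only illustration, the sketch below uses `std::memcpy` as a stand-in for `cudaMemcpy`; `stage_rows` and `kTextLength` are hypothetical names introduced here, not part of the original code. It relies on the pointer-to-array form `new char[count][kTextLength]` being a single allocation, which is what makes a single copy of `count * kTextLength` bytes valid. With an array of separately allocated pointers, one copy per word would be needed instead.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kTextLength = 20;  // mirrors text_length in the question

// Hypothetical stand-in for the cudaMemcpy call: copies `count` rows of a
// pointer-to-array allocation with one memcpy. This is only valid because
// new char[count][kTextLength] yields a single contiguous block.
std::vector<char> stage_rows(char (*rows)[kTextLength], std::size_t count) {
    std::vector<char> staged(count * kTextLength);
    if (count > 0) {
        std::memcpy(staged.data(), rows, count * sizeof(rows[0]));
    }
    return staged;
}
```

After staging, word number i starts at `staged.data() + i * kTextLength`, the same offset the kernel computes with `index * 20`.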

Update #2:

If you need to store your strings as a consecutive block, and you are sure that none of your strings is longer than text_length-1 characters, then you can do it like this:

char *wordList = new char[10000*text_length];

for(int i=0; i<number_of_words; i++)
     {
         file>>wordList+(i*text_length);
         cout<<wordList+(i*text_length)<<endl;
     }

In that case, wordList + (i*text_length) points to the beginning of string number i. Each string is zero-terminated, because operator>> appends the terminator when reading from the file, so you can print it out in the way specified in this answer. If any of your strings is longer than text_length-1 characters, however, it will overflow into the next slot and you will still get issues.

edited Sep 24, 2014 at 8:39

answered Sep 24, 2014 at 7:58

11 Comments

Alex Iacob Over a year ago

I tried but I get a sequence of strange characters.

Ashalynd Over a year ago

The way you copy your strings in the code, it looks like they are not zero-terminated (you allocate 20 symbols per string and use memcpy instead of strcpy). Would it be possible to allocate 21 symbols per string and add '\0' after each string?

Alex Iacob Over a year ago

Yes, but I'm not sure how to do it. It would be better if the terminator were after each string.

Ashalynd Over a year ago

Are all your words 20 symbols long, or do you just estimate them to be?

Alex Iacob Over a year ago

The transfer of strings from host to device is correct; the problem is with the cuPrintf call. I tried to output a string declared in the kernel function, and it is only printed if it's declared as const char*. So I changed the kernel function declaration from __global__ void testKernel(char* d_wordList) to __global__ void testKernel(const char* d_wordList), and now it works. Thanks a lot!
