How to invoke CUDA from C#

Question

I've built a program that uses Hybridizer to write CUDA code in C# and call it. The program works, but the overhead of setting up the GPU and calling a function on it is extremely high. For example, a job that took 3000 ticks on the CPU took about 50 million ticks to set up the GPU wrapper and then another 50 million ticks to run on the GPU. I'm trying to figure out whether this lag is due to Hybridizer itself or is simply unavoidable when calling GPU code from a C# program.

So I'm looking for alternative methods. My searches turned up some mentions of P/Invoke, but I can't find a good guide on how to use it, and all of those threads are 9+ years old, so I don't know whether their information is still relevant. I also found ManagedCuda, but it seems to be no longer in development.

Denis Gladkiy · Accepted Answer · 2020-06-25 03:28:28Z

You can try CppSharp to generate C# bindings to CUDA. We were able to initialize CUDA with this approach and call its simple hardware info functions (GetDeviceProperties, CudaSetDevice, CudaGetDeviceCount, CudaDriverGetVersion, CudaRuntimeGetVersion).

Using the other parts of the CUDA API seems to be possible, but we did not try it: CppSharp generated bindings for the whole CUDA runtime API. We use CUDA indirectly via NVIDIA's Flex library. All the Flex functions are usable via CppSharp without considerable penalties.

The example usage of classes generated via CppSharp looks like this:

// Query the driver and runtime versions.
int driverVersion = 0;
CudaRuntimeApi.CudaDriverGetVersion(ref driverVersion);

int runtimeVersion = 0;
CudaRuntimeApi.CudaRuntimeGetVersion(ref runtimeVersion);

// Ask how many CUDA devices are present and stop on failure.
int deviceCount = 0;
var errorCode = CudaRuntimeApi.CudaGetDeviceCount(ref deviceCount);

if (errorCode != CudaError.CudaSuccess)
{
    Console.Error.WriteLine("'cudaGetDeviceCount' returned " + errorCode + ": " + CudaRuntimeApi.CudaGetErrorString(errorCode));
    return;
}

// Fetch the properties of each device; the wrapper is disposed after use.
for (var device = 0; device < deviceCount; ++device)
{
    using (var deviceProperties = new CudaDeviceProp())
    {
        CudaRuntimeApi.CudaGetDeviceProperties(deviceProperties, device);
        // Read the fields of deviceProperties here.
    }
}

CudaRuntimeApi and CudaDeviceProp are the classes generated by CppSharp.
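
Since the question asks about P/Invoke: bindings like these can also be written by hand for a few functions. The following is only a rough sketch, not the code CppSharp generates. The class names CudaRuntimeNative and CudaProbe are made up for illustration, and the library name "cudart" is an assumption: on Windows the runtime DLL has a versioned file name, so the constant would have to be changed to match the installed toolkit. The three native functions are from the CUDA runtime API and return a cudaError_t, where 0 means success.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical hand-written bindings; not the output of CppSharp.
static class CudaRuntimeNative
{
    // Assumption: resolves to libcudart.so on Linux. On Windows the DLL name
    // is versioned, so replace this with the file name shipped by your toolkit.
    private const string CudaRt = "cudart";

    // cudaError_t cudaGetDeviceCount(int* count);
    [DllImport(CudaRt, EntryPoint = "cudaGetDeviceCount")]
    public static extern int GetDeviceCount(out int count);

    // cudaError_t cudaRuntimeGetVersion(int* runtimeVersion);
    [DllImport(CudaRt, EntryPoint = "cudaRuntimeGetVersion")]
    public static extern int RuntimeGetVersion(out int version);

    // cudaError_t cudaDriverGetVersion(int* driverVersion);
    [DllImport(CudaRt, EntryPoint = "cudaDriverGetVersion")]
    public static extern int DriverGetVersion(out int version);
}

static class CudaProbe
{
    // CUDA encodes versions as 1000 * major + 10 * minor.
    public static string FormatVersion(int encoded)
    {
        return (encoded / 1000) + "." + (encoded % 1000 / 10);
    }

    // Prints versions and device count, or a message if the library is missing.
    public static void PrintSummary()
    {
        try
        {
            int err = CudaRuntimeNative.RuntimeGetVersion(out int runtime);
            if (err != 0) { Console.Error.WriteLine("cudaRuntimeGetVersion failed: " + err); return; }

            err = CudaRuntimeNative.DriverGetVersion(out int driver);
            if (err != 0) { Console.Error.WriteLine("cudaDriverGetVersion failed: " + err); return; }

            err = CudaRuntimeNative.GetDeviceCount(out int count);
            if (err != 0) { Console.Error.WriteLine("cudaGetDeviceCount failed: " + err); return; }

            Console.WriteLine("Runtime " + FormatVersion(runtime) + ", driver " + FormatVersion(driver) + ", devices: " + count);
        }
        catch (DllNotFoundException e)
        {
            // Thrown when the native library cannot be located at call time.
            Console.Error.WriteLine("CUDA runtime library not found: " + e.Message);
        }
    }
}
```

Calling CudaProbe.PrintSummary() from Main should be enough to try it. Hand-written declarations like these are manageable for a few functions; for structs such as the device properties, a generator like CppSharp saves a lot of error-prone layout work.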

edited Jun 25, 2020 at 3:28

answered Jun 24, 2020 at 3:09

6 Comments

Robert Crovella Over a year ago

why not provide an example?

Denis Gladkiy Over a year ago

@Robert Crovella, CppSharp is a tool for bindings generation. Are you asking for scripts invoking it? Or examples of generated code? The code initializing CUDA is textually the same as in C++.

Robert Crovella Over a year ago

The C# part. You could simply demonstrate how to run a sample code like deviceQuery from C#. The CUDA code used as an example isn't that important, but it would be nice to see something complete, that works. I provide lots of fully worked examples in my answers, even ones that include things like OpenMP and calling CUDA code from python. Nobody charges you by the word or character to post here, so extreme brevity isn't really an attractive feature in an SO answer, in my opinion. Here is an example of calling CUDA from python using ctypes.

Denis Gladkiy Over a year ago

@Robert Crovella, oh, I see. I'll try to post the code.

Robert Crovella Over a year ago

On SO, I think it's pretty universally agreed that we like code.
