I have an application which is embarrassingly parallel. Is it possible to launch multiple CPU threads so that each thread manages one GPU? If it is possible, what threading library should I use on the CPU side? OpenMP? Pthreads?
1 Answer
It is possible, but since CUDA 4.0 was released, it has been unnecessary. The CUDA API is now thread safe, so you can asynchronously manage multiple devices from a single host thread.
If you really want to use multiple host threads, just about any host threading library will do. I have successfully used pthreads, boost::thread, and Apple's Grand Central Dispatch with CUDA on Linux and OS X.
5 Comments
username_4567
So is it possible to manage the i-th CUDA device with the i-th CPU thread, so that the i-th thread sends the i-th chunk of my data to the i-th device?
talonmies
Yes, nothing special is required. The only thing you need to be careful about is making sure each thread gets a unique GPU. For that, either have a master thread enumerate all the devices and assign a device ID to each worker thread, or use the compute exclusive settings in the TCC/Linux driver and let the driver automagically assign a device to each thread.
username_4567
Can you please elaborate on the compute exclusive settings in TCC?
talonmies
You can use the -c option in nvidia-smi to set the GPU to compute exclusive mode, although I seem to remember that was the default anyway. There is/was a TCC driver whitepaper which explains everything about the driver and its settings.
winterlight
What happens if each host thread does not get a unique GPU? Does it result in crash/undefined behavior?
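For reference, the compute exclusive mode discussed above is set with nvidia-smi; a sketch (assumptions: administrator privileges, and the mode numbers from the nvidia-smi documentation, where the old per-thread exclusive mode 1 has since been removed):

```shell
# Set GPU 0 to EXCLUSIVE_PROCESS compute mode (one process per GPU).
# Mode values: 0 = DEFAULT, 2 = PROHIBITED, 3 = EXCLUSIVE_PROCESS.
nvidia-smi -i 0 -c 3

# Verify the current compute mode of GPU 0.
nvidia-smi -q -i 0 | grep "Compute Mode"
```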