Translating C code to OpenCL

Question

I am trying to translate a small C program into OpenCL. I am supposed to transfer the input data to the GPU and then perform all calculations on the device using successive kernel calls.

However, some parts of the code are not suitable for parallelization, and because of the amount of data involved I must avoid transferring it back and forth between the CPU and the GPU.

Is there a way to execute some kernels without parallel processing, so I can use them for these parts of the code? Is this achieved by setting the global work size to 1?

Pragmateek · Accepted Answer · 2012-12-13 15:09:27Z


You could manage two devices:

the GPU for highly parallel code
the CPU for sequential code

This is a bit complex, as you must manage one command queue per device to schedule each kernel on the appropriate device.

If the devices are part of the same platform (typically AMD), you can use the same context; otherwise, you will have to create a separate context for the CPU.

Moreover, if you want more fine-grained task parallelism on the CPU, you could use device fission, if your CPU supports it.

edited Dec 13, 2012 at 15:09

answered Dec 13, 2012 at 14:47


2 Comments

user1894442 Over a year ago

Thanks for your answer. I'm new to OpenCL, so forgive me if this is a stupid question, but if I have two contexts (Intel CPU and NVIDIA GPU), doesn't that mean that I still have to send data back and forth?

Pragmateek Over a year ago

You could use pinned memory with memory mapping: the CPU will have direct access to the memory, and the GPU may copy the necessary parts to its global memory. If you have a lot of data movement compared to processing, you may have to adapt your algorithm. In the ideal case, you'll be able to run the CPU and GPU concurrently.

Kyle Lutz · Accepted Answer · 2014-07-14 02:10:37Z


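Yes, you can execute code serially on OpenCL devices. To do this, write your kernel code the same way you would in C and then execute it with the clEnqueueTask() function, which runs the kernel as a single work-item (equivalent to clEnqueueNDRangeKernel() with a global and local work size of 1).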
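As a rough illustration of such a serial kernel, here is a minimal sketch. The function name `running_sum_serial`, the `KERNEL`/`GLOBAL` macros, and the running-sum step itself are hypothetical and not taken from the question. The macros expand to the OpenCL qualifiers when the source is built by an OpenCL compiler and to nothing otherwise, so the same loop can also be compiled and checked as plain C on the host.

```c
/* Assumption: KERNEL and GLOBAL are illustrative helper macros, not part of
   OpenCL. An OpenCL C compiler defines __OPENCL_VERSION__, so the qualifiers
   are only emitted there; a host C compiler sees an ordinary function. */
#ifdef __OPENCL_VERSION__
#define KERNEL __kernel
#define GLOBAL __global
#else
#define KERNEL
#define GLOBAL
#endif

/* Hypothetical sequential step: an in-place running sum. Each element depends
   on the previous one, so the loop is written exactly as it would be in C and
   is meant to run in a single work-item. */
KERNEL void running_sum_serial(GLOBAL float *data, const unsigned int n)
{
    for (unsigned int i = 1; i < n; ++i) {
        data[i] += data[i - 1];
    }
}

/* On the host, with `queue` and `kernel` assumed to be already created and
 * the buffer already on the device from an earlier kernel call, the launch
 * would look like:
 *
 *     clEnqueueTask(queue, kernel, 0, NULL, NULL);
 */
```

Because the buffer is already on the device, a step like this can sit between two parallel kernels in the same command queue without copying the data back to the host. A single work-item on a GPU is slow, so this probably only pays off when the serial part is small compared to the transfer it avoids.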

edited Jul 14, 2014 at 2:10

answered Dec 13, 2012 at 14:44

