OpenCL data dependency between kernels

Question

I need to write an OpenCL program for reducing a large buffer (several million floats) into a single float. For the simplicity of the question I will suppose here that I need to compute the sum of all floats.

So I have written a kernel which takes a float buffer as input, and sums it by packets of 64. It writes the result to a buffer which is 64 times smaller. I then iterate the call of this kernel until the data is small enough to be copied back on the host and summed by the CPU.

I'm new to OpenCL, do I need to have a barrier between each kernel so that they are run sequentially, or is OpenCL smart enough to detect that the nth kernel pass is writing to an output buffer used as the input buffer of the n+1th kernel?

Or is there a smarter approach?

jprice · Accepted Answer · 2018-01-19 09:36:36Z

5

If you are using a single, in-order command queue for all of your kernel launches (i.e. you do not use the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property), then each kernel invocation will run to completion before the next begins - you do not need any explicit barriers to enforce this behaviour.

If you are using an out-of-order command queue or multiple queues, you can enforce data dependencies via the use of OpenCL events. Each call to clEnqueueNDRangeKernel can optionally return an event object, which can be passed to subsequent commands as dependencies.

answered Jan 19, 2018 at 9:36

jprice

9,9451 gold badge30 silver badges32 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

OpenCL data dependency between kernels

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related