
I am using OpenCL (Xcode, Intel GPU), and I am trying to implement a kernel that calculates moving averages and deviations. I want to pass several double arrays of varying lengths to the kernel. Is this possible to implement, or do I need to pad smaller arrays with zeroes so all the arrays are the same size?

I am new to OpenCL and GPGPU, so please forgive my ignorance of any nomenclature.

1 Answer


You can pass any buffer to the kernel; the kernel does not need to use all of it. For example, if your kernel reduces a buffer, you can query the number of work items (the number of elements to reduce) at run time with get_global_size(0), and then call the kernel with the proper parameters.

An example (unoptimized):

__kernel void reduce_step(__global float* data)
{
    int id = get_global_id(0);
    int size = get_global_size(0);
    int size2 = size/2;
    int size2p = (size+1)/2;
    if(id<size2) //Only reduce up to size2, the odd element will remain in place
       data[id] += data[id+size2p];
}

Then you can call it like this:

void reduce_me(std::vector<cl_float>& data){
    size_t size = data.size();

    //Copy to a buffer already created, equal or bigger size than data size
    // ... TODO, check sizes of buffer or change the buffer set to the kernel args.
    queue.enqueueWriteBuffer(buffer,CL_FALSE,0,sizeof(cl_float)*size,data.data());

    //Reduce on the GPU until only 1024 elements remain
    while(size > 1024){
        queue.enqueueNDRangeKernel(reduce_kernel,cl::NullRange,cl::NDRange(size),cl::NullRange);
        size = (size+1)/2; //Matches size2p in the kernel: the odd element survives
    }

    //Read out and trim
    queue.enqueueReadBuffer(buffer,CL_TRUE,0,sizeof(cl_float)*size,data.data());
    data.resize(size);
}

3 Comments

so I can't really tell easily from the Khronos specs, but does get_global_size(n) get element n from the globalWorkSize[] array? And how do you use the get_global_size() function to accomplish different things? Any tutorials or examples you know of would help. My ideal situation would be to pass a large constant array of arrays, each of variable length. I had the idea to standardize the arrays' lengths by padding with zeroes, fold them into one extremely large array, and then use the modulus operator in the kernel to derive where one array ends and the next begins. But this seems expensive in memory? Thanks
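Instead of zero-padding plus a modulus, a common layout is to concatenate the arrays and pass a second buffer of offsets (a running sum of the lengths); the kernel then reads element i of array a as flat[offsets[a] + i]. A hypothetical host-side sketch of building that layout (the names flatten, flat, and offsets are illustrative, not from the answer):

```cpp
#include <vector>
#include <cstddef>

// Concatenate variable-length arrays and record where each one starts.
// offsets gets one extra entry, so the length of array a is
// offsets[a + 1] - offsets[a] and no padding is wasted.
void flatten(const std::vector<std::vector<float>>& arrays,
             std::vector<float>& flat,
             std::vector<size_t>& offsets)
{
    flat.clear();
    offsets.clear();
    size_t pos = 0;
    for (const auto& a : arrays) {
        offsets.push_back(pos);
        flat.insert(flat.end(), a.begin(), a.end());
        pos += a.size();
    }
    offsets.push_back(pos); // total length, closes the last array
}
```

Both vectors would then be copied to device buffers with enqueueWriteBuffer, just like the data buffer in the answer.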
You don't need to pass all the data in a single array; you can separate it into different arrays if needed. If you need to perform different operations on different arrays, then separate them into different kernels. Passing 20 buffers with 20 different lengths to do 20 different processes in a single kernel is not only hard to do but also suboptimal, since you would need many code paths to achieve it. What I depict in my code is an example kernel that applies one process to one buffer; the size is known at execution time and there is no branching.
Well, I am doing moving averages/deviations for 5 different time periods, over probably about a billion data points. I was trying to come up with some clever way to avoid having to call the kernel for every data point (since the averages are based on previous data points), as I am sure that has a decent overhead cost. Like I said, my thought was to fold a number of examples into the different time-period arrays and then execute with integer division. I will test performance to see. I was also thinking I could do something with work groups/dimensions/whatever, but I'm not yet educated enough.
