
I'm going to attempt to optimize some MATLAB code using CUDA. I only recently started programming in CUDA, but I have a general idea of how it works.

So, say I want to add two matrices together. In CUDA, I could write an algorithm that would utilize a thread to calculate the answer for each element in the result matrix. However, isn't this technique probably similar to what MATLAB already does? In that case, wouldn't the efficiency be independent of the technique and attributable only to the hardware level?
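To make the question concrete, the one-thread-per-element approach described above might look like this in CUDA (a minimal sketch; `d_a`, `d_b`, and `d_c` are assumed to be device pointers to row-major float matrices that have already been copied to the GPU):

```cuda
#include <cuda_runtime.h>

// One thread computes one element of the result matrix.
__global__ void matAdd(const float *a, const float *b, float *c,
                       int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols) {
        int i = row * cols + col;   // row-major linear index
        c[i] = a[i] + b[i];
    }
}

// Launch with enough 16x16 blocks to cover the whole matrix:
// dim3 block(16, 16);
// dim3 grid((cols + 15) / 16, (rows + 15) / 16);
// matAdd<<<grid, block>>>(d_a, d_b, d_c, rows, cols);
```

The bounds check is needed because the grid is rounded up to whole blocks, so some threads may fall outside the matrix.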


4 Answers


The technique might be similar, but remember that with CUDA you have hundreds of threads running simultaneously. If MATLAB is using threads and those threads are running on a quad core, you are only going to get 4 threads executed per clock cycle, while on CUDA a couple of hundred threads might run in that same cycle.

So to answer your question: YES, the efficiency in this example is independent of the technique and attributable only to the hardware.


1 Comment

I wouldn't be at all surprised to see a speedup — in fact, I expect one, given an input size large enough to be worth the overhead. But my point is that the algorithm itself (i.e. computing the addition for each element in parallel) doesn't contribute to the speedup, independent of the hardware.

The answer is unequivocally yes: all the efficiencies are at the hardware level. I don't know exactly how MATLAB works, but the advantage of CUDA is that multiple threads can be executed simultaneously, unlike in MATLAB.

On a side note, if your problem is small or requires many read/write operations, CUDA will probably just be an additional headache.

1 Comment

Presumably, MATLAB makes use of multiple threads at the virtual machine level.

CUDA has official support for MATLAB.

[need link]

You can make use of MEX files to run code on the GPU from MATLAB.

The bottleneck is the speed at which data is transferred from CPU RAM to the GPU. So if transfers are minimized and done in large chunks, the speedup is great.
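As a sketch of what "minimized and done in large chunks" means in practice (assuming the CUDA runtime API; the kernel work in the middle is left hypothetical):

```cuda
#include <cuda_runtime.h>

// Copy the input to the GPU once, do as much work as possible
// on the device, then copy the result back once — rather than
// transferring data for each individual operation.
void process(const float *h_in, float *h_out, size_t n)
{
    float *d_buf;
    cudaMalloc(&d_buf, n * sizeof(float));

    // One large host-to-device transfer...
    cudaMemcpy(d_buf, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // ...then a whole pipeline of kernel launches operating on
    // d_buf in device memory (omitted here)...

    // ...and one large device-to-host transfer at the end.
    cudaMemcpy(h_out, d_buf, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);
}
```

Keeping intermediate results on the device between kernels is what avoids paying the PCIe transfer cost repeatedly.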



For simple things, it's better to use the gpuArray support in the MATLAB Parallel Computing Toolbox. You can check it here: http://www.mathworks.de/de/help/distcomp/using-gpuarray.html

For things like adding gpuArrays, multiplications, mins, maxes, etc., the implementation they use tends to be OK. I did find that for batch operations on many small matrices, such as abs(y-Hx).^2, you're better off writing a small kernel that does it for you.
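For real-valued data, a fused kernel for that kind of batch operation might be sketched like this (an illustration only — the kernel name, layout, and dimensions M, N are assumptions, with all arrays in device memory and stored batch-contiguously):

```cuda
// Computes out = (y - H*x).^2 for a batch of small problems,
// fusing the matrix-vector product, subtraction, and squaring
// into one kernel. One thread handles one (batch, row) pair.
__global__ void residualSq(const float *y,  // batch x M
                           const float *H,  // batch x M x N
                           const float *x,  // batch x N
                           float *out,      // batch x M
                           int batch, int M, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= batch * M) return;
    int b = idx / M, row = idx % M;

    // Dot product of one row of H with x for this batch item.
    float acc = 0.0f;
    for (int j = 0; j < N; ++j)
        acc += H[(b * M + row) * N + j] * x[b * N + j];

    float r = y[b * M + row] - acc;
    out[b * M + row] = r * r;   // squared residual
}
```

Fusing the steps avoids launching one kernel (and allocating one temporary) per elementwise operation, which is where looping over many tiny gpuArray operations loses time.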

