
I'm going to attempt to optimize some MATLAB code using CUDA. I only recently started programming in CUDA, but I have a general idea of how it works.

So, say I want to add two matrices together. In CUDA, I could write an algorithm that would utilize a thread to calculate the answer for each element in the result matrix. However, isn't this technique probably similar to what MATLAB already does? In that case, wouldn't the efficiency be independent of the technique and attributable only to the hardware level?
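To make the question concrete, the one-thread-per-element approach described above might look like this in CUDA (a minimal sketch; `d_a`, `d_b`, and `d_c` are assumed to be device pointers to row-major float matrices that have already been copied to the GPU):

```cuda
#include <cuda_runtime.h>

// One thread computes one element of the result matrix.
__global__ void matAdd(const float *a, const float *b, float *c,
                       int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols) {
        int i = row * cols + col;   // row-major linear index
        c[i] = a[i] + b[i];
    }
}

// Launch with enough 16x16 blocks to cover the whole matrix:
// dim3 block(16, 16);
// dim3 grid((cols + 15) / 16, (rows + 15) / 16);
// matAdd<<<grid, block>>>(d_a, d_b, d_c, rows, cols);
```

The bounds check is needed because the grid is rounded up to whole blocks, so some threads may fall outside the matrix.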


4 Answers


The technique might be similar, but remember that with CUDA you have hundreds of threads running simultaneously. If MATLAB is using threads and those threads are running on a quad core, you are only going to get 4 threads executed per clock cycle, while on CUDA a couple of hundred threads might run in that same cycle.

So to answer your question: YES, the efficiency in this example is independent of the technique and attributable only to the hardware.


1 Comment

I wouldn't be at all surprised to see a speedup — in fact, I expect one, given an input size large enough to be worth the overhead. But my point is that the algorithm itself (i.e. computing the addition for each element in parallel) doesn't contribute to the speedup, independent of the hardware.

The answer is unequivocally yes: all the efficiencies are at the hardware level. I don't know exactly how MATLAB works, but the advantage of CUDA is that multiple threads can be executed simultaneously, unlike in MATLAB.

On a side note, if your problem is small or requires many read/write operations, CUDA will probably just be an additional headache.

1 Comment

Presumably, MATLAB makes use of multiple threads at the virtual machine level.

CUDA has official support for MATLAB.

[need link]

You can make use of MEX files to run code on the GPU from MATLAB.

The bottleneck is the speed at which data is transferred from CPU RAM to the GPU. So if transfers are minimized and done in large chunks, the speedup is great.
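As a sketch of what "minimized and done in large chunks" means in practice (assuming the CUDA runtime API; the kernel work in the middle is left hypothetical):

```cuda
#include <cuda_runtime.h>

// Copy the input to the GPU once, do as much work as possible
// on the device, then copy the result back once — rather than
// transferring data for each individual operation.
void process(const float *h_in, float *h_out, size_t n)
{
    float *d_buf;
    cudaMalloc(&d_buf, n * sizeof(float));

    // One large host-to-device transfer...
    cudaMemcpy(d_buf, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // ...then a whole pipeline of kernel launches operating on
    // d_buf in device memory (omitted here)...

    // ...and one large device-to-host transfer at the end.
    cudaMemcpy(h_out, d_buf, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);
}
```

Keeping intermediate results on the device between kernels is what avoids paying the PCIe transfer cost repeatedly.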



For simple things, it's better to use the gpuArray support in the MATLAB Parallel Computing Toolbox. You can check it here: http://www.mathworks.de/de/help/distcomp/using-gpuarray.html

For things like adding gpuArrays, multiplications, mins, maxes, etc., the implementation they use tends to be OK. I did find that for batch operations on many small matrices, such as abs(y-Hx).^2, you're better off writing a small kernel that does it for you.
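For real-valued data, a fused kernel for that kind of batch operation might be sketched like this (an illustration only — the kernel name, layout, and dimensions M, N are assumptions, with all arrays in device memory and stored batch-contiguously):

```cuda
// Computes out = (y - H*x).^2 for a batch of small problems,
// fusing the matrix-vector product, subtraction, and squaring
// into one kernel. One thread handles one (batch, row) pair.
__global__ void residualSq(const float *y,  // batch x M
                           const float *H,  // batch x M x N
                           const float *x,  // batch x N
                           float *out,      // batch x M
                           int batch, int M, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= batch * M) return;
    int b = idx / M, row = idx % M;

    // Dot product of one row of H with x for this batch item.
    float acc = 0.0f;
    for (int j = 0; j < N; ++j)
        acc += H[(b * M + row) * N + j] * x[b * N + j];

    float r = y[b * M + row] - acc;
    out[b * M + row] = r * r;   // squared residual
}
```

Fusing the steps avoids launching one kernel (and allocating one temporary) per elementwise operation, which is where looping over many tiny gpuArray operations loses time.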

