Making a for-loop in Matlab faster by using arrayfun?

Question

currently I have the following portion of code:

for i = 2:N-1
  res(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end

where as the variables k, m, c and e are vectors of size N and x is a vector of size 2*N. Is there any way to do this a lot faster using something like arrayfun!? I couldn't figure this out :( I especially want to make it faster by running on the GPU later and thus, arrayfun would be also helpfull since matlab doesn't support parallelizing for-loops and I don't want to buy the jacket package... Thanks a lot!

Christoph · Accepted Answer · 2011-06-21 13:17:07Z

6

You don't have to use arrayfun. It works if use use some smart indexing:

    clear all

    N=20;
    k=rand(N,1);
    m=rand(N,1);
    c=rand(N,1);
    e=rand(N,1);
    x=rand(2*N,1);

    % for-based implementation
    %Watch out, you are not filling the first element of forres!
    forres=zeros(N-1,1); %Initialize array first to gain some speed.
    for i = 2:N-1
      forres(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
    end

    %vectorized implementation
    parres=k(2:N-1)./m(2:N-1).*x(1:N-2) -(c(2:N-1)+c(3:N))./m(2:N-1).*x(N+2:2*N-1) +e(3:N)./m(2:N-1).*x(3:N);

    %compare results; strip the first element from forres
    difference=forres(2:end)-parres %#ok<NOPTS>

answered Jun 21, 2011 at 13:17

Christoph

5,6526 gold badges38 silver badges64 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

tim Over a year ago

I assume a (x(i)-x(i-1))^3 gets to (x(2:N-1)-x(1:N-2)).^3?

Christoph Over a year ago

yes, you got it. the strategy is to replace the running index i with the range of indices in the for loop. i.e. for 1:N, i becomes 1:N, i+1 becomes 1+1:N+1.

Edric Over a year ago

The same indexing "tricks" are basically what's required to use arrayfun. Using arrayfun on the GPU is typically much faster than a sequence of vectorized operations.

Edric · Accepted Answer · 2011-06-21 11:21:42Z

5

Firstly, MATLAB does support parallel for loops via PARFOR. However, that doesn't have much chance of speeding up this sort of computation since the amount of computation is small compared to the amount of data you're reading and writing.

To restructure things for GPUArray "arrayfun", you need to make all the array references in the loop body refer to the loop iterate, and have the loop run across the full range. You should be able to do this by offsetting some of the arrays, and padding with dummy values. For example, you could prepend all your arrays with NaN, and replace x(i-1) with a new variable x_1 = [x(2:N) NaN]

answered Jun 21, 2011 at 11:21

Edric

25.3k2 gold badges42 silver badges43 bronze badges

7 Comments

tim Over a year ago

Okay I wanted to try it: But the problem I'm seeing here is that when creating so many 'dummy' variables, I probably run out of memory quite soon and further, the copying to the GPU and the memory access to all the different variables will probably be pretty slow!

Edric Over a year ago

You're quite right, you may have problems with that. Unfortunately, arrayfun is quite inflexible in what it allows. If you really wanted ultimate speed, you could write a small amount of CUDA code and use parallel.gpu.CUDAKernel in MATLAB to execute it.

tim Over a year ago

Yeah thanks, I already thought about this but just wanted a 'quick' solution and what user599884 suggested, helped me to create such a quick solution to make the for-loop faster. Do you think that a CUDA written code will really speed things up compared with user599884 solution running on the GPU by simply putting x = gpuArray(x), k=gpuArray(k) etc..

Edric Over a year ago

Yes, on the GPU either arrayfun or a hand-written kernel will generally beat vectorized code. You can roughly estimate how much by ignoring the index offsets and using arrayfun anyway to see how fast that is.

tim Over a year ago

Edric, do you think it's also worth doing it completely as CUDA-mex-file which does everything in the mex file (like copying the data to the GPU etc...) or is a PTX kernel sufficient?! I wonder if there will be a big difference between those both variants.

|

Collectives™ on Stack Overflow

Making a for-loop in Matlab faster by using arrayfun?

2 Answers 2

3 Comments

7 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

7 Comments

Your Answer

Sign up or log in

Post as a guest

Related