
Suppose c is a d-dimensional vector. I want to compute the following third-order tensor

$T = c \otimes c \otimes c + \sum_{i=1}^{d} 2c_i \, e_i \otimes e_i \otimes e_i - \sum_{i,j=1}^{d} c_i c_j \left( e_i \otimes e_i \otimes e_j + e_i \otimes e_j \otimes e_i + e_i \otimes e_j \otimes e_j \right)$

where e_i stands for the ith standard basis vector of the Euclidean space. Is there an efficient way to compute this? I am using the following for-loops and the Kruskal tensor ktensor to compute it with the Tensor Toolbox managed by Sandia National Labs:

x=ktensor({c,c,c});
I=eye(d);

for i=1:d
    x=x+2*c(i)*ktensor({I(:,i),I(:,i),I(:,i)});
end

for i=1:d
    for j=1:d
        x=x-c(i)*c(j)*(ktensor({I(:,i),I(:,i),I(:,j)})+ktensor({I(:,i),I(:,j),I(:,i)})+ktensor({I(:,i),I(:,j),I(:,j)}));
    end
end
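
For checking faster implementations, the elementwise definition can also be written as a plain (slow) triple loop that needs no toolbox; this is a sketch derived from the formula above:

```matlab
% Reference implementation: build T entry by entry from the definition.
% T(a,b,k) = c(a)*c(b)*c(k) + 2*c(a)*[a==b==k]
%          - ( c(a)*c(k)*[a==b] + c(a)*c(b)*[a==k] + c(a)*c(b)*[b==k] )
T = zeros(d, d, d);
for a = 1:d
    for b = 1:d
        for k = 1:d
            T(a,b,k) = c(a)*c(b)*c(k) ...
                + 2*c(a)*(a==b && b==k) ...
                - c(a)*c(k)*(a==b) ...
                - c(a)*c(b)*(a==k) ...
                - c(a)*c(b)*(b==k);
        end
    end
end
```

This is O(d^3) in time and memory and matches the definition term by term, so it makes a good ground truth for small d.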
  • One thing you could try is the Parallel Computing Toolbox (aka parfor). Since there is no dependency between different i and j indices, you could calculate each term in the summation in parallel. Commented Jun 29, 2017 at 16:09
  • @Yvon: What would the modified code with parfor look like? Is it much faster compared to the above? I am guessing it might be a $d$ vs $d^2$ speed-up or something. Commented Jun 29, 2017 at 19:38
  • I will give you a sample code in a minute. It's not optimizing the code, but merely lets the CPU run each iteration on a separate thread, so they can be done at the same time. Commented Jun 29, 2017 at 20:06

2 Answers


Here is a possibility.

  • I used an optimization for the second term, as it places values of c along the "diagonal" of the tensor.
  • For the first term, there isn't much room for optimization, as it is a dense multiplication, so bsxfun seems appropriate.
  • For the third term, I stuck to bsxfun, but as the result is somewhat sparse, you may benefit from filling it "by hand" if the size of your tensor is large.

Here is the code:

dim = 10;
c = (1:dim)';
e = eye(dim);

x = zeros([dim, dim, dim]);
% initialize with second term
x(1:dim*(dim+1)+1:end) = 2 * c;
% add first term
x = x + bsxfun(@times, bsxfun(@times, c, shiftdim(c, -1)), shiftdim(c, -2));
% add third term
x = x - sum(sum(bsxfun(@times, shiftdim(c*c',-3), ...
   bsxfun(@times, bsxfun(@times, permute(e, [1, 3, 4, 2, 5]), permute(e, [3, 1, 4, 2, 5])), permute(e, [3, 4, 1, 5, 2])) +...
   bsxfun(@times, bsxfun(@times, permute(e, [1, 3, 4, 2, 5]), permute(e, [3, 1, 4, 5, 2])), permute(e, [3, 4, 1, 2, 5])) +...
   bsxfun(@times, bsxfun(@times, permute(e, [1, 3, 4, 5, 2]), permute(e, [3, 1, 4, 2, 5])), permute(e, [3, 4, 1, 2, 5]))), 5), 4);
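
To sanity-check this construction against the original loop, you can compare the dense arrays; a sketch, assuming the Tensor Toolbox is on the path and `x_loop` (a hypothetical name) holds the ktensor result from the question's loops:

```matlab
% Convert the summed ktensor to a plain array and compare entrywise.
xk = double(full(x_loop));
assert(max(abs(x(:) - xk(:))) < 1e-10 * max(abs(xk(:))));
```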

EDIT

A much more efficient (esp. memory-wise) computation of the third term:

ec = bsxfun(@times, e, c); % equals diag(c)
x  = x - ...
  bsxfun(@times, ec, shiftdim(c, -2)) - ...
  bsxfun(@times, c', reshape(ec, [dim, 1, dim])) - ...
  bsxfun(@times, c, reshape(ec, [1, dim, dim]));
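
As a side note, on MATLAB R2016b or newer, implicit expansion lets you drop the bsxfun calls entirely; a sketch of the same edited computation:

```matlab
ec = e .* c;                      % same as bsxfun(@times, e, c), i.e. diag(c)
x  = x - ...
  ec .* shiftdim(c, -2) - ...     % c(a)*c(k)*[a==b]
  c' .* reshape(ec, [dim, 1, dim]) - ...  % c(a)*c(b)*[a==k]
  c  .* reshape(ec, [1, dim, dim]);       % c(a)*c(b)*[b==k]
```

The broadcasting behavior is identical, and the implicit-expansion form is usually at least as fast.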

1 Comment

Excellent, the edited solution is a significant speed up compared to my method!

You could try the Parallel Computing Toolbox, namely parfor loops.

x = double(full(ktensor({c,c,c}))); % convert to a plain d-by-d-by-d array
I = eye(d);

y = zeros(d,d,d, d);
parfor i=1:d
    y(:,:,:, i) = 2*c(i)*double(full(ktensor({I(:,i),I(:,i),I(:,i)})));
end
x = x + sum(y, 4);

z = zeros(d,d,d, d,d);
parfor i=1:d
    for j=1:d % only one layer of parallelization is allowed
        z(:,:,:, i,j) = c(i)*c(j)*double(full(ktensor({I(:,i),I(:,i),I(:,j)}) ...
            + ktensor({I(:,i),I(:,j),I(:,i)}) + ktensor({I(:,i),I(:,j),I(:,j)})));
    end
end
x = x - sum(sum(z, 5), 4);
x % is your result

It just runs the untouched ktensor commands, but in separate threads; the Toolbox takes care of running the code in parallel.

This is possible because each iteration is independent: the term for indices (i+1, j+1), for example, does not rely on the term for (i, j).

Depending on the number of cores (and hyperthreading) of your system, the speed-up can be up to the number of cores.
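
Whether parfor actually pays off depends on d and on the pool start-up cost, so it is worth timing both versions on your machine; a minimal sketch, where `compute_x_serial` and `compute_x_parfor` are hypothetical wrappers around the two code versions shown above:

```matlab
% Hypothetical wrappers around the serial and parfor versions.
t_serial = timeit(@() compute_x_serial(c, d));
tic; x_par = compute_x_parfor(c, d); t_parfor = toc; % first call includes pool start-up
fprintf('serial: %.3f s, parfor: %.3f s\n', t_serial, t_parfor);
```

Run the parfor version once beforehand (or call `parpool` explicitly) so the pool start-up does not dominate the measurement.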

5 Comments

The entry z(i,j) is initialized with zero but in the for-loops it stores a whole tensor?
@pikachuchameleon oops.
@pikachuchameleon I could only learn so much from their webpage without actually getting the code. What is the dimension of the variable x in your workspace, and what is d?
x is a d*d*d tensor and d is some natural number. Suppose you have a tensor T = a⊗a⊗a + b⊗b⊗b (a sum of outer products); then the syntax T=ktensor({[a,b],[a,b],[a,b]}) does exactly the same thing, and it stands for Kruskal tensor.
Edited answer. x is supposed to be a d x d x d tensor; y contains all dimensions of x in the first three subscripts, then the fourth for the loop index. z does the same thing but contains two loop indices.
