1

I tried searching for an answer, but couldn't find what I needed. Apologies if this is a duplicate question.

Suppose I have a 2d-array with shape (n, n*m). What I want to do is an outer sum of this array to its transpose that results in an array with shape (n*m, n*m). For example, suppose i have

A = array([[1., 1., 2., 2.],
           [1., 1., 2., 2.]])

I want to do an outer sum of A and A.T such that the output is:

>>> array([[2., 2., 3., 3.],
           [2., 2., 3., 3.],
           [3., 3., 4., 4.],
           [3., 3., 4., 4.]])

Note that np.add.outer does not work because it ravels in the inputs into vectors. I can achieve something similar by doing

np.tile(A, (2, 1)) + np.tile(A.T, (1, 2))

but this does not seem reasonable when n and m are reasonably large (n > 100 and m > 1000). Is it possible to write this sum using einsum? I just can't seem to figure out einsum.

3
  • einsum implements sum of products; it can be used for outer products, but not outer sums. With the tile you make two temporary arrays that are the size of the target, right? Is the problem so large that you can only afford to have one array of that size? Commented Sep 13, 2018 at 21:06
  • Can't you make this more diagnostic by changing the 2nd row of A? Commented Sep 13, 2018 at 21:09
  • Stepping back from the example, and focusing on the dimensions. A can be reshaped to (n,n,m), and the target to (n,m,n,m). If we can create a sum that is (n,n,m,m) we can transpose to get the right shape. Will the values be right? Commented Sep 13, 2018 at 21:20

2 Answers 2

2

To leverage broadcasting, we need to break it down to 3D and then permute axes and add -

n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n) # reshape to 3D
out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)

Sample run for verification -

In [359]: # Setup input array
     ...: np.random.seed(0)
     ...: n,m = 3,4
     ...: A = np.random.randint(1,10,(n,n*m))

In [360]: # Original soln
     ...: out0 = np.tile(A, (m, 1)) + np.tile(A.T, (1, m))

In [361]: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)

In [362]: np.allclose(out0, out)
Out[362]: True

Timings with large n,m -

In [363]: # Setup input array
     ...: np.random.seed(0)
     ...: n,m = 100,100
     ...: A = np.random.randint(1,10,(n,n*m))

In [364]: %timeit np.tile(A, (m, 1)) + np.tile(A.T, (1, m))
1 loop, best of 3: 407 ms per loop

In [365]: %%timeit
     ...: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
1 loop, best of 3: 219 ms per loop

Further performance boost with numexpr

We can leverage multi-core with numexpr module for large data and to gain memory efficiency and hence performance -

import numexpr as ne

n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n)
p1 = a[None,:,:,:]
p2 = a.transpose(1,2,0)[:,:,None,:]
out = ne.evaluate('p1+p2').reshape(n*m,-1)

Timings with same large n, m setup -

In [367]: %%timeit
     ...: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: p1 = a[None,:,:,:]
     ...: p2 = a.transpose(1,2,0)[:,:,None,:]
     ...: out = ne.evaluate('p1+p2').reshape(n*m,-1)
10 loops, best of 3: 152 ms per loop
Sign up to request clarification or add additional context in comments.

Comments

0

one way is

(A.reshape(-1,*A.shape).T+A)[:,0,:]

I think this will take a lot of memory with n>100 and m>1000.

but isn't this the same as

np.add.outer(A,A)[:,0,:].reshape(4,-1)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.