I have a NumPy array of shape (10, 10, 10, 60). The dimensions could be arbitrary; this is just an example.

I want to reduce this to an array of shape (10, 10, 10, 20) by taking the mean over some subsets. I have two scenarios:

1: Take the mean of every (10, 10, 10, 20) block, i.e. split the array into three (10, 10, 10, 20) blocks and take the mean across the three. This can be done with: m = np.mean((x[..., :20], x[..., 20:40], x[..., 40:60]), axis=0). (The tuple is stacked along a new leading axis, so the mean has to be taken over axis=0 to get a (10, 10, 10, 20) result.) My question is: how can I generalize this when the last dimension is arbitrary, without writing an explicit loop? For now, I can do something like:

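import numpy as np

x = np.random.rand(10, 10, 10, 60)
offset = 20                                  # length of each block along the last axis
loops = x.shape[3] // offset                 # number of blocks
result = np.zeros(x.shape[:3] + (offset,))   # accumulator with the reduced shape
for i in range(loops):
    index = i * offset
    result += x[..., index:index + offset]   # add block i
result = result / loops                      # turn the sum into a mean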

However, this does not seem very Pythonic, and I was wondering if there is a more elegant way to do this.

2: Another scenario is that I want to break it down into 10 arrays of shape (10, 10, 10, 2, 3), take the mean along the 5th dimension between these ten arrays, and then reshape the result to a (10, 10, 10, 20) array as originally planned. I can reshape the array, take the average as before, and reshape again, but that second part seems quite inelegant.
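One possible reading of this second scenario can be sketched as follows. It assumes that the 60 entries of the last axis are grouped as 10 x 2 x 3 in memory order and that the mean is taken over the trailing length-3 axis. That interpretation is an assumption, not something the description pins down, and the names groups and inner are hypothetical.

```python
import numpy as np

x = np.random.rand(10, 10, 10, 60)

# Assumed interpretation: split the last axis into 10 groups of (2, 3),
# average over the trailing length-3 axis, then flatten back to length 20.
groups, inner = 10, 2  # hypothetical names for the two factors that are kept
blocks = x.reshape(*x.shape[:-1], groups, inner, -1)          # (10, 10, 10, 10, 2, 3)
scenario2 = blocks.mean(axis=-1).reshape(*x.shape[:-1], -1)   # (10, 10, 10, 20)
```

Under this reading, scenario2[..., k] is the mean of the three consecutive entries x[..., 3*k:3*k + 3]. A different grouping would only need a different reshape, or a transpose before the mean.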

  • For part 2: Could you elaborate on the breaking-down part? Or give us a working loopy implementation? Commented Jan 26, 2017 at 13:41
  • Actually your solution would work for both the parts, it seems! I just need to reshape differently, I think. Let me try! Commented Jan 26, 2017 at 13:42

1 Answer

You could reshape, splitting the last axis into two such that the first of the new axes has length equal to the number of blocks, and then take the mean along the second-to-last axis:

m,n,r = x.shape[:3]
out = x.reshape(m,n,r,3,-1).mean(axis=-2) # 3 is no. of blocks
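To avoid hard-coding the number of blocks, the same reshape can be wrapped in a small helper. This is a minimal sketch: block_mean and its block_len parameter are hypothetical names, not part of NumPy, and it assumes the last axis length is an exact multiple of the block length.

```python
import numpy as np

def block_mean(x, block_len):
    """Average consecutive blocks of length `block_len` along the last axis."""
    n_blocks, rem = divmod(x.shape[-1], block_len)
    if rem:
        raise ValueError("last axis length must be a multiple of block_len")
    # Split the last axis into (n_blocks, block_len) and average over the blocks.
    return x.reshape(*x.shape[:-1], n_blocks, block_len).mean(axis=-2)

x = np.random.rand(10, 10, 10, 60)
out = block_mean(x, 20)  # shape (10, 10, 10, 20)

# Reference result: stack the three blocks explicitly and average over the stack.
expected = np.mean((x[..., :20], x[..., 20:40], x[..., 40:60]), axis=0)
matches = np.allclose(out, expected)
```

Because the helper only looks at the last axis, it should work for any number of leading dimensions.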

Alternatively, we could use np.einsum for a noticeable performance boost:

In [200]: x = np.random.rand(10, 10, 10, 60)

In [201]: %timeit x.reshape(m,n,r,3,-1).mean(axis=-2)
1000 loops, best of 3: 430 µs per loop

In [202]: %timeit np.einsum('ijklm->ijkm',x.reshape(m,n,r,3,-1))/3.0
1000 loops, best of 3: 214 µs per loop
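Timings like these depend on the machine and NumPy version, so it is worth confirming that the two expressions agree before choosing one. A short check, where n_blocks is a hypothetical name standing in for the literal 3 used above:

```python
import numpy as np

x = np.random.rand(10, 10, 10, 60)
m, n, r = x.shape[:3]
n_blocks = 3  # number of blocks along the last axis
blocks = x.reshape(m, n, r, n_blocks, -1)  # (10, 10, 10, 3, 20)

via_mean = blocks.mean(axis=-2)
# 'ijklm->ijkm' drops index l, i.e. sums over the block axis.
via_einsum = np.einsum('ijklm->ijkm', blocks) / n_blocks
same = np.allclose(via_mean, via_einsum)
```

The einsum call only sums over the block axis, so the division by the block count is what turns it into a mean.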