Easy way to collapse trailing dimensions of numpy array?

Question

In Matlab, I can do the following:

X = randn(25,25,25);
size(X(:,:))

ans = 
    25   625

I often find myself wanting to quickly collapse the trailing dimensions of an array, and do not know how to do this in numpy.

I know I can do this:

In [22]: x = np.random.randn(25,25,25)
In [23]: x = x.reshape(x.shape[:-2] + (-1,))
In [24]: x.shape
Out[24]: (25, 625)

but x.reshape(x.shape[:-2] + (-1,)) is a lot less concise (and requires more information about x) than simply doing x(:,:).

I've obviously tried the analogous numpy indexing, but that does not work as desired:

In [25]: x = np.random.randn(25,25,25)
In [26]: x[:,:].shape
Out[26]: (25, 25, 25)

Any hints on how to collapse the trailing dimensions of an array in a concise manner?

Edit: note that I'm after the resulting array itself, not just its shape. I merely use size() and x.shape in the above examples to indicate what the array is like.

On a side note, your example is incorrect... x.shape[:-2] would yield an empty tuple. (Adding -1 to it means the array would be "flattened" into a 15625-length array.) I'm guessing you meant x.shape[0]?
– Joe Kington, Commented Jun 11, 2015 at 13:44
@JoeKington: It is correct (try it). x.shape[:-2] returns x.shape up to (but not including) the second-to-last element. So for a 3D-array x, it returns just the first element of x.shape. I used [:-2] rather than [0] because I'm looking for a general solution that works for all ND arrays where N>2.
– EelkeSpaak, Commented Jun 11, 2015 at 13:48
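
The slicing discussed in these two comments can be checked directly. This is a minimal sketch that assumes only NumPy; the helper name collapse_last_two is made up for illustration and is not part of any library.

```python
import numpy as np

def collapse_last_two(x):
    """Merge the last two axes of x into one (hypothetical helper)."""
    # x.shape[:-2] keeps every axis except the last two; -1 lets NumPy
    # infer the length of the merged axis.
    return x.reshape(x.shape[:-2] + (-1,))

x3 = np.zeros((25, 25, 25))
x4 = np.zeros((25, 25, 25, 25))

print(x3.shape[:-2])                # (25,)
print(collapse_last_two(x3).shape)  # (25, 625)
print(collapse_last_two(x4).shape)  # (25, 25, 625)
```

For a 3D array the slice is the one-element tuple (25,), not an empty tuple, which is why the reshape in the question yields (25, 625).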

hpaulj · Accepted Answer · 2015-06-11 21:49:29Z

What is supposed to happen with a 4d or higher?

octave:7> x=randn(25,25,25,25);
octave:8> size(x(:,:))
ans =
      25   15625

Your (:,:) reduces it to 2 dimensions, combining the last ones. The last dimension is where MATLAB automatically adds and collapses dimensions.

In [605]: x=np.ones((25,25,25,25))

In [606]: x.reshape(x.shape[0],-1).shape  # like Joe's
Out[606]: (25, 15625)

In [607]: x.reshape(x.shape[:-2]+(-1,)).shape
Out[607]: (25, 25, 625)

Your reshape example does something different from MATLAB: it collapses only the last 2 dimensions. Collapsing everything down to 2 dimensions, as MATLAB does, is a simpler expression.

The MATLAB version is concise simply because your needs match its assumptions. The numpy equivalent isn't quite so concise, but gives you more control.
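
One thing the shapes alone do not show is element order. MATLAB stores arrays column-major while NumPy defaults to row-major, so two results with the same 2D shape can have their columns arranged differently. The sketch below illustrates this under the assumption that you want to keep the first axis and merge the rest; collapse_to_2d is a hypothetical helper, and order is simply passed through to ndarray.reshape.

```python
import numpy as np

def collapse_to_2d(x, order="C"):
    """Keep the first axis and merge all remaining axes (hypothetical helper)."""
    return x.reshape((x.shape[0], -1), order=order)

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)

c = collapse_to_2d(x)             # row-major: the last axis varies fastest
f = collapse_to_2d(x, order="F")  # column-major: the layout MATLAB's x(:,:) uses

i, j, k = 1, 2, 3
print(c.shape, f.shape)                # (2, 12) (2, 12)
print(c[i, j * 4 + k] == x[i, j, k])   # True
print(f[i, j + 3 * k] == x[i, j, k])   # True
```

If only the shape matters, the default order is fine; order="F" is worth considering when columns must line up with results computed in MATLAB.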

For example to keep the last dimension, or combine dimensions 2 by 2:

In [608]: x.reshape(-1,x.shape[-1]).shape
Out[608]: (15625, 25)
In [610]: x.reshape(-1,np.prod(x.shape[-2:])).shape
Out[610]: (625, 625)
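
To see where the elements end up when dimensions are combined 2 by 2, here is a small sketch on an array with distinct axis lengths. It uses math.prod in place of np.prod (a substitution made here only because it returns a plain Python int); the index values are arbitrary.

```python
import math
import numpy as np

x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Merge the first two axes into rows and the last two into columns.
ncols = math.prod(x.shape[-2:])   # 4 * 5 = 20
y = x.reshape(-1, ncols)

i, j, k, l = 1, 2, 3, 4
print(y.shape)                                    # (6, 20)
print(y[i * 3 + j, k * 5 + l] == x[i, j, k, l])   # True
```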

What's the equivalent MATLAB?

octave:24> size(reshape(x,[],size(x)(end)))
ans =
15625      25
octave:31> size(reshape(x,[],prod(size(x)(3:end))))

Joe Kington · Accepted Answer · 2015-06-11 13:38:54Z

You might find it a bit more concise to modify the shape attribute directly. For example:

import numpy as np

x = np.random.randn(25, 25, 25)
x.shape = x.shape[0], -1

print(x.shape)
print(x)

This is functionally equivalent to reshape (in the sense of data ordering, etc). Obviously, it still requires the same information about x's shape, but it is a more concise way of handling the reshape.

2 Comments

Jaime Over a year ago

It has the added "benefit" that, if the reshaping cannot be done without a copy, it will raise an error, so I often use it as a form of asserting that my code is efficiently using memory.

tjysdsg Over a year ago

Why not x.shape = *(x.shape[:-2]), -1, in case x has more than 3 dimensions?
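
Jaime's point about shape assignment failing when a copy would be needed can be sketched as follows. The helper try_inplace_collapse is hypothetical. The warning filter is a precaution: the NumPy documentation describes assigning to .shape as discouraged, so some versions may emit a warning.

```python
import warnings
import numpy as np

def try_inplace_collapse(a):
    """Return True if a.shape could be reassigned without a copy (hypothetical helper)."""
    try:
        with warnings.catch_warnings():
            # Some NumPy versions may warn that assigning to .shape is discouraged.
            warnings.simplefilter("ignore", DeprecationWarning)
            a.shape = (a.shape[0], -1)
        return True
    except AttributeError:
        return False

contiguous = np.zeros((4, 5, 6))
transposed = np.zeros((4, 5, 6)).T   # non-contiguous view, shape (6, 5, 4)

print(try_inplace_collapse(contiguous))   # True
print(try_inplace_collapse(transposed))   # False

# reshape still succeeds on the transposed view, but only by copying the data.
copied = transposed.reshape(transposed.shape[0], -1)
print(np.shares_memory(copied, transposed))   # False
```

The silent copy made by reshape in the last step is exactly what the shape assignment refuses to do.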

Kasravnd · Accepted Answer · 2015-06-11 12:29:00Z

You can use np.hstack:

>>> np.hstack(x).shape
(25, 625)

np.hstack takes a sequence of arrays and stacks them horizontally to make a single array.

4 Comments

EelkeSpaak Over a year ago

That's an interesting use of np.hstack! However, it does not yield the same result as x.reshape(x.shape[:-2] + (-1,)), so should be used with caution. Thinking about it a bit further, what I believe is going on is that np.hstack treats the first dimension of the array as the iterable dimension, and thus returns the same as np.hstack((x[0,:,:], x[1,:,:], ...)). I need functionality that leaves the first dimension untouched (i.e. all data previously in x[0,:,:] is now in x[0,:]), as is the case for the Matlab version.

Kasravnd Over a year ago

Do you want a new array or just want its shape?

EelkeSpaak Over a year ago

I'm after the contents of the array itself, the shape is just for illustration. I edited the original question to make this more clear.

Kasravnd Over a year ago

@EelkeSpaak so can you add a minimal example of your expected result?
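
The difference between np.hstack and reshape raised in these comments is easier to see on an array whose axes have different lengths. The sketch below is illustrative only; the array sizes are arbitrary.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)

stacked = np.hstack(x)                 # same as np.hstack((x[0], x[1]))
reshaped = x.reshape(x.shape[0], -1)

print(stacked.shape)    # (3, 8)
print(reshaped.shape)   # (2, 12)

# hstack places x[0] and x[1] side by side; reshape keeps x[0] in row 0.
print(np.array_equal(stacked[:, :4], x[0]))        # True
print(np.array_equal(reshaped[0], x[0].ravel()))   # True
```

With a 25x25x25 array both results happen to have shape (25, 625), which hides the fact that the first axis has been consumed by np.hstack.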