Why does transposing a numpy array rotate it 90 degrees?

Question

I am trying to read images from an LMDB dataset, augment each one, and then save them into another dataset for use in training.
The axes of these images were changed to (3,32,32) when they were saved into the LMDB dataset, so in order to augment them I had to transpose them back into their actual shape.
The problem is that whenever I try to display them using either matplotlib's imshow() or scipy's toimage(), they show a rotated version of the image. So we have:

import numpy as np
import matplotlib.pyplot as plt

img_set = np.transpose(data_train, (0, 3, 2, 1))
# Trying to display an image using pyplot makes it look like this:
plt.subplot(1, 2, 1)
plt.imshow(img_set[0])

Showing the same image using toimage():

Now if I don't transpose data_train, pyplot's imshow() raises an error, while toimage() displays the image correctly:

What is happening here?
When I feed the transposed data_train to my augmenter, the result is also rotated, just like in the previous examples.
Now I'm not sure whether this is a display issue or whether the actual images are indeed rotated!
What should I do?

kmario23 · Accepted Answer · 2017-04-05 18:27:17Z

10

First, look closely. The transposed array is not rotated but mirrored on the diagonal (i.e. the X and Y axes are swapped).
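A small check can make the difference concrete. The sketch below uses a hypothetical 2x3 array (not taken from the question) and compares its transpose with np.rot90; under these assumptions, a 90-degree rotation is a transpose followed by a flip of the row order, so the two are not the same operation.

```python
import numpy as np

# Hypothetical 2x3 array, used only for illustration.
a = np.arange(6).reshape(2, 3)

transposed = a.T        # mirror on the main diagonal: a.T[i, j] == a[j, i]
rotated = np.rot90(a)   # 90-degree counter-clockwise rotation

print(transposed)
print(rotated)

# Same shape, different content: a transpose is not a rotation.
print(np.array_equal(transposed, rotated))              # False

# A rotation is a transpose plus a flip of the row order.
print(np.array_equal(np.flipud(transposed), rotated))   # True
```

On a real photo the mirrored result can easily be mistaken for a rotation, which is why a non-symmetric test array is useful here.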

The original shape is (3,32,32), which I interpret as (RGB, X, Y). However, imshow expects an array of shape MxNx3 - the color information must be in the last dimension.

By transposing the array you invert the order of dimensions: (RGB, X, Y) becomes (Y, X, RGB). This is fine for matplotlib because the color information is now in the last dimension, but X and Y are swapped, too. If you want to preserve the order of X and Y, you can tell transpose to do so:

import numpy as np

img = np.zeros((3, 32, 64))  # non-square image for illustration

print(img.shape)  # (3, 32, 64)
print(np.transpose(img).shape)  # (64, 32, 3)
print(np.transpose(img, [1, 2, 0]).shape)  # (32, 64, 3)
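For a whole batch like the one in the question, the same idea should apply with the batch axis kept in front. The sketch below assumes data_train has the layout (batch, channel, row, column); the random array is only a stand-in for the real LMDB data, which the question does not show, and the non-square image size is chosen so that swapped axes are visible in the shapes.

```python
import numpy as np

# Stand-in for the question's data_train (assumption): layout is
# (batch, channel, row, column). Non-square images make axis swaps visible.
rng = np.random.default_rng(0)
data_train = rng.random((4, 3, 32, 64))

# Axis order used in the question: channels last, but rows and columns swapped.
swapped = np.transpose(data_train, (0, 3, 2, 1))
# Channels last with the spatial order preserved.
img_set = np.transpose(data_train, (0, 2, 3, 1))

print(swapped.shape)  # (4, 64, 32, 3)
print(img_set.shape)  # (4, 32, 64, 3)

# Each image in `swapped` is the diagonal mirror of the one in `img_set`.
print(np.array_equal(swapped[0], np.swapaxes(img_set[0], 0, 1)))  # True
```

If the assumed layout holds, plt.imshow(img_set[0]) should then display the image in its original orientation.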

When using imshow to display an image be aware of the following pitfalls:

It treats the image as a matrix, so the dimensions of the array are interpreted as (ROW, COLUMN, RGB), which is equivalent to (VERTICAL, HORIZONTAL, COLOR) or (Y, X, RGB).
It changes direction of the y axis so the upper left corner is img[0, 0]. This is different from matplotlib's normal coordinate system where (0, 0) is the bottom left.

Example:

import matplotlib.pyplot as plt
import numpy as np

img = np.zeros((32, 64, 3))
img[1, 1] = [1, 1, 1]  # mark a pixel near the upper left corner white

plt.imshow(img)
plt.show()

Note that the smaller first dimension corresponds to the vertical direction of the image.

edited Apr 5, 2017 at 18:27

kmario23

answered Apr 5, 2017 at 9:52

MB-F

1 Comment

Hossein Over a year ago

Thanks a lot, very well explained ;)
