I am trying to read images from an lmdb dataset, augment each one, and then save them into another dataset for use in training.
The axes of these images were changed to (3,32,32) when they were saved into the lmdb dataset, so in order to augment them I had to transpose them back into their actual shape.
The problem is that whenever I try to display them using either matplotlib's imshow() or scipy's toimage(), they show a rotated version of the image. So we have:

img_set = np.transpose(data_train,(0,3,2,1))
#trying to display an image using pyplot, makes it look like this:  
plt.subplot(1,2,1)
plt.imshow(img_set[0])

enter image description here

showing the same image using toimage :

enter image description here

Now if I don't transpose data_train, pyplot's imshow() raises an error, while toimage() displays the image correctly:
enter image description here

What is happening here?
When I feed the transposed data_train to my augmenter, the result is also rotated, just like in the previous examples.
Now I'm not sure whether this is a display issue or whether the actual images are indeed rotated.
What should I do?

1 Answer

First, look closely. The transposed array is not rotated but mirrored on the diagonal (i.e. the X and Y axes are swapped).
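To make the difference between a mirror on the diagonal and a rotation concrete, a tiny array is enough. This is only a sketch with a made-up 2x3 array (not the question's data), comparing np.transpose with np.rot90:

```python
import numpy as np

# Small made-up array, chosen only for illustration.
a = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

mirrored = np.transpose(a)  # swaps the axes: element [i, j] moves to [j, i]
rotated = np.rot90(a)       # rotates by 90 degrees counterclockwise

print(mirrored)
print(rotated)
```

Both results have shape (3, 2), but their rows come out in opposite order, which is why a mirrored image can easily be mistaken for a rotated one.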

The original shape is (3,32,32), which I interpret as (RGB, X, Y). However, imshow expects an array of shape MxNx3 - the color information must be in the last dimension.

By transposing the array without specifying an axis order, you invert the order of dimensions: (RGB, X, Y) becomes (Y, X, RGB). This is fine for matplotlib because the color information is now in the last dimension, but X and Y are swapped, too. If you want to preserve the order of X and Y, you can tell transpose to do so:

import numpy as np

img = np.zeros((3, 32, 64))  # non-square image for illustration

print(img.shape)  # (3, 32, 64)
print(np.transpose(img).shape)  # (64, 32, 3)
print(np.transpose(img, [1, 2, 0]).shape)  # (32, 64, 3)
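The same idea should carry over to the whole batch from the question. The sketch below assumes data_train is laid out as (N, 3, 32, 32) with the color axis second; the batch size of 5 and the random values are placeholders, not the real dataset:

```python
import numpy as np

# Stand-in for the question's data_train; size and contents are assumptions.
data_train = np.random.rand(5, 3, 32, 32)

# Move only the color axis to the end; the two spatial axes keep their order.
img_set = np.transpose(data_train, (0, 2, 3, 1))
print(img_set.shape)  # (5, 32, 32, 3)

# The permutation used in the question additionally swaps the spatial axes,
# so each image comes out mirrored on the diagonal.
swapped = np.transpose(data_train, (0, 3, 2, 1))
```

With (0, 2, 3, 1) each color plane of img_set[0] equals the corresponding plane of data_train[0], whereas with (0, 3, 2, 1) it equals that plane's transpose.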

When using imshow to display an image be aware of the following pitfalls:

  1. It treats the image as a matrix, so the dimensions of the array are interpreted as (ROW, COLUMN, RGB), which is equivalent to (VERTICAL, HORIZONTAL, COLOR) or (Y, X, RGB).

  2. It changes direction of the y axis so the upper left corner is img[0, 0]. This is different from matplotlib's normal coordinate system where (0, 0) is the bottom left.

Example:

import matplotlib.pyplot as plt

img = np.zeros((32, 64, 3))
img[1, 1] = [1, 1, 1]  # marking a pixel near the upper left corner white

plt.imshow(img)

enter image description here

Note that the smaller first dimension corresponds to the vertical direction of the image.

1 Comment

Thanks a lot, very well explained ;)
