4d np.array() acting like a 1d np.array()

Question

This is a fairly long question. Here is the problem:

Problem: I'm transferring images (3d arrays) into a numpy array. I'm trying to get a 4d array containing all of my images, but for some reason I get a 1d np.array containing 3d np.arrays.

Suggested solutions: - The images have different dimensions. This was the case, but resizing them didn't help.

First I'm transferring each picture: cv2.imdecode(np.frombuffer(image_bytes, np.uint8), -1) converts my images from bytes to a 3d np array.

def transferPictures():
    x_dataset = []
    y_dataset = []
    x_decoded = []
    shoeNames = os.listdir(SHOES_DIRECTORY)
    print(len(shoeNames))
    for shoes in shoeNames[:20]:
        shoeDirectoriesPath = os.path.join(SHOES_DIRECTORY, shoes)
        if(os.path.isdir(shoeDirectoriesPath)):
            eachPicture = os.listdir(shoeDirectoriesPath)
            for pic in eachPicture:
                print(pic)
                picPath = os.path.join(shoeDirectoriesPath, pic)
                # Use a context manager so the file is closed after reading
                with open(picPath, 'rb') as f:
                    image_bytes = f.read()
                #This is to convert the image from byte format to a 3d np array:
                decoded = cv2.imdecode(np.frombuffer(image_bytes, np.uint8), -1) 
                x_dataset.append(decoded) 
                y_dataset.append(pic)
    return [x_dataset, y_dataset]

And converting my x_dataset list (containing all my images) to an np.array():

dataset = transferPictures()
x_dataset=np.array(dataset[0])

I get a 1d x_dataset np.array containing 3d np.arrays:

print("This is the type of the array of images, x_dataset: "+ str(type(x_dataset)))
print("This is the type of each image, x_dataset[0]: "+ str(type(x_dataset[0])))

print("This is the dimension of the array of images, x_dataset: "+ str(np.shape(x_dataset)))
print("This is the dimension of each image, x_dataset[0]: "+ str(np.shape(x_dataset[0])))

Prints:

This is the type of the array of images, x_dataset: <class 'numpy.ndarray'>
This is the type of each image, x_dataset[0]: <class 'numpy.ndarray'>
This is the dimension of the array of images, x_dataset: (130,)
This is the dimension of each image, x_dataset[0]: (289, 200, 3)

Why does the np.shape(x_dataset) return (130,) and the np.shape(x_dataset[0]) return (289, 200, 3)?

1) Shouldn't the np.shape(x_dataset) return (130, 289, 200, 3)?

2) How can I get a 4d x_dataset with shape (130, 289, 200, 3)?

EDIT:

V. Ayrat suggested it might be because of the difference in shape between images. Some images did in fact have different dimensions than others, so I resized all images to the same shape. But the problem seems to persist:

Finding the dimensions of the smallest image:

minx = 1000
miny = 1000
minz = 1000


for image in x_dataset:
    if np.shape(image)[0] < minx:
        minx = np.shape(image)[0]
    if np.shape(image)[1] < miny:
        miny = np.shape(image)[1]
    if np.shape(image)[2] < minz:
        minz = np.shape(image)[2]

print(minx)
print(miny)
print(minz)

Prints:

288
200
3

Resizing images:

for i in range(0,len(x_dataset)):
    x_dataset[i] = np.resize(x_dataset[i], (minx,miny,minz))

The images now all have the same dimensions, but checking the shape of x_dataset still returns (130,). Any other ideas to have it return a 4d array?

EDIT 2: Making 100% sure each image is of the same dimension:

minx = 1000
miny = 1000
minz = 1000
maxx = 0
maxy = 0
maxz = 0

for image in x_dataset:
    if np.shape(image)[0] < minx:
        minx = np.shape(image)[0]
    if np.shape(image)[1] < miny:
        miny = np.shape(image)[1]
    if np.shape(image)[2] < minz:
        minz = np.shape(image)[2]
    if np.shape(image)[0] > maxx:
        maxx = np.shape(image)[0]
    if np.shape(image)[1] > maxy:
        maxy = np.shape(image)[1]
    if np.shape(image)[2] > maxz:
        maxz = np.shape(image)[2] 
print()    
print(minx)
print(miny)
print(minz)
print(maxx)
print(maxy)
print(maxz)

Prints:

And rechecking the type:

print(type(x_dataset))
print(type(x_dataset[0]))
print(np.shape(x_dataset))

Prints:

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
(130,)

This is really confusing...

Antoine Neidecker · Accepted Answer · 2020-05-16 16:59:27Z

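x_dataset is a one-dimensional container (a Python list, or an object array built from one) that stores numpy.ndarray objects. If the inner arrays have different shapes, NumPy cannot combine them into one regular array, so np.shape returns only the number of images. You can only get a full 4d numpy array if all pictures have the same shape, which I guess is not the case here.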
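To illustrate the point above, here is a minimal sketch that uses small dummy arrays in place of real images; the shapes and variable names are made up for the example. The object array is built explicitly because recent NumPy releases refuse to create one implicitly from differently shaped inputs.

```python
import numpy as np

# Hypothetical stand-ins for decoded images (height, width, channels).
same_shape = [np.zeros((4, 5, 3), dtype=np.uint8) for _ in range(3)]
mixed_shape = [np.zeros((4, 5, 3), dtype=np.uint8),
               np.zeros((6, 5, 3), dtype=np.uint8)]

# Identical shapes: NumPy builds one regular 4D array.
regular = np.array(same_shape)
print(regular.shape, regular.dtype)   # (3, 4, 5, 3) uint8

# Different shapes: the result can only be a 1D array of Python objects,
# where each element is its own independent 3D array.
ragged = np.empty(len(mixed_shape), dtype=object)
for i, img in enumerate(mixed_shape):
    ragged[i] = img
print(ragged.shape, ragged.dtype)     # (2,) object
print(ragged[0].shape)                # (4, 5, 3)
```

This matches the output in the question: a shape of (130,) for the container and a separate 3D shape for each element.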

In order to check if your array of images, x_dataset, has images of different sizes, run:

minx = 10000
miny = 10000
minz = 10000
maxx = 0
maxy = 0
maxz = 0

for image in x_dataset:
    if np.shape(image)[0] < minx:
        minx = np.shape(image)[0]
    if np.shape(image)[1] < miny:
        miny = np.shape(image)[1]
    if np.shape(image)[2] < minz:
        minz = np.shape(image)[2]
    if np.shape(image)[0] > maxx:
        maxx = np.shape(image)[0]
    if np.shape(image)[1] > maxy:
        maxy = np.shape(image)[1]
    if np.shape(image)[2] > maxz:
        maxz = np.shape(image)[2] 

print(minx)
print(miny)
print(minz)
print(maxx)
print(maxy)
print(maxz)
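A shorter way to run the same check is to collect the distinct shapes in a set. This is only a sketch: distinct_shapes is a hypothetical helper name, and the dummy arrays stand in for the decoded images.

```python
import numpy as np

def distinct_shapes(images):
    """Return the set of shapes found in a sequence of arrays."""
    return {np.shape(img) for img in images}

# Dummy data: two images of 289 rows and one of 288 rows.
images = [np.zeros((289, 200, 3), dtype=np.uint8),
          np.zeros((288, 200, 3), dtype=np.uint8),
          np.zeros((289, 200, 3), dtype=np.uint8)]

shapes = distinct_shapes(images)
print(shapes)             # more than one entry means the sizes differ
print(len(shapes) == 1)   # False for this dummy data
```

If the set has exactly one element, every image has the same shape and the list can be converted to a regular array.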

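Then, if the images have different dimensions, bring them to a common shape with np.resize. Note that np.resize only truncates or repeats the flattened pixel values to fit the new shape; it does not rescale the picture: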

for i in range(0,len(x_dataset)):
    x_dataset[i] = np.resize(x_dataset[i], (minx,miny,minz))
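One caveat about the loop above: np.resize refits the flattened values into the new shape, so the pixels no longer line up with their original rows. The sketch below shows this on a tiny dummy array and, as one possible alternative not taken from the answer, crops with slicing instead.

```python
import numpy as np

# Tiny stand-in for a single-channel image: 3 rows, 4 columns.
img = np.arange(12).reshape(3, 4)

# np.resize takes the first 6 flattened values, so rows get shifted.
refit = np.resize(img, (2, 3))
print(refit.tolist())     # [[0, 1, 2], [3, 4, 5]]

# Slicing keeps the top-left region with its rows intact.
cropped = img[:2, :3]
print(cropped.tolist())   # [[0, 1, 2], [4, 5, 6]]
```

Cropping discards the pixels outside the kept region. To scale the whole picture instead, an image library function such as cv2.resize would be the usual choice; keep in mind that it takes the target size as (width, height).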

If the problem persists, x_dataset is probably still an object array: assigning resized images to its elements does not change its dtype or shape. Rebuild it by converting it to a list and back to an array:

x_dataset = np.array(x_dataset.tolist())
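The sketch below reproduces the situation from the question's edits with dummy data: the elements of an object array are made equal in shape, the container still reports a 1D shape, and rebuilding it gives the 4D array. It uses np.stack as an alternative to the tolist() round trip; the sizes are made up for the example.

```python
import numpy as np

# Object array holding dummy images of two different shapes.
x_dataset = np.empty(3, dtype=object)
x_dataset[0] = np.zeros((5, 4, 3), dtype=np.uint8)
x_dataset[1] = np.zeros((4, 4, 3), dtype=np.uint8)
x_dataset[2] = np.zeros((5, 4, 3), dtype=np.uint8)

# Make all shapes equal in place (cropping here, for simplicity).
for i in range(len(x_dataset)):
    x_dataset[i] = x_dataset[i][:4, :4]

# The container itself is unchanged: still 1D with dtype object.
print(x_dataset.shape, x_dataset.dtype)   # (3,) object

# Rebuild a regular array from the individual elements.
stacked = np.stack(list(x_dataset))
print(stacked.shape, stacked.dtype)       # (3, 4, 4, 3) uint8
```

np.stack raises an error if the shapes still differ, which makes a leftover mismatch easy to spot.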

edited May 16, 2020 at 16:59

Antoine Neidecker

answered May 16, 2020 at 14:55

V. Ayrat

6 Comments

Antoine Neidecker Over a year ago

Thanks for your answer V. Ayrat. I followed your advice but there seems to be another problem as well. I added my progress in the question. Would you have any other ideas?

V. Ayrat Over a year ago

Try x_dataset = np.array(x_dataset). If all sizes are the same x_dataset.shape should be (130, 288, 200, 3).

V. Ayrat Over a year ago

I am just not sure what the type of x_dataset is after all your changes. If it is still a list, x_dataset = np.array(x_dataset) should work.

Antoine Neidecker Over a year ago

I just checked your suggestions but it doesn't seem to be the problem.... The checking is in the Edit2 section

V. Ayrat Over a year ago

These lists and arrays have become very entangled. Can you try x_dataset = np.array(x_dataset.tolist())?
