13

I have a large image dataset. When I use the images, I have several components: a mirrored image, a regular image, an eigenvector matrix, and an eigenvalue vector.

I would like to store it like:

training_sunsets_data = [cropped_training_sunsets,
                         mirrored_training_sunsets,
                         rgb_cov_eigvec_training_sunsets,
                         rgb_cov_eigval_training_sunsets]

np.save('training_sunsets_data',training_sunsets_data)

And as I was writing this I was testing it (because I was sure it would fail), and the strangest thing happened when I did this: it worked.

Further, when I loaded it back into the code, it was of type ndarray, yet it was a jagged array.


How is this possible if numpy does not allow jagged multidimensional arrays? Did I just find a backdoor way to create a jagged array in numpy?

3
  • Can you post the jagged ndarray you got from loading the file? I am curious to see what it looks like. Commented Apr 22, 2016 at 23:37
  • Look at np.savez. That saves each array by name in a file, and collects them in a zip archive. np.load handles that kind of archive. Commented Apr 22, 2016 at 23:49
  • There's a very good answer here: stackoverflow.com/questions/48603119/… Commented Sep 4, 2019 at 4:42

3 Answers

13

After testing on my machine:

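  import numpy as np
  # Recent NumPy releases need an explicit object array here, and
  # np.load needs allow_pickle=True to read object arrays back.
  np.save('testnp.npy', np.array([[2,3,4],[1,2]], dtype=object))
  np.load('testnp.npy', allow_pickle=True)
  #   array([list([2, 3, 4]), list([1, 2])], dtype=object)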

As the example shows, the loaded object is of type ndarray, but its dtype is object. That means np.save stores an array of Python objects, which can be anything. According to the documentation, it uses Python's pickle to serialize those objects.

So you didn't find a backdoor; it behaves just as expected.
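To make the point concrete, here is a minimal sketch of the round trip, assuming a recent NumPy where the ragged list has to be wrapped in an object array explicitly and `allow_pickle=True` must be passed to `np.load`. The temporary directory and file name are only for illustration. What comes back is a one-dimensional array holding two Python lists, not a jagged two-dimensional array.

```python
import os
import tempfile

import numpy as np

# Wrap the ragged nested list explicitly as a 1-D object array.
ragged = np.array([[2, 3, 4], [1, 2]], dtype=object)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testnp.npy")  # illustrative file name
    np.save(path, ragged)                    # object arrays are pickled into the .npy file
    loaded = np.load(path, allow_pickle=True)

print(loaded.shape)     # (2,)
print(loaded.dtype)     # object
print(type(loaded[0]))  # <class 'list'>
```

The shape `(2,)` is the giveaway: NumPy only sees two opaque objects and knows nothing about their lengths.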

1 Comment

...and you could as well use pickle to store sets/sequences of arrays, if you wish.
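A rough sketch of the pickle route mentioned in this comment, assuming a plain Python list of arrays with different shapes. The example serializes to bytes in memory; writing to a file with `pickle.dump` would work the same way.

```python
import pickle

import numpy as np

# Two arrays with different shapes, kept in an ordinary Python list.
arrays = [np.arange(6).reshape(2, 3), np.arange(4).reshape(2, 2)]

# Serialize to bytes and restore; the container stays a list.
blob = pickle.dumps(arrays, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

print(type(restored))               # <class 'list'>
print([a.shape for a in restored])  # [(2, 3), (2, 2)]
```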
2

np.savez() would work in your situation. Save each array as a named variable.
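As a sketch of what that could look like: the array shapes and keyword names below are invented stand-ins for the four components in the question, and the temporary directory is only there to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

# Placeholder data; the real arrays would come from the image pipeline.
cropped = np.zeros((4, 8, 8, 3))
mirrored = cropped[:, :, ::-1, :]
eigvec = np.eye(3)
eigval = np.ones(3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training_sunsets_data.npz")
    # Each keyword becomes the name of one array inside the archive.
    np.savez(path, cropped=cropped, mirrored=mirrored,
             eigvec=eigvec, eigval=eigval)
    with np.load(path) as data:
        names = sorted(data.files)
        cropped_shape = data["cropped"].shape
        eigvec_loaded = data["eigvec"]

print(names)          # ['cropped', 'eigval', 'eigvec', 'mirrored']
print(cropped_shape)  # (4, 8, 8, 3)
```

Because every array keeps its own shape and dtype, nothing is stored as a pickled object and `allow_pickle` is not needed on load.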

0

To see what you are getting at, let's run some code.

>>> import numpy as np
>>> a = [np.array([[1,2,3],[4,5,6]]), np.array([[1,2],[3,4]])]
>>> type(a)
<type 'list'>
>>> np.array(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,3) into shape (2)

We see here that we are perfectly able to make a list of np.arrays with different shapes. We cannot, however, convert that list into a single np.array.

I suspect, based on your syntax, that you are saving a list and loading a list, with each element keeping its np.array type.
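If a single ndarray container is still wanted for arrays like these, one approach that I believe works across NumPy versions is to allocate an empty object array and fill it slot by slot. This is only a sketch; the name `container` is made up for the example.

```python
import numpy as np

a = [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[1, 2], [3, 4]])]

# Allocate a 1-D object array, then store each array as an opaque element.
container = np.empty(len(a), dtype=object)
for i, arr in enumerate(a):
    container[i] = arr

print(container.shape)     # (2,)
print(container[0].shape)  # (2, 3)
print(container[1].shape)  # (2, 2)
```

Filling element by element avoids the broadcasting attempt that triggers the ValueError above, since NumPy never tries to merge the inner shapes.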
