
I have several videos, which I have loaded frame by frame into a numpy array of arrays. For example, if I have 8 videos, they are converted into an 8-element numpy object array, where each inner array has a different first dimension depending on the number of frames of the individual video. When I print

array.shape

my output is (8,)

Now I would like to create a dataloader for this data, and for that I would like to convert this numpy array into a torch tensor. However, when I try to convert it using torch.from_numpy or even simply torch.tensor, I get the error

TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.

which I assume is because my inner arrays are of different sizes. One possible solution is to artificially add a dimension to my videos to make them the same size and then use np.stack, but that may lead to problems later on. Is there any better solution?

Edit: Actually adding a dimension won't work because np.stack requires all dimensions to be the same.

Edit: A sample array would be something like:

[ [1,2,3], [1,2], [1,2,3,4] ]

This is stored as a (3,)-shaped numpy array. The real arrays are actually 4-dimensional (Frames x Height x Width x Channels), so this is just an example.
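For reference, an object array like this, and the failing conversion, can be reproduced with a minimal sketch (the values are just the toy example above):

import numpy as np
import torch

# ragged inner arrays force numpy to fall back to dtype=object
ary = np.array([np.array([1, 2, 3]), np.array([1, 2]), np.array([1, 2, 3, 4])],
               dtype=object)
print(ary.shape)  # (3,)

try:
    torch.from_numpy(ary)  # object arrays are not supported by torch
except TypeError as e:
    print(e)  # can't convert np.ndarray of type numpy.object_ ...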

  • Please add a sample array. Commented May 23, 2020 at 9:36
  • One option is to adjust all sizes to be the same by padding with zeros. A second and, in my opinion, better way is to just create a separate tensor for each video and store them in a list. Commented May 23, 2020 at 10:31
  • @V.Ayrat Yeah, for the moment I guess I am going to create a list and then use a dataloader on the list (sketched below). Commented May 23, 2020 at 12:41
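A minimal sketch of that list-plus-dataloader idea, assuming a map-style DataLoader with a collate_fn that skips the default stacking (the shapes here are made up):

import torch
from torch.utils.data import DataLoader

# each video stays its own (Frames, H, W, C) tensor; frame counts differ
videos = [torch.randn(n, 32, 32, 3) for n in (10, 7, 12)]

# returning the batch as a plain list avoids stacking ragged tensors
loader = DataLoader(videos, batch_size=2, collate_fn=lambda batch: batch)

for batch in loader:
    print([v.shape for v in batch])  # tensors with differing frame counts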

1 Answer

You can use the RNN utility function torch.nn.utils.rnn.pad_sequence to pad them to the same size.

ary
array([list([1, 2, 3]), list([1, 2]), list([1, 2, 3, 4])], dtype=object)

import torch
from torch.nn.utils.rnn import pad_sequence

# convert each ragged inner array to a tensor, then zero-pad to the longest
t = pad_sequence([torch.tensor(x) for x in ary], batch_first=True)

t
tensor([[1, 2, 3, 0],
        [1, 2, 0, 0],
        [1, 2, 3, 4]])
t.shape
torch.Size([3, 4])
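The same call should also cover the 4-dimensional case from the question (Frames x Height x Width x Channels), since pad_sequence pads along the first axis whenever the trailing dimensions match; a sketch with made-up sizes:

clips = [torch.randn(10, 64, 64, 3), torch.randn(7, 64, 64, 3)]
batch = pad_sequence(clips, batch_first=True)  # pads the shorter clip with zeros

batch.shape
torch.Size([2, 10, 64, 64, 3])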

1 Comment

This would work fine in most cases; however, I would like to avoid having to modify my images in any way before passing them into the network, apart from normalization.
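If padding itself is acceptable and only its influence on the network is the concern, one common workaround (a sketch, not part of the original answer) is to keep the true frame counts next to the padded batch and mask the padded frames out downstream:

import torch
from torch.nn.utils.rnn import pad_sequence

clips = [torch.randn(10, 64, 64, 3), torch.randn(7, 64, 64, 3)]  # hypothetical videos
lengths = torch.tensor([c.shape[0] for c in clips])  # real frame count per clip
batch = pad_sequence(clips, batch_first=True)        # (2, 10, 64, 64, 3)

# boolean mask marking which frames are real, per clip: shape (2, 10)
mask = torch.arange(batch.shape[1])[None, :] < lengths[:, None]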
