I have a list of NumPy arrays with different shapes.

I need to create a Dataset such that each time an element is requested, I get a tensor with the shape and values of the corresponding NumPy array.

How can I achieve this?

This does NOT work:

dataset = tf.data.Dataset.from_tensor_slices(list_of_arrays)

since you get, as expected:

Can't convert non-rectangular Python sequence to Tensor.

p.s. I know that it will not be possible to batch a Dataset with elements of different shapes.

2 Answers

3

I've accepted the solution from Timbus Calin since it is the most compact, but I've found another way that provides a lot of flexibility, and it's worth mentioning here.

It's based on generators:

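import tensorflow as tf

def create_generator(list_of_arrays):
    for array in list_of_arrays:
        yield array  # one array per element; shapes may differ

# (None, 4): variable number of rows, 4 columns per array
dataset = tf.data.Dataset.from_generator(
    lambda: create_generator(list_of_arrays),
    output_types=tf.float32, output_shapes=(None, 4))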
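
As a side note, newer TensorFlow 2.x releases let you describe the elements with a single output_signature argument instead of output_types and output_shapes. The sketch below is not part of the original answer: the sample arrays are made up for illustration, and the shape (None, None) is an assumption that every array is 2-D with both dimensions free to vary.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sample data: 2-D float32 arrays with differing shapes.
list_of_arrays = [
    np.zeros((2, 4), dtype=np.float32),
    np.ones((5, 4), dtype=np.float32),
    np.full((3, 7), 2.0, dtype=np.float32),
]

def create_generator(arrays):
    # Yield one array per dataset element.
    for array in arrays:
        yield array

# (None, None) leaves both dimensions unspecified, so each element
# keeps the shape of the array it came from.
dataset = tf.data.Dataset.from_generator(
    lambda: create_generator(list_of_arrays),
    output_signature=tf.TensorSpec(shape=(None, None), dtype=tf.float32),
)

# Collect the shape of every element to confirm nothing was padded or reshaped.
element_shapes = [tuple(element.shape.as_list()) for element in dataset]
```

If the arrays also differ in rank, a fixed-rank TensorSpec like this one will presumably no longer fit, and the signature would need to be loosened accordingly.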
2

Have you tried converting initially to a ragged tensor?

tensor_with_from_dimensions = tf.ragged.constant([[1, 2], [3], [4, 5, 6]])

Bear in mind that:

All scalar values in pylist must have the same nesting depth K, and the returned RaggedTensor will have rank K. If pylist contains no scalar values, then K is one greater than the maximum depth of empty lists in pylist. All scalar values in pylist must be compatible with dtype.

You can read more about it here: https://www.tensorflow.org/api_docs/python/tf/ragged/constant
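
To connect this with the original question, here is a minimal sketch of feeding a ragged tensor into a Dataset. It assumes 1-D arrays of different lengths (the sample values are made up) and a TensorFlow 2.x version in which from_tensor_slices accepts a RaggedTensor. The arrays are converted to nested lists first so that the nesting-depth rule quoted above clearly applies.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sample data: 1-D arrays of different lengths.
list_of_arrays = [np.array([1, 2]), np.array([3]), np.array([4, 5, 6])]

# Build one RaggedTensor whose rows are the individual arrays.
ragged = tf.ragged.constant([a.tolist() for a in list_of_arrays])

# Slice along the first axis: one row per dataset element.
dataset = tf.data.Dataset.from_tensor_slices(ragged)

# Each element comes back with the length of the row it came from.
rows = [element.numpy().tolist() for element in dataset]
```

For arrays with more than one dimension, the ragged structure gets more involved, so the generator approach may be easier to reason about in that case.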
