Passing Multiple Arguments to Generator in tf.data.Dataset.from_generator

Question

I have a folder containing images of various sizes. I need to resize them all to (160, 128) and create a tf.data.Dataset from them, so I used tf.data.Dataset.from_generator. However, I can't figure out how to pass the generator any arguments other than the directory itself, such as target_size and class_mode. [NOTE: I am using TensorFlow 2.0 Beta1]

The args parameter takes tf.Tensor objects and uses them as arguments for the generator, so I tried passing all the arguments through it as a tuple:

# gen is the ImageDataGenerator
real_imgs_dataset = tf.data.Dataset.from_generator(
    gen.flow_from_directory,
    args=(
        data_path,   # directory
        (160, 128),  # target_size
        'rgb',       # color_mode
        None,        # classes
        None,        # class_mode
        32,          # batch_size
        True,        # shuffle
    ),
    output_types=tf.float32,
    output_shapes=([None, 160, 128, 3])
)

What I wanted was for the generator to read the images and yield them batch by batch, so that they could be wrapped in a tf.data.Dataset. However, when I run the snippet above, I get this error:

"ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor"

dhassault · Accepted Answer · 2019-10-09 00:48:44Z

2

I had a similar issue, which I reported here: https://github.com/tensorflow/tensorflow/issues/33133

The proposed solution adapted to your case is:

gen_mod = gen.flow_from_directory(
    directory=data_path,
    target_size=(160, 128),
    class_mode=None,
    batch_size=32,
    shuffle=True,
)

real_imgs_dataset = tf.data.Dataset.from_generator(
    lambda: gen_mod,
    output_types=tf.float32,
    output_shapes=([None, 160, 128, 3])
)
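The reason this works is that the arguments are bound in plain Python before from_generator ever sees them, so values such as None never need to be converted to tensors. A minimal sketch of the same idea using functools.partial follows. Note that flow_stub is a hypothetical stand-in for gen.flow_from_directory and "images/" is a placeholder path, chosen so that the snippet runs without TensorFlow or any image files:

```python
from functools import partial


def flow_stub(directory, target_size=(256, 256), class_mode="categorical",
              batch_size=32):
    """Hypothetical stand-in for gen.flow_from_directory.

    Yields a tuple describing each batch instead of real image data.
    """
    for batch_index in range(2):
        yield (directory, target_size, class_mode, batch_size, batch_index)


# Bind every argument, including None, on the Python side.
# The result is a zero-argument callable that returns an iterable.
make_batches = partial(
    flow_stub,
    "images/",            # placeholder directory (assumption)
    target_size=(160, 128),
    class_mode=None,
    batch_size=32,
)

batches = list(make_batches())
```

A callable built this way should be usable wherever from_generator expects a zero-argument generator function, in place of the lambda, although that has not been verified against every TensorFlow version.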

Hope it helped!


Yukimura Ryuji · Accepted Answer · 2021-09-14 01:39:38Z

0

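The lambda solution worked only in part. In my current environment (TensorFlow 2.6.0), I observed that the generated values are not reset between epochs: lambda: gen_mod returns the same iterator object every time it is called, so its state carries over from one epoch to the next.

As a workaround, wrap the generator in a function so that a new iterator is created each time the dataset is iterated: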

def new_gen():
    gen_mod = gen.flow_from_directory(
        directory=data_path,
        target_size=(160, 128),
        class_mode=None,
        batch_size=32,
        shuffle=True,
    )

    for x in gen_mod:
        yield x

real_imgs_dataset = tf.data.Dataset.from_generator(
    new_gen,
    output_types=tf.float32,
    output_shapes=([None, 160, 128, 3])
)
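The difference between the two approaches can be reproduced in plain Python. The sketch below is only an illustration: endless_batches is a hypothetical stand-in for the Keras directory iterator, assuming that the iterator loops indefinitely and keeps its position between calls.

```python
import itertools


def endless_batches():
    """Hypothetical stand-in for an iterator that never stops yielding."""
    step = 0
    while True:
        yield step
        step += 1


# Approach 1: the lambda hands back the same iterator object on every call.
shared = endless_batches()
same_iterator = lambda: shared


# Approach 2: a generator function builds a new iterator on every call.
def fresh_iterator():
    for batch in endless_batches():
        yield batch


# Take three "batches" per "epoch" from each approach.
epoch1_shared = list(itertools.islice(same_iterator(), 3))
epoch2_shared = list(itertools.islice(same_iterator(), 3))
epoch1_fresh = list(itertools.islice(fresh_iterator(), 3))
epoch2_fresh = list(itertools.islice(fresh_iterator(), 3))

print(epoch1_shared, epoch2_shared)  # [0, 1, 2] [3, 4, 5]
print(epoch1_fresh, epoch2_fresh)    # [0, 1, 2] [0, 1, 2]
```

With the shared iterator the second pass continues where the first one stopped, while the generator function starts over each time.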
