How to read numerical data from CSV in PyTorch?

Question

I'm new to PyTorch and am trying to implement a model I developed in TF so I can compare the results. The model is an autoencoder. The input data is a CSV file containing n samples, each with m features (an n*m numerical matrix). The targets (the labels) are in another CSV file with the same format as the input file. I've been looking online but couldn't find good documentation for reading non-image data from a CSV file with multiple labels. Any idea how I can read my data and iterate over it during training?

Thank you

Why don't you use pandas to load the dataset and then PyTorch classes to wrap it in tensors?
– inverted_index, Commented May 7, 2020 at 19:08
Thanks for your comment! I'm looking for something similar to tf.data.experimental.make_csv_dataset in TF, so I can shuffle and stream the data without needing to create batches manually.
– khemedi, Commented May 7, 2020 at 19:14

Cecilia · Accepted Answer · 2020-05-07 23:12:11Z

3

Might you be looking for something like TabularDataset?

class torchtext.data.TabularDataset(path, format, fields, skip_header=False, csv_reader_params={}, **kwargs)

Defines a Dataset of columns stored in CSV, TSV, or JSON format.

It will take a path to a CSV file and build a dataset from it. You also need to specify the names of the columns which will then become the data fields.

In general, the implementations of torch.utils.data.Dataset for specific types of data live outside of PyTorch itself, in the torchvision, torchtext, and torchaudio libraries.
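If all you need is shuffling and automatic batching over purely numerical CSV files, the core torch.utils.data classes may already be enough, and TabularDataset comes from torchtext's older API and may not be available in newer releases. Here is a minimal sketch of that approach, not a definitive implementation. It assumes both files are header-less, comma-separated, and small enough to fit in memory, so it does not stream from disk. The function name make_csv_loader and its parameters are hypothetical names chosen for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_csv_loader(features_path, targets_path, batch_size=32, shuffle=True):
    """Load two header-less numeric CSV files and wrap them in a DataLoader.

    Hypothetical helper: assumes both files have one sample per row and
    the same number of rows.
    """
    # ndmin=2 keeps a (rows, columns) shape even for a single row or column.
    features = np.loadtxt(features_path, delimiter=",", dtype=np.float32, ndmin=2)
    targets = np.loadtxt(targets_path, delimiter=",", dtype=np.float32, ndmin=2)
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same number of rows")

    # TensorDataset pairs row i of the features with row i of the targets.
    dataset = TensorDataset(torch.from_numpy(features), torch.from_numpy(targets))
    # DataLoader handles batching, and reshuffles every epoch when shuffle=True.
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
```

During training you would then iterate with something like `for x_batch, y_batch in make_csv_loader("inputs.csv", "targets.csv"):` (the file names are placeholders). If your files have a header row, `pandas.read_csv(path).to_numpy(dtype="float32")` can replace the `np.loadtxt` calls.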


