I am using PyTorch 1.8 and Python 3.8 to read images from a folder using the following code:

import torch
from torchvision import datasets, transforms

print(f"PyTorch version: {torch.__version__}")
# PyTorch version: 1.8.1

# Device configuration-
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"currently available device: {device}")
# currently available device: cpu


# Define transformations for training and test sets-
transform_train = transforms.Compose(
    [
      # transforms.RandomCrop(32, padding = 4),
      # transforms.RandomHorizontalFlip(),
      transforms.ToTensor(),
      # transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
     ]
     )

transform_test = transforms.Compose(
    [
      transforms.ToTensor(),
      # transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
     ]
     )

# Define directory containing images-
data_dir = 'My_Datasets/Cat_Dog_data/'

# Define datasets-
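# data_dir already ends with '/', and the transforms defined above are
# named transform_train / transform_test.
train_data = datasets.ImageFolder(data_dir + 'train',
                                  transform = transform_train)
test_data = datasets.ImageFolder(data_dir + 'test',
                                 transform = transform_test)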

print(f"number of train images = {len(train_data)} & number of validation images = {len(test_data)}")
# number of train images = 22500 & number of validation images = 2500

print(f"number of training classes = {len(train_data.classes)} & number of validation classes = {len(test_data.classes)}")
# number of training classes = 2 & number of validation classes = 2

# Define data loaders-
trainloader = torch.utils.data.DataLoader(train_data, batch_size = 32)
testloader = torch.utils.data.DataLoader(test_data, batch_size = 32)

len(trainloader), len(testloader)
# (704, 79)

# Sanity check-
len(train_data) / 32, len(test_data) / 32

You can iterate through the training data using 'trainloader' as follows:

for img, lab in trainloader:
   print(img.shape, lab.shape)

However, I also want the file name and the file path of each image that is read. How can I achieve this?

Thanks!

2 Answers

The default ImageFolder dataset stores every image as a (path, class_index) tuple in self.samples. All you need to do is subclass it and override __getitem__ so that it returns the path along with the image and label.

2 Comments

It's for an image dataset stored on the local system. I have edited the code above.
sample_fnames, label = dataloaders_dict['test'].dataset.samples[i] gives me only 110 filenames which is my "number of images / batch size=512". How can I get the name of all of my images in the test loader? stackoverflow.com/questions/71430015/…

It would be useful if you could show us how you implemented your data loader.

If that is not possible, these two guides should help you understand how to customize the data returned by __getitem__:

reference 1: Multi-Class Classification Using PyTorch: Preparing Data (check Page 2 to see how __getitem__ is defined)

reference 2: Multi-Class Classification Using PyTorch: Training (check Page 2 to see how to use it)

What I would do is add the path and the file name to the dictionary returned by __getitem__ (taken from reference 1).

(modified from reference 1)

def __getitem__(self, idx):

  path = self.path[idx]
  fileName = self.fileName[idx]
  preds = self.x_data[idx]
  trgts = self.y_data[idx]

  sample = { 
    'predictors' : preds,
    'targets' : trgts,
    'path': path,
    'fileName': fileName
  }
  return sample

Then, when you need these values in the training loop, just use the keys to access them.

(modified from reference 2)

for (batch_idx, batch) in enumerate(train_ldr):

    X = batch['predictors']   
    Y = batch['targets']
    path = batch['path']
    fileName = batch['fileName']

    optimizer.zero_grad()
    oupt = net(X)
    # .....
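
Putting the two fragments above together, a self-contained sketch might look like the following. The class name PathAwareDataset, the constructor arguments, and the example file paths are assumptions made for illustration (the referenced guides build x_data and y_data differently), and random tensors stand in for real predictors.

```python
import os

import torch
from torch.utils.data import DataLoader, Dataset


class PathAwareDataset(Dataset):
    """Dict-returning dataset that carries path and file name (hypothetical name)."""

    def __init__(self, x_data, y_data, file_paths):
        self.x_data = x_data
        self.y_data = y_data
        # Split each full path into its directory and its file name.
        self.path = [os.path.dirname(p) for p in file_paths]
        self.fileName = [os.path.basename(p) for p in file_paths]

    def __len__(self):
        return len(self.x_data)

    def __getitem__(self, idx):
        return {
            'predictors': self.x_data[idx],
            'targets': self.y_data[idx],
            'path': self.path[idx],
            'fileName': self.fileName[idx],
        }


# Made-up paths and random data, purely for demonstration.
file_paths = [f"data/train/img_{i}.png" for i in range(5)]
dataset = PathAwareDataset(torch.randn(5, 4), torch.arange(5), file_paths)
train_ldr = DataLoader(dataset, batch_size=2)

seen_files = []
for batch in train_ldr:
    X = batch['predictors']   # tensors are stacked along the batch dimension
    Y = batch['targets']
    # String fields come back as a list with one entry per sample.
    seen_files.extend(batch['fileName'])
    print(X.shape, Y.shape, batch['path'], batch['fileName'])
```

Because the default collate function handles dictionaries key by key, no custom collate_fn should be needed for this layout.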
