
I am running a Python program using PyTorch. I use my own dataset class rather than torch.utils.data.Dataset, and I load the data from a pickle file produced by a feature-extraction step. But the following error appears:

Traceback (most recent call last):
  File "C:\Users\hp\Downloads\efficient_densenet_pytorch-master\demo-emotion.py", line 326, in <module>
    fire.Fire(demo)
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\fire\core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\fire\core.py", line 468, in _Fire
    target=component.__name__)
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "C:\Users\hp\Downloads\efficient_densenet_pytorch-master\demo-emotion.py", line 304, in demo
    train(model,train_set1, valid_set=valid_set, test_set=test1, save=save, n_epochs=n_epochs,batch_size=batch_size,seed=seed)
  File "C:\Users\hp\Downloads\efficient_densenet_pytorch-master\demo-emotion.py", line 172, in train
    n_epochs=n_epochs,
  File "C:\Users\hp\Downloads\efficient_densenet_pytorch-master\demo-emotion.py", line 37, in train_epoch
    loader=np.asarray(list(loader))
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
    data = self._next_data()
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\hp\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataset.py", line 257, in __getitem__
    return self.dataset[self.indices[idx]]
TypeError: 'DataLoader' object is not subscriptable

The code is:

train_set1 = Owndata()

train1, test1 = train_set1.get_splits()
# prepare data loaders
train_dl = torch.utils.data.DataLoader(train1, batch_size=32, shuffle=True)
test_dl = torch.utils.data.DataLoader(test1, batch_size=1024, shuffle=False)
test_set1 = Owndata()
if valid_size:
    valid_set = Owndata()
    indices = torch.randperm(len(train_set1))
    train_indices = indices[:len(indices) - valid_size]
    valid_indices = indices[len(indices) - valid_size:]
    train_set1 = torch.utils.data.Subset(train_dl, train_indices)
    valid_set = torch.utils.data.Subset(valid_set, valid_indices)
else:
    valid_set = None
model = DenseNet(
    growth_rate=growth_rate,
    block_config=block_config,
    num_classes=10,
    small_inputs=True,
    efficient=efficient,
)
train(model, train_set1, valid_set=valid_set, test_set=test1, save=save, n_epochs=n_epochs, batch_size=batch_size, seed=seed)

Any help is appreciated!

4 Answers


The error is not raised by the line you think; it comes from inside the train function, which you are not showing.

You are confusing two things:

  • torch.utils.data.Dataset is indexable (dataset[5] works fine, for example). It is a simple object that defines how to fetch a single sample of data.
  • torch.utils.data.DataLoader is not indexable, only iterable. It usually returns batches of data from the above Dataset and can load them in parallel using num_workers. The DataLoader is what you are trying to index, while you should be indexing the dataset; see the sketch after this list.
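
A minimal sketch of the difference, using a made-up toy dataset (ToyDataset and its data are hypothetical, standing in for your Owndata):

import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical stand-in for a custom dataset like Owndata."""
    def __init__(self):
        self.x = torch.arange(10, dtype=torch.float32)
        self.y = torch.arange(10)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

dataset = ToyDataset()
sample = dataset[5]             # fine: Dataset defines __getitem__

loader = DataLoader(dataset, batch_size=4)
batch = next(iter(loader))      # fine: DataLoader is iterable
# loader[0]                     # TypeError: 'DataLoader' object is not subscriptable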

Please see the PyTorch documentation on data loading to get a better grasp of how these work.
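
In the code from the question, the failing call is Subset(train_dl, train_indices): Subset stores its first argument and indexes it inside __getitem__ (that is the self.dataset[self.indices[idx]] line in your traceback), which a DataLoader does not support. A sketch of the fix, assuming Owndata is a regular Dataset:

# Split the dataset, not the DataLoader.
train_set1 = torch.utils.data.Subset(train_set1, train_indices)
valid_set = torch.utils.data.Subset(valid_set, valid_indices)

# Build the loader from the subset afterwards.
train_dl = torch.utils.data.DataLoader(train_set1, batch_size=32, shuffle=True)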


I hope everyone learning PyTorch, like me, can solve this problem.

Try this:

img, label = next(iter(dataloader))

It is the equivalent of imgs, labels = dataloader[0], which DataLoader does not support.

If you want to loop over the dataloader instead of printing a single image and label, try the following:

for data in dataloader:
    imgs, target = data
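
If you also need the batch index inside the loop, enumerate works as usual. A small sketch, assuming the loader yields (imgs, target) pairs as above:

for batch_idx, (imgs, target) in enumerate(dataloader):
    print(batch_idx, imgs.shape)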


Let's say the dataloader is defined over the dataset with a batch size of 4:

dataloader = torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True)

For a small dataset, if you want to get the i-th batch of images and labels from the dataloader, you can do:

images, labels = list(dataloader)[i]

For a large dataset, where materializing every batch in a list would be wasteful, you can do:

dataiter = iter(dataloader)
images, labels = next(x for j,x in enumerate(dataiter) if j==i)
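
The same skip-to-the-i-th-batch idea can also be written with itertools.islice (a stylistic alternative, not part of the original answer):

from itertools import islice

dataiter = iter(dataloader)
images, labels = next(islice(dataiter, i, None))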


Building on Ka Wa Yip's solution, you can define a __getitem__ for DataLoader:

from torch.utils.data import DataLoader

def __getitem__(self, i):
    # Re-iterate the loader from the start and return the i-th batch.
    dataiter = iter(self)
    return next(x for j, x in enumerate(dataiter) if j == i)

# Monkey-patch the class so every DataLoader instance becomes indexable.
DataLoader.__getitem__ = __getitem__

images, labels = dataloader[i]
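
Note that each access re-iterates the loader from the beginning, so dataloader[i] costs i+1 batches of work per call, and with shuffle=True the i-th batch changes on every access.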
