
Is there a way to load a PyTorch DataLoader (torch.utils.data.DataLoader) entirely onto my GPU?

Currently, I load every batch into the GPU separately:

CTX = torch.device('cuda')

train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=0,
)

net = Net().to(CTX)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=LEARNING_RATE)

for epoch in range(EPOCHS):
    for inputs, labels in train_loader:
        inputs = inputs.to(CTX)        # this is where the data is loaded into GPU
        labels = labels.to(CTX)        

        optimizer.zero_grad()

        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    print(f'training accuracy: {net.validate(train_loader, device=CTX)}/{len(train_dataset)}')
    print(f'validation accuracy: {net.validate(test_loader, device=CTX)}/{len(test_dataset)}')

where the Net.validate() function is given by

def validate(self, val_loader, device=torch.device('cpu')):
    correct = 0
    for inputs, labels in val_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = torch.argmax(self(inputs), dim=1)
        correct += int(torch.sum(outputs==labels))
    return correct

I would like to improve the speed by loading the entire dataset from train_loader into my GPU at once, instead of loading every batch separately. So I would like to do something like

train_loader.to(CTX)

Is there an equivalent function for this? torch.utils.data.DataLoader does not have a .to() method.

I work with an NVIDIA GeForce RTX 2060 with CUDA Toolkit 10.2 installed.
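
For reference, here is a batch-wise variant of my loop that uses pinned host memory and asynchronous copies (pin_memory and non_blocking are standard PyTorch options). This can overlap the transfers with compute, but it still copies batch by batch rather than preloading everything:

# Sketch: same loop as above, but the DataLoader pins host memory so the
# host-to-device copies can run asynchronously and overlap with compute.
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=2,       # prepare batches in worker processes
    pin_memory=True,     # page-locked memory enables non_blocking copies
)

for epoch in range(EPOCHS):
    for inputs, labels in train_loader:
        inputs = inputs.to(CTX, non_blocking=True)
        labels = labels.to(CTX, non_blocking=True)
        # ... rest of the training step as above ...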

2 Comments

  • Why did you set num_workers to 0? If you want it to be faster, you should increase that number.
  • @TheodorPeifer num_workers=0 means the data is loaded in the main process rather than in worker subprocesses.

2 Answers


You can move the dataset's underlying tensors to the GPU in advance (note that .to() returns a new tensor, so assign the result back):

train_dataset.train_data = train_dataset.train_data.to(CTX)      # train_dataset.train_data is a Tensor (the input data)
train_dataset.train_labels = train_dataset.train_labels.to(CTX)

For example, with MNIST:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms
batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
train_data = datasets.MNIST(
    root='./dataset/minst/',
    train=True,
    download=False,
    transform=transform
)
train_loader = DataLoader(
    dataset=train_data,
    shuffle=True,
    batch_size=batch_size
)
train_data.train_data = train_data.train_data.to(torch.device("cuda:0"))  # put data into GPU entirely
train_data.train_labels = train_data.train_labels.to(torch.device("cuda:0"))
# edit note for newer versions: use train_data.data and train_data.targets instead

I found this solution by stepping through with the debugger.
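
A variation, assuming the whole normalized dataset fits in GPU memory, is to apply the preprocessing once and wrap the resulting GPU tensors in a TensorDataset, so a regular DataLoader then yields batches that are already on the GPU. This is a sketch; the normalization constants are the MNIST ones from the example above, and the variable names are only illustrative:

import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import datasets

device = torch.device("cuda:0")

mnist = datasets.MNIST(root='./dataset/minst/', train=True, download=False)

# Preprocess once: scale the raw uint8 images to [0, 1], normalize, move to GPU.
x = mnist.data.to(device).float().div_(255).sub_(0.1307).div_(0.3081)
x = x.unsqueeze(1)                       # shape (N, 1, 28, 28)
y = mnist.targets.to(device)

gpu_dataset = TensorDataset(x, y)
# keep num_workers=0: CUDA tensors should not be returned from worker processes
gpu_loader = DataLoader(gpu_dataset, batch_size=64, shuffle=True, num_workers=0)

Iterating gpu_loader then yields (inputs, labels) pairs that already live on the GPU, so any later .to(device) calls on them are effectively no-ops.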


3 Comments

  • Note: for more recent versions of torchvision, the attributes are data and targets rather than train_data and train_labels, e.g. train_data.data.to(torch.device("cuda:0")) and train_data.targets.to(torch.device("cuda:0")).
  • print(train_data.data.device) after this still gives me "cpu", and trying to assign to train_data.data blows up later with: RuntimeError: _share_filename_: only available on CPU
  • This doesn't seem to work when we're using a DataLoader...

In the "Wrapping DataLoader" section of this tutorial (https://pytorch.org/tutorials/beginner/nn_tutorial.html), each batch is moved onto the GPU inside the loader wrapper, so the training loop itself never needs to call .to(dev). The wrapper code is as follows:

def preprocess(x, y):
    return x.view(-1, 1, 28, 28).to(dev), y.to(dev)

train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
train_dl = WrappedDataLoader(train_dl, preprocess)
valid_dl = WrappedDataLoader(valid_dl, preprocess)
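
For reference, the WrappedDataLoader class used above is defined in that same tutorial roughly as follows (dev is assumed to be a torch.device for the GPU, and get_data simply builds the two plain DataLoaders):

class WrappedDataLoader:
    def __init__(self, dl, func):
        self.dl = dl
        self.func = func

    def __len__(self):
        return len(self.dl)

    def __iter__(self):
        for b in self.dl:
            yield (self.func(*b))

Note that this still moves one batch at a time; it just hides the .to(dev) calls inside the loader rather than the training loop.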

1 Comment

This is the only approach I found that works on Colab.
