CUDA out of memory error when reloading PyTorch model

Question

This is a common PyTorch error, but I'm seeing it under an unusual circumstance: when reloading a model, I get a CUDA out of memory error even though I haven't yet placed the model on the GPU.

model.load_state_dict(torch.load(model_file_path))
optimizer.load_state_dict(torch.load(optimizer_file_path))
# Error happens here ^, before I send the model to the device.
model = model.to(device_id)

Jacob Stern · Accepted Answer · 2021-08-05 21:56:13Z

9

The issue was that I was trying to load onto a new GPU (cuda:2), but the model and optimizer had originally been saved from a different GPU (cuda:0). Even though I never explicitly asked to reload onto the previous GPU, torch.load by default restores tensors to the device they were saved from, and cuda:0 happened to be occupied.

Adding map_location=device_id to each torch.load call fixed the problem:

model.to(device_id)
# load_state_dict works in place, so its return value is not assigned back.
model.load_state_dict(torch.load(model_file_path, map_location=device_id))
optimizer.load_state_dict(torch.load(optimizer_file_path, map_location=device_id))
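
For anyone who wants to try the pattern end to end, here is a minimal self-contained sketch. The helper name `load_checkpoint`, the tiny `nn.Linear` model, the SGD optimizer, and the temporary file names are all made up for illustration and are not from my actual setup. It uses the CPU as the target device so that it runs on a machine without a GPU; in my case the target would have been `"cuda:2"`.

```python
import os
import tempfile

import torch
import torch.nn as nn


def load_checkpoint(model, optimizer, model_path, optimizer_path, device):
    """Hypothetical helper: load both state dicts directly onto `device`."""
    model.to(device)
    # map_location remaps every saved tensor to `device` while loading,
    # instead of restoring it to the device it was saved from.
    model.load_state_dict(torch.load(model_path, map_location=device))
    optimizer.load_state_dict(torch.load(optimizer_path, map_location=device))
    return model, optimizer


# Assumption: CPU target so the sketch runs anywhere; swap in e.g. "cuda:2".
device = torch.device("cpu")

# Illustrative model and optimizer standing in for the real ones.
source_model = nn.Linear(4, 2)
source_optimizer = torch.optim.SGD(source_model.parameters(), lr=0.1, momentum=0.9)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_file_path = os.path.join(tmp_dir, "model.pt")
    optimizer_file_path = os.path.join(tmp_dir, "optimizer.pt")
    torch.save(source_model.state_dict(), model_file_path)
    torch.save(source_optimizer.state_dict(), optimizer_file_path)

    # Fresh objects, as after restarting the training script.
    restored_model = nn.Linear(4, 2)
    restored_optimizer = torch.optim.SGD(
        restored_model.parameters(), lr=0.1, momentum=0.9
    )
    restored_model, restored_optimizer = load_checkpoint(
        restored_model, restored_optimizer,
        model_file_path, optimizer_file_path, device,
    )
```

Passing the same `device` to both `torch.load` calls matters because the optimizer file can also contain tensors (for example momentum buffers) that would otherwise be restored to the device they were saved from.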

edited Aug 5, 2021 at 21:56

answered Aug 5, 2021 at 17:10

