
I am running into the classic CUDA out of memory error.

What I want to do: load the same model with a different embedding matrix each time. I have to do that 300 times, once for each dimension of the word embeddings.

I am not training the model, which is why I call model.eval(); I thought that would be enough to keep PyTorch from building a graph.

Please note that I never move the model, nor the data, to CUDA. In fact, I wanted to debug the code on the CPU before running it on a GPU.

The loop below executes once; a RuntimeError is raised in the second iteration.

My guess is that the code is loading a new model into GPU memory at each iteration (which I did not know was possible without explicitly telling it to do so). The emb_matrix is quite heavy and could exhaust the GPU's memory.

emb_dim = 300
acc_dim = torch.zeros((emb_dim, 4))
for d in range(emb_dim):

    #create embeddings with one dimension shuffled
    emb_matrix = text_f.vocab.vectors.clone()

    #get a random permutation across one of the dimensions
    rand_index = torch.randperm(text_f.vocab.vectors.shape[0])
    emb_matrix[:, d] = text_f.vocab.vectors[rand_index, d]

    #load model with the scrambled embeddings
    model = load_classifier(emb_matrix, 
                            encoder_type = encoder_type)
    model.eval()
    for batch in batch_iters["test"]:
        x_pre = batch.premise
        x_hyp = batch.hypothesis
        y = batch.label

        #perform forward pass
        y_pred = model.forward(x_pre, x_hyp)        

        #calculate accuracies
        acc_dim[d] += accuracy(y_pred, y)/test_batches

        #avoid memory issues
        y_pred.detach()

    print(f"Dimension {d} accuracies: {acc_dim[d]}")   

I get the following error: RuntimeError: CUDA out of memory. Tried to allocate 146.88 MiB (GPU 0; 2.00 GiB total capacity; 374.63 MiB already allocated; 0 bytes free; 1015.00 KiB cached)

I tried moving the model and the data to the CPU, but I get exactly the same error.

I looked around for a way to fix the problem, but I could not find an obvious solution. Any suggestions on how to load the model and data in the right place, or how to clear the GPU's memory after each iteration, are welcome.

  • torch.cuda.empty_cache() Commented Apr 22, 2019 at 20:26
  • To get it to run entirely on the CPU for debugging, run the command export CUDA_VISIBLE_DEVICES=-1 before starting your program. This ensures that you won't be able to use the GPU and thus won't run out of GPU memory. Commented Apr 22, 2019 at 20:31
  • @Chris torch.cuda.empty_cache() does not free GPU memory held by tensors, right? It just releases PyTorch's cached memory back to the system, but it will not unload the model from the GPU. pytorch.org/docs/stable/cuda.html#torch.cuda.empty_cache Commented Apr 22, 2019 at 20:35
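The CUDA_VISIBLE_DEVICES suggestion from the comments can also be applied from inside Python, as a sketch; the key point is that the variable must be set before torch is imported, since PyTorch enumerates devices at import time (the torch import itself is only indicated in a comment here):

```python
import os

# Hide every GPU from this process. This must happen before `import torch`,
# otherwise PyTorch has already enumerated the CUDA devices.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# After this point, `import torch` would see no CUDA devices:
# torch.cuda.is_available() would return False, forcing all tensors
# and models onto the CPU regardless of .cuda() calls elsewhere.
assert os.environ["CUDA_VISIBLE_DEVICES"] == "-1"
```

This is equivalent to the shell-level export CUDA_VISIBLE_DEVICES=-1, but scoped to the current process.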

1 Answer


It looks like acc_dim accumulates the grad history - see https://pytorch.org/docs/stable/notes/faq.html
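The effect described in the linked FAQ can be reproduced in a few lines; this is a minimal sketch with a toy tensor, not the question's model:

```python
import torch

# A parameter that requires grad, standing in for the model's weights
w = torch.ones(1, requires_grad=True)

total = torch.zeros(1)
for _ in range(3):
    loss = (w * 2).sum()
    total = total + loss  # each addition extends total's autograd graph

# total is now connected to the graph of every iteration, so none of
# those intermediate graphs can be freed
assert total.grad_fn is not None

# Accumulating a plain Python number instead keeps no history
total_item = 0.0
for _ in range(3):
    loss = (w * 2).sum()
    total_item += loss.item()  # .item() extracts a detached float
assert isinstance(total_item, float)
```

In the question's loop, `acc_dim[d] += accuracy(y_pred, y)/test_batches` plays the role of `total = total + loss` above, so the graphs from every batch stay alive.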

Because you're only doing inference, with torch.no_grad(): should be used. This completely sidesteps the possible issue of accumulating grad history.

model.eval() doesn't prevent gradient bookkeeping from happening; it only switches the behavior of some layers, like dropout. model.eval() and with torch.no_grad(): should be used together for inference.
