
I am a beginner in PyTorch and I am facing the following issue:

When I compute the gradient of the tensor below (note that I use it through a variable x, as you can see), the gradient comes out fine:

import torch
myTensor = torch.randn(2, 2, requires_grad=True)
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()
print(myTensor.grad)

Now, if I try to modify an element of myTensor first, I get the error "leaf variable has been moved into the graph interior". See this code:

import torch
myTensor = torch.randn(2, 2, requires_grad=True)
myTensor[0, 0] *= 5
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()
print(myTensor.grad)

What is wrong with the latter code, and how can I correct it?

Any help would be highly appreciated. Thanks a lot!


1 Answer


The problem here is that this line represents an in-place operation:

myTensor[0,0]*=5

PyTorch, or more precisely autograd, is not very good at handling in-place operations, especially on tensors with the requires_grad flag set to True.

You can also take a look here:
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd

Generally you should avoid in-place operations where possible. In some cases they can work, but you should always avoid in-place operations on tensors for which you set requires_grad to True.

Unfortunately there are not many PyTorch functions that help with this problem, so in this case you have to use a helper tensor to avoid the in-place operation:

Code:

import torch

myTensor = torch.randn(2, 2, requires_grad=True)
helper_tensor = torch.ones(2, 2)
helper_tensor[0, 0] = 5
new_myTensor = myTensor * helper_tensor  # new tensor, out-of-place operation
with torch.enable_grad():
    x = new_myTensor.sum() * 10          # of course you need to use the new tensor
x.backward()                             # for further calculation and backward
print(myTensor.grad)

Output:

tensor([[50., 10.],
        [10., 10.]])
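
To see where those numbers come from (this check is just my own sketch on top of the code above, using the same variable names): x equals 10 * (helper_tensor * myTensor).sum(), so the gradient with respect to myTensor is exactly 10 * helper_tensor, i.e. 50 in the scaled cell and 10 everywhere else.

import torch

myTensor = torch.randn(2, 2, requires_grad=True)
helper_tensor = torch.ones(2, 2)
helper_tensor[0, 0] = 5

x = (myTensor * helper_tensor).sum() * 10
x.backward()

# x = 10 * sum(helper_tensor * myTensor), so dx/dmyTensor = 10 * helper_tensor
print(torch.allclose(myTensor.grad, 10 * helper_tensor))  # prints: True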

Unfortunately this is not very nice, and I would appreciate it if there were a better or nicer solution out there.
But as far as I know, in the current version (0.4.1) you have to go with this workaround for tensors with a gradient, i.e. requires_grad=True.

Hopefully for future versions there will be a better solution.
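
One alternative pattern that should also work (just a sketch from my side, and I have not verified it on every version): clone the leaf tensor first and modify the clone. The clone is a non-leaf tensor, so autograd can record the change without the "leaf variable has been moved into the graph interior" error, and the gradient still flows back to myTensor:

import torch

myTensor = torch.randn(2, 2, requires_grad=True)

modified = myTensor.clone()          # non-leaf copy, still connected to myTensor
modified[0, 0] = modified[0, 0] * 5  # modify the clone, not the leaf tensor

x = modified.sum() * 10
x.backward()
print(myTensor.grad)                 # tensor([[50., 10.], [10., 10.]])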


By the way, if you activate the gradient later, you can see that it works just fine:

import torch
myTensor = torch.randn(2, 2, requires_grad=False)  # no gradient so far
myTensor[0, 0] *= 5                                # in-place op not included in gradient
myTensor.requires_grad = True                      # activate gradient here
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()                                       # no problem here
print(myTensor.grad)

But of course this yields a different result, because the multiplication by 5 happened before autograd started tracking myTensor: it only changed the stored values and therefore does not show up in the gradient:

tensor([[10., 10.],
        [10., 10.]])

Hope this helps!
