
Somehow my mu never gets a nonzero gradient; here is the code:

import torch
torch.manual_seed(0)
mu = torch.zeros(1, requires_grad=True)
sigma = 1.0
eps = torch.randn(1)
sampled = mu + sigma * eps
logp = -((sampled - mu)**2) / 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))
loss = -logp.sum()
loss.backward()
print("eps:", eps.item())
print("mu.grad:", mu.grad.item())  # should be -eps.item()

I consistently get a zero gradient. Is this normal?

Comment: mu + sigma * eps - mu will cancel mu – Commented Apr 14 at 20:00

1 Answer


As @Oscar explained in the comments, the gradient is propagating fine, but your logp does not actually depend on mu (and so neither does your loss).

If the gradient were not propagating to mu at all, mu.grad would not be zero; it would be None.
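To illustrate the zero-vs-None distinction, here is a minimal sketch (not from the original answer): a leaf tensor that participates in the graph but contributes nothing gets a zero gradient, while a leaf tensor that never enters the graph keeps grad = None.

```python
import torch

used = torch.zeros(1, requires_grad=True)
unused = torch.zeros(1, requires_grad=True)

# `used` is part of the graph, but multiplied by 0, so its gradient is 0
loss = (used * 0).sum()
loss.backward()

print(used.grad)    # tensor([0.]) -- gradient propagated, and it is zero
print(unused.grad)  # None -- never entered the graph at all
```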

This is because

sampled = mu + sigma * eps
logp = -((sampled - mu)**2) / 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))

So, inlining sampled into the definition of logp:

logp = -((mu + sigma * eps - mu)**2) / 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))

i.e., after simplification:

logp = -((sigma * eps)**2) / 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))

So logp really does not depend on mu, and the gradient with respect to mu is 0.
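If the expected gradient of -eps from the question's comment is what you are after (i.e., a score-function gradient that treats the drawn sample as fixed data rather than as a function of mu), one way to get it is to detach the sample before evaluating the log-density. This is a sketch of that assumption about the intent, not part of the original answer:

```python
import math
import torch

torch.manual_seed(0)
mu = torch.zeros(1, requires_grad=True)
sigma = 1.0
eps = torch.randn(1)

# detach() cuts the sampling path out of the graph, so logp now
# depends on mu only through the density term (sampled - mu)**2
sampled = (mu + sigma * eps).detach()
logp = -((sampled - mu) ** 2) / 2 - 0.5 * math.log(2 * math.pi)
loss = -logp.sum()
loss.backward()

print("eps:", eps.item())
print("mu.grad:", mu.grad.item())  # now equals -eps.item()
```

Alternatively, if you do want the gradient to flow through the sampling path (the reparameterization trick), the sample must be scored against a density whose parameters are not the same tensors used to generate it; as written, the two cancel exactly.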


