Conversation

@Michal-Novomestsky (Contributor)

Addresses #1743

@ricardoV94 (Member) left a comment

Sorry, this is actually masking an error. A ScalarFromTensor should have a scalar gradient, and therefore tensor_from_scalar is correct.
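
For context, here is a minimal sketch of the type contract being described (an illustration, not necessarily PyTensor's actual source): ScalarFromTensor maps a tensor to a scalar, so in reverse mode its incoming output gradient is scalar-typed, and the gradient returned for the tensor input must be converted back with tensor_from_scalar.

```python
# Illustration only: the grad contract described above, assuming
# ScalarFromTensor takes a 0-d tensor input and produces a scalar output.
from pytensor.tensor.basic import tensor_from_scalar

class ScalarFromTensorSketch:
    def grad(self, inputs, output_gradients):
        # The op's output is scalar-typed, so the incoming output
        # gradient is scalar-typed as well.
        (d_out,) = output_gradients
        # The op's input is tensor-typed, so the returned input gradient
        # must be tensor-typed: convert the scalar back to a tensor.
        return [tensor_from_scalar(d_out)]
```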

@Michal-Novomestsky (Contributor, Author)

> Sorry, this is actually masking an error. A ScalarFromTensor should have a scalar gradient, and therefore tensor_from_scalar is correct.

Sorry, I'm not quite following - do you mean that there is a deeper bug unrelated to this code, and that fixing that bug would make this PR unnecessary?

@ricardoV94 (Member)

There's a deeper bug, and your fix is masking it.
Assumptions:

  1. This Op can only be called on a Scalar; it always outputs a Tensor.
  2. Any gradient expression coming in reverse mode (output_gradients) must have the same type as the output, i.e., be a Tensor.
  3. It should always be safe and correct to call ScalarFromTensor on this output.

Point 2 is the crux. Some Op gradient is returning a scalar when it should return a tensor, or some invalid graph manipulation was done. We need to find which variable is creating that scalar output gradient. Can you check if the variable has a traceback in var.tag?
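
A sketch of how one might inspect that traceback, assuming the creation stack, when recorded, lives in var.tag.trace as frame lists captured with traceback.extract_stack() (the attribute name and format are an assumption here):

```python
import traceback

def print_creation_trace(var):
    # Assumption: var.tag.trace holds lists of stack frames recorded
    # when the variable was created; it may be absent or empty.
    trace = getattr(var.tag, "trace", None)
    if not trace:
        print(f"No traceback recorded on {var}.tag")
        return
    for frames in trace:
        print("".join(traceback.format_list(frames)))
```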

@ricardoV94 (Member)

Do you have an MWE (minimal working example) that reproduces the bug?
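
For reference, the general shape of such an MWE; the expression below is a trivial stand-in, since the actual failing graph from this PR is not shown in the thread:

```python
import pytensor
import pytensor.tensor as pt

x = pt.dscalar("x")
y = pt.exp(x) * 2                # stand-in expression, not the failing graph
g = pytensor.grad(y, x)          # reverse-mode gradient where the bad type would appear
f = pytensor.function([x], g)
print(f(0.5))                    # with the real graph, this should raise the error
```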
