
tf.custom_gradient appears to accept only one Tensor x. What if the op needs more than one input?

For example, how would I define the gradient of softmax, which needs both the input x and the label?

Update

Thanks to the suggestion from @AllenLavoie, I tried using a Python list as input.

import tensorflow as tf


def self_define_op_multiple_inputs():
    @tf.custom_gradient
    def loss_func(input_):
        x = input_[0]
        label = input_[1]

        def grad(dy):
            # One gradient per element of the input list.
            return [dy, dy]

        return x - label, grad

    x = tf.range(10, dtype=tf.float32)
    y = tf.range(10, dtype=tf.int32)

    loss = loss_func([x, y])


if __name__ == '__main__':
    self_define_op_multiple_inputs()

It seems that tf.custom_gradient converts the Python list to a single Tensor. The snippet above raises a TypeError: TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'range_1:0' shape=(10,) dtype=int32>)

How to fix it?

  • The documentation says x and y can both either be Tensors or sequences of Tensors. Did this not work for you? Commented Aug 14, 2018 at 16:32
  • @AllenLavoie Actually this is exactly what confused me. I don't understand what "sequences of Tensors" means — does it mean a Python list of Tensors? Commented Aug 15, 2018 at 2:37
  • My interpretation is a Python list (or tuple, etc.). So len(x) is the number of inputs to the operation, and len(y) is the number of outputs. The gradient function then takes len(y) Tensor arguments and returns len(x) Tensors. Commented Aug 15, 2018 at 18:33
  • @AllenLavoie I tried to use a list, but it seems a list gets converted to a Tensor, which causes an error when there are multiple inputs with different dtypes. The question has been updated. Commented Aug 16, 2018 at 2:23
  • 1
    @AllenLavoie I created an issue on github Commented Aug 21, 2018 at 10:09
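Following @AllenLavoie's interpretation, passing each Tensor as its own positional argument (rather than wrapping both in one list) sidesteps the dtype conversion entirely. A minimal sketch, assuming TF 2.x eager execution (the name two_input_loss is hypothetical):

```python
import tensorflow as tf


@tf.custom_gradient
def two_input_loss(x, label):
    # Forward pass: elementwise difference of the two positional inputs.
    def grad(dy):
        # One gradient per positional input, in argument order: (d/dx, d/dlabel).
        return dy, -dy

    return x - label, grad


x = tf.range(10, dtype=tf.float32)
label = tf.ones(10, dtype=tf.float32)  # float here; integer inputs have no gradient

with tf.GradientTape() as tape:
    tape.watch([x, label])
    loss = two_input_loss(x, label)

gx, glabel = tape.gradient(loss, [x, label])
```

Here gx and glabel come straight from the custom grad function, so each input gets its own gradient without any list-to-Tensor conversion.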

2 Answers


I ran into a similar problem yesterday and found this post, and I believe I know what you are running into. The problem is that the function decorated by @tf.custom_gradient can take multiple inputs directly (instead of a single list of tensors). Look at the following code (note that it is just test code with no actual meaning):

@tf.custom_gradient
def loop1(x, a):
    def grad(dy):
        # Two inputs -> two returned gradients, in argument order: (dx, da).
        return dy * 3, dy * 2

    n = tf.multiply(x, a)
    return n, grad

Because there are two inputs, x and a, the grad function has to return two gradients, in the same order: dy * 3 corresponds to the gradient of x and dy * 2 corresponds to the gradient of a.

I think the documentation is confusing on this point, but you can still use multiple inputs — just make sure you also return the same number of gradients, or you will run into errors.
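One way to check this input-to-gradient mapping (a sketch, assuming TF 2.x eager execution):

```python
import tensorflow as tf


@tf.custom_gradient
def loop1(x, a):
    def grad(dy):
        # Two inputs -> two returned gradients, in order (dx, da).
        return dy * 3, dy * 2

    return tf.multiply(x, a), grad


x = tf.constant(4.0)
a = tf.constant(5.0)

with tf.GradientTape() as tape:
    tape.watch([x, a])
    n = loop1(x, a)

dx, da = tape.gradient(n, [x, a])
# For a scalar target, dy is 1.0, so dx is 3.0 and da is 2.0.
```

The actual product x * a never enters the gradients — only the custom grad function does, which confirms that each returned value maps positionally to an input.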


1 Comment

Can we return None as the gradient for unused terms?
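On that question: in my experience, returning None for a non-differentiable input (such as an integer label) works in recent TF versions, though I haven't verified every release. A sketch, assuming TF 2.x eager execution (the name loss_with_label is hypothetical):

```python
import tensorflow as tf


@tf.custom_gradient
def loss_with_label(x, label):
    # label is an integer Tensor, so it gets no gradient.
    def grad(dy):
        # None marks the label input as non-differentiable.
        return dy, None

    return x - tf.cast(label, tf.float32), grad


x = tf.range(10, dtype=tf.float32)
label = tf.range(10, dtype=tf.int32)

with tf.GradientTape() as tape:
    tape.watch(x)
    loss = loss_with_label(x, label)

gx = tape.gradient(loss, x)  # gradient of x - label w.r.t. x is all ones
```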

I believe you need something like this as a tf graph input, where n_input is the number of input features:

x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None])

Does this answer your question ?

1 Comment

Thanks for your help, but it seems you didn't understand what I am asking about. tf.custom_gradient is not about defining graph inputs. You can read the docs for more details.
