I'm trying to use TensorFlow to minimize the loss function L below with respect to u. There are three variables, u, x_opt, and L, with the following dependency graph:

u ---(f)--> x_opt ---(g)--> L,

with the exact form of the dependency governed by functions f and g.

import numpy as np
from scipy.optimize import minimize

def f(u):

    def f_helper(u, x):
        # with u held fixed, f_helper is a convex function of x
        # the exact form of f_helper does not matter
        return np.linalg.norm(x - u)

    curried_f_helper = lambda x: f_helper(u, x)
    x_opt = minimize(curried_f_helper, np.random.uniform(size=u.shape))['x']
    return x_opt

def g(x_opt):
    # the exact form of g does not matter
    return np.ones(x_opt.shape).dot(x_opt)

def L(u):
    # want to minimize L over u
    x_opt = f(u)
    return g(x_opt)

# use TensorFlow to minimize L over u...

The complication is that f() does not have an analytical functional form - u parameterizes an optimization problem whose solution is x_opt. So TensorFlow would not be able to compute the gradient of f with respect to u. However, I can use implicit differentiation to manually compute this gradient. Ideally, I'd be able to define a new op representing f, and register its gradient (that I manually calculate).
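To make the implicit-differentiation step concrete, here is a minimal numpy sketch of the gradient I have in mind, assuming a hypothetical quadratic inner objective f_helper(u, x) = 0.5*||x - A@u||^2 (chosen only so its Hessian blocks have closed forms; A is made-up problem data). Differentiating the stationarity condition grad_x f_helper(u, x_opt) = 0 with respect to u gives dx_opt/du = -H_xx^{-1} @ H_xu, and the chain rule through g gives dL/du:

import numpy as np
from scipy.optimize import minimize

A = np.diag([2.0, 3.0])  # assumed problem data; x_opt = A @ u for this f_helper

def f(u):
    # inner problem: f_helper(u, x) = 0.5 * ||x - A @ u||^2
    obj = lambda x: 0.5 * np.sum((x - A @ u) ** 2)
    return minimize(obj, np.zeros_like(u))['x']

def jac_f(u, x_opt):
    # implicit function theorem at the inner optimum:
    # H_xx @ (dx_opt/du) + H_xu = 0  =>  dx_opt/du = -H_xx^{-1} @ H_xu
    H_xx = np.eye(len(x_opt))  # d^2 f_helper / dx dx = I for this f_helper
    H_xu = -A                  # d^2 f_helper / dx du = -A
    return -np.linalg.solve(H_xx, H_xu)

def grad_L(u):
    # chain rule: dL/du = (dx_opt/du)^T @ grad g(x_opt), with g(x) = 1^T x
    x_opt = f(u)
    return jac_f(u, x_opt).T @ np.ones_like(x_opt)

print(grad_L(np.array([1.0, 2.0])))  # A.T @ 1 = [2., 3.]

For this toy problem grad_L(u) reduces to A.T @ 1, which matches differentiating L(u) = 1^T A u directly; in the real setting the two Hessian blocks would come from the actual f_helper.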

My question is: How should I implement the op representing f and specify its gradient? Is it possible to define the op for f using only Python, and if so, will I have to use tf.py_func?

  • Am I correct in understanding that your question isn't really about implicit differentiation (that's just an implementation detail), but it's really about manually specifying a gradient function? Commented Jun 11, 2016 at 22:17
  • If that's the case, it looks like it can be done with Graph.gradient_override_map (see this issue here), but it's ugly. Commented Jun 11, 2016 at 22:21
  • Interesting, it looks like there's a hack in this answer (a version of it is sketched after these comments). Commented Jun 11, 2016 at 22:26
  • Hi @JohnMoeller. Right, I've changed the title. That might take care of specifying the gradient (I need to read more), but what would be the easiest way to define the op? If given u, I could calculate, using only TensorFlow ops (like Minimizers) the value of f(u), but I'm confused where to place this logic. I want to do something like "define a new op, x_opt, that is the result of running 100 gradient descent steps over x on f_helper with u fixed." Commented Jun 11, 2016 at 23:00
  • That I don't know. The design of TF doesn't seem to accommodate custom gradients well. Commented Jun 11, 2016 at 23:01
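For reference, here is a minimal sketch of the gradient_override_map hack the comments point to, written against the TF 1.x graph API of the time. The wrapper py_func_with_grad and the placeholder manual_vjp are made-up names; manual_vjp stands in for whatever vector-Jacobian product the implicit differentiation actually produces, and f is the function from the question:

import numpy as np
import tensorflow as tf  # assumes the TF 1.x graph API

def py_func_with_grad(func, inp, Tout, grad, name=None):
    # wrap tf.py_func and attach a custom gradient by registering it
    # under a unique name, then overriding the default 'PyFunc' gradient
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1 << 30))
    tf.RegisterGradient(rnd_name)(grad)
    g = tf.get_default_graph()
    with g.gradient_override_map({'PyFunc': rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=True, name=name)

def manual_vjp(u, gbar):
    # hypothetical placeholder: the vector-Jacobian product obtained by
    # implicit differentiation; for f_helper(u, x) = ||x - u|| we have
    # x_opt = u, so dx_opt/du = I and the VJP is just gbar
    return gbar

def _f_grad(op, grad_wrt_x_opt):
    # gradient of the f op: route the incoming gradient through the
    # manually computed Jacobian of f via another py_func
    return tf.py_func(manual_vjp, [op.inputs[0], grad_wrt_x_opt], tf.float64)

u = tf.Variable(np.zeros(3))
x_opt_op = py_func_with_grad(f, [u], tf.float64, grad=_f_grad)
loss = tf.reduce_sum(x_opt_op)  # plays the role of g
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

The override works because tf.py_func produces an op of type 'PyFunc', so remapping that type's gradient inside the scope swaps in the registered function. Later TensorFlow versions expose tf.custom_gradient, which makes this considerably cleaner.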
