
I want to perform the following contraction:

np.einsum('ijk,kl,jlm', x, y, z, optimize = 'optimal')

Testing performance with numpy, I know that for my data the optimal path is almost always (if this makes sense): first contract over k, then j, then l.
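
In case it helps, this is roughly how I check which path numpy picks (the shapes below are made up just for illustration):

import numpy as np

# placeholder shapes, only for illustration
x = np.random.rand(10, 20, 30)   # ijk
y = np.random.rand(30, 40)       # kl
z = np.random.rand(20, 40, 50)   # jlm

# einsum_path reports the contraction order and a per-step cost breakdown
path, info = np.einsum_path('ijk,kl,jlm', x, y, z, optimize='optimal')
print(path)   # e.g. ['einsum_path', (0, 1), (0, 1)]  -> contract x,y first, then with z
print(info)   # FLOP counts and intermediate shapes for each step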

But just in case, I want to allow pytorch to find the optimal path. However, reading the documentation I found the following:

This optimization occurs when there are at least three inputs, since the order does not matter otherwise. Note that finding the optimal path is an NP-hard problem, thus, opt_einsum relies on different heuristics to achieve near-optimal results

Maybe I don't fully understand what the optimal path actually is, but why does it say it's an NP-hard problem? In essence, isn't it equivalent to a sorting algorithm? Or what exactly makes it scale non-polynomially (the number of tensors, the number of contractions, or the total number of indices)?

Furthermore, I want to incorporate this structure into a custom layer, where my learnable parameters will be in the tensor z. In order to facilitate the backward propagation, should I separate the product so that the fewest operations involve my learnable parameters, i.e. the equivalent in pytorch of:

np.einsum('ijl,jlm', np.einsum('ijk,kl', x, y, optimize='optimal'), z, optimize='optimal')

or should I stick with the optimal path?
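
For concreteness, this is a rough sketch of what I mean by the two variants in pytorch (the layer and dimension names are placeholders, not working code from my project):

import torch
import torch.nn as nn

class ContractionLayer(nn.Module):
    def __init__(self, j_dim, l_dim, m_dim):
        super().__init__()
        # learnable tensor z with index layout jlm
        self.z = nn.Parameter(torch.randn(j_dim, l_dim, m_dim))

    def forward(self, x, y):
        # variant 1: single einsum over all three operands,
        # letting the backend choose the contraction order
        out_single = torch.einsum('ijk,kl,jlm->im', x, y, self.z)

        # variant 2: contract the non-learnable tensors first,
        # so z only enters in the last contraction
        xy = torch.einsum('ijk,kl->ijl', x, y)
        out_split = torch.einsum('ijl,jlm->im', xy, self.z)

        return out_single  # or out_split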

  • Why not use torch.einsum instead of numpy einsum? Commented Feb 16, 2024 at 12:28
