
In a PyTorch gradient descent algorithm, the function

import torch

def TShentropy(wf):
    # Count how many times each distinct value occurs in wf
    unique_elements, counts = wf.unique(return_counts=True)
    entrsum = 0
    for x in counts:
        p = x / wf.numel()  # empirical probability of this value
        entrsum -= p * torch.log2(p)  # Shannon's entropy formula
    return entrsum

uses the method torch.unique(), which breaks the gradient flow. Whenever I switch to a continuous probability calculation such as torch.softmax(), the program runs. However, the formula needs to use a discrete probability mass function, which does not work with softmax.
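
For reference, the continuous version that does run looks roughly like this (a minimal sketch; soft_entropy is my own name for it):

import torch

def soft_entropy(wf):
    # softmax gives a continuous, differentiable probability vector,
    # so gradients flow, but it is not the discrete mass function I need
    p = torch.softmax(wf, dim=-1)
    return -(p * torch.log2(p)).sum()

wf = torch.randn(10, requires_grad=True)
soft_entropy(wf).backward()  # runs without the grad_fn error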

I have tried using torch.nn.functional.one_hot and torch.bincount, both of which gave the same error:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Is this doomed to fail? Should I try to interpolate the probability function somehow?

1 Answer


I am not fully sure I understood what you are trying to do.

However, if you want to sample from a discrete probability distribution in a differentiable way, you probably need to use the Gumbel-Softmax trick. It allows sampling from a categorical distribution during the forward pass through a neural network:

https://pytorch.org/docs/stable/generated/torch.nn.functional.gumbel_softmax.html

https://sassafras13.github.io/GumbelSoftmax/
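
A minimal sketch of how this could look for your entropy term, assuming your values can be treated as a batch of logits over categories (the names gumbel_entropy and logits are mine, not from your code):

import torch
import torch.nn.functional as F

def gumbel_entropy(logits, tau=1.0):
    # Draw a differentiable "soft" one-hot sample per row of logits;
    # gradients flow through the relaxation instead of a hard argmax.
    soft_sample = F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)

    # Average the soft samples into an approximate probability mass
    # over the categories, then apply the Shannon entropy formula.
    p = soft_sample.mean(dim=0)
    eps = 1e-12  # avoid log2(0)
    return -(p * torch.log2(p + eps)).sum()

logits = torch.randn(64, 10, requires_grad=True)  # 64 samples, 10 categories
gumbel_entropy(logits).backward()  # gradients reach `logits`

If you need hard one-hot samples in the forward pass, hard=True keeps them discrete while gradients still flow through the soft relaxation (a straight-through estimator).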
