
In a PyTorch gradient descent algorithm, the function

import torch

def TShentropy(wf):
    # Count how many times each distinct value occurs in wf
    unique_elements, counts = wf.unique(return_counts=True)
    entrsum = 0
    for x in counts:
        p = x / wf.numel()  # empirical probability of this value
        entrsum -= p * torch.log2(p)  # Shannon's entropy formula
    return entrsum

uses the method torch.unique(), which breaks the gradient flow. Whenever I switch to a continuous probability calculation such as torch.softmax(), the program runs. However, the formula needs to use a discrete probability mass function, which does not work with softmax.
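
For reference, the continuous version that does run looks roughly like this (a minimal sketch; soft_entropy is my own name for it):

import torch

def soft_entropy(wf):
    # softmax gives a continuous, differentiable probability vector,
    # so gradients flow, but it is not the discrete mass function I need
    p = torch.softmax(wf, dim=-1)
    return -(p * torch.log2(p)).sum()

wf = torch.randn(10, requires_grad=True)
soft_entropy(wf).backward()  # runs without the grad_fn error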

I have tried using torch.nn.functional.one_hot and torch.bincount, both of which gave the same error:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Is this doomed to fail? Should I try to interpolate the probability function somehow?

1 Answer


I am not fully sure I understood what you are trying to do.

However, if you want to sample from a discrete probability distribution in a differentiable way, you probably need to use the Gumbel-Softmax trick. It allows sampling from a categorical distribution during the forward pass through a neural network:

https://pytorch.org/docs/stable/generated/torch.nn.functional.gumbel_softmax.html

https://sassafras13.github.io/GumbelSoftmax/
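
A minimal sketch of how this could look for your entropy term, assuming your values can be treated as a batch of logits over categories (the names gumbel_entropy and logits are mine, not from your code):

import torch
import torch.nn.functional as F

def gumbel_entropy(logits, tau=1.0):
    # Draw a differentiable "soft" one-hot sample per row of logits;
    # gradients flow through the relaxation instead of a hard argmax.
    soft_sample = F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)

    # Average the soft samples into an approximate probability mass
    # over the categories, then apply the Shannon entropy formula.
    p = soft_sample.mean(dim=0)
    eps = 1e-12  # avoid log2(0)
    return -(p * torch.log2(p + eps)).sum()

logits = torch.randn(64, 10, requires_grad=True)  # 64 samples, 10 categories
gumbel_entropy(logits).backward()  # gradients reach `logits`

If you need hard one-hot samples in the forward pass, hard=True keeps them discrete while gradients still flow through the soft relaxation (a straight-through estimator).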
