I want to implement a neural network in PyTorch where gradients are not computed over all the weights. For example, suppose I have an MLP with three layers and I want half of the nodes in the last layer to have their backpropagation computed all the way up to the first layer, while the other half of the last layer have their gradients computed only up to the middle layer. I would be grateful for any help. Thanks
-
What does 'not computing' actually mean: just skipping the gradient, or replacing it with another? – ZombaSY Commented Feb 3, 2025 at 9:06
-
It means the error in the 2nd half of the nodes in the last layer will be used to update the weights from the middle layer to the last layer, but will not affect the weights from the first layer to the middle layer. – danix Commented Feb 4, 2025 at 9:58