3 votes
1 answer
89 views

Let's say I have a parameter which is a p-shaped vector and I wish to train it in PyTorch such that: for some iterations, only the first k <= p elements of this vector were trained whereas the rest ...
L2m's user avatar
  • 21
1 vote
0 answers
106 views

I am coding linear regression in Python. I used the formulas I learnt and double-checked them, and also tried normalising the dataset. What happened then is that the values of weight and bias changed ...
ADITYA KUNDU's user avatar
0 votes
1 answer
61 views

As far as I know, JAX only supports "rank 1" vector-valued functions for the jax.jacrev autograd. How do I support higher rank tensors? I don't want to flatten my matrix, then unflatten it ...
Mingruifu Lin's user avatar
0 votes
0 answers
54 views

I am learning tensorflow and spent a good amount of time trying to find what is causing this error: No gradients provided for any variable. In the end I tracked that it was caused by using argmax at ...
Tomáš Zato's user avatar
0 votes
0 answers
44 views

I'm faced with a problem where as the title says I'm having trouble with the torch package's built in automatic differentiation algorithms (or my usage?). I think it was meant to be used on mini-...
Nomi Mino's user avatar
1 vote
0 answers
28 views

I'm trying to implement a custom agent, and inside my agent I'm running into issues with obtaining the gradient of the Q value with respect to my actor network parameters. I have my code below, main ...
Sliferslacker's user avatar
0 votes
0 answers
31 views

Problem: I have implemented several step-size strategies (classic, Polyak, and Adagrad), but my subgradient algorithm either diverges or fails to converge. Initially, I focused on the problem: Initial ...
Titouan Brochard's user avatar
2 votes
1 answer
102 views

I am trying to apply a very simple parameter estimation of a SIR model using a gradient descent algorithm. I am using the package autograd since the audience (this is for a sort of workshop for ...
Alonso Ogueda Oliva's user avatar
0 votes
1 answer
51 views

I wrote gradient descent code but it doesn't seem to work well: import numpy as np from random import randint, random import matplotlib.pyplot as plt def calculh(theta, X): h = 0 h += theta[0]*X ...
ismail rachid's user avatar
0 votes
0 answers
102 views

I've recently implemented a neural network from scratch and am now focusing on visualizing the optimization process. Specifically, I'm interested in creating a 3D visualization of the loss landscape ...
Kris's user avatar
  • 59
0 votes
0 answers
41 views

I want to implement a neural network on pytorch where gradients are not computed over all the weights. Let's say for example I have an MLP with three layers and I want half of the nodes in the last ...
danix's user avatar
  • 155
0 votes
1 answer
19 views

If I already have the Global Minimum value for the Cost function of any model (including large language models) - would it facilitate Gradient Descent calculation? (suppose I have a quick way to ...
Drout's user avatar
  • 355
0 votes
1 answer
125 views

Say I have obtained some alphas and betas as parameters from a neural network, which will be parameters of the Beta distribution. Now, I sample from the Beta distribution and then calculate some loss ...
Jimut123's user avatar
  • 536
0 votes
0 answers
35 views

I'm quite new to ML and I'm trying to do a linear regression with quite a simple dataset. I did two different regressions, one by hand and the other one using scikit-learn, where in the latter I ...
MIKEL LASS's user avatar
0 votes
1 answer
40 views

I've programmed a linear regression model from scratch. I use the "Sum of squared residuals" as the loss function for gradient descent. For testing I use linear data (y=x). When running the ...
Blacklight's user avatar
0 votes
0 answers
22 views

I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector is dependent on other components according to a function such as: summation over (i,j): (1-e(x_i)(...
Darkmoon Chief's user avatar
-1 votes
1 answer
75 views

I am trying to implement and train an SVM multi-class classifier from scratch using python and numpy in jupyter notebooks. I have been using the CS231n course as my base of knowledge, especially this ...
ho88it's user avatar
  • 21
1 vote
1 answer
59 views

func_model, func_params = make_functional(self.model) def fm(x, func_params): fx = func_model(func_params, x) return fx.squeeze(0).squeeze(0) def floss(...
Klae zhou's user avatar
0 votes
1 answer
90 views

I am practicing neural networks by building my own in notebooks. I am trying to check my model against an equivalent model in Keras. My model seems to work the same as other simple coded neural ...
AdamS's user avatar
  • 11
1 vote
1 answer
64 views

I'm trying to compute the second derivatives (Hessian) of a function t with respect to a tensor a using PyTorch. Below is the code I initially wrote: import torch torch.manual_seed(0) a = torch....
Ray Bern's user avatar
  • 135
0 votes
1 answer
105 views

I'm trying to recreate the cv2.warpAffine() function, taking a tensor input and output rather than a Numpy array. However, gradients calculated from the output tensor produce a Non-None gradient ...
arcanespud's user avatar
-1 votes
1 answer
58 views

In the gradient descent algorithm, I update the B and M values according to their derivatives and then multiply them with the learning rate value, but when I use the same value for L, such as 0.0001,...
Fhurky's user avatar
  • 7
1 vote
2 answers
115 views

I'm having problems with my gradient descent function. The scatter plot of my diagram shows a negative correlation but the line of best fit gotten from my gradient descent function shows a positive ...
Dubem Nwokike's user avatar
0 votes
1 answer
46 views

I'm trying to develop a model that improves the quality of a given audio. For this task I use DAC for the latent space and I run a transformer model to change the value of the latent space to improve ...
Jourdelune's user avatar
0 votes
0 answers
184 views

I was training a simple LSTM neural network with pytorch to predict stock price. And it is confusing to me that my network wouldn't fit. The loss is exploding and the r2 is negative. As the training ...
王一诺's user avatar
1 vote
0 answers
87 views

In the paper "Cost function dependent barren plateaus in shallow parametrized quantum circuits", the authors present a warm-up example on page 2 to show the barren plateau phenomenon. In this example, the ...
lang xian's user avatar
2 votes
1 answer
126 views

In a Pytorch gradient descent algorithm, the function def TShentropy(wf): unique_elements, counts = wf.unique(return_counts=True) entrsum = 0 for x in counts: p = x/len_a #...
2 False's user avatar
  • 21
1 vote
2 answers
367 views

I'm trying to find a solution for a system of linear equations using Gradient Descent Method ∥Ax-b∥^2 in Python. The linear equations are: x - 2y + 3z = - 1 3x + 2y - 5z = 3 2x - 5y + 2z = 0 The ...
Orhan94's user avatar
  • 11
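For the system quoted in the excerpt above, a minimal sketch of gradient descent on ∥Ax-b∥^2 might look like the following. The step size, the iteration count, and the helper names (`matvec`, `transpose`, `solve_by_gradient_descent`) are assumptions made for illustration and are not taken from the question.

```python
# Sketch: plain gradient descent on f(x) = ||Ax - b||^2 for the 3x3 system above.
# Step size and iteration count are illustrative assumptions.

A = [[1.0, -2.0, 3.0],
     [3.0, 2.0, -5.0],
     [2.0, -5.0, 2.0]]
b = [-1.0, 3.0, 0.0]


def matvec(M, v):
    """Return the matrix-vector product M v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def solve_by_gradient_descent(A, b, lr=0.01, steps=5000):
    """Minimise ||Ax - b||^2 starting from the zero vector."""
    x = [0.0] * len(A[0])
    At = transpose(A)
    for _ in range(steps):
        residual = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]
        # Gradient of ||Ax - b||^2 is 2 * A^T (Ax - b).
        grad = [2.0 * g for g in matvec(At, residual)]
        x = [x_i - lr * g_i for x_i, g_i in zip(x, grad)]
    return x


solution = solve_by_gradient_descent(A, b)
residual_norm = sum(
    (ax_i - b_i) ** 2 for ax_i, b_i in zip(matvec(A, solution), b)
) ** 0.5
print(solution, residual_norm)
```

With this form of the gradient, the step size has to stay below 1 / λ_max(AᵀA) for the iteration to converge; a larger value makes the iterates diverge.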
2 votes
1 answer
67 views

I understand the zig-zag nature of the cost function when applying gradient descent, but what bothers me is that the cost started out at a low 300 only to increase to 1600 in the end. The cost ...
Topics on Data's user avatar
0 votes
1 answer
96 views

I'm new to machine learning and I have been learning gradient descent algorithm. I believe this code uses simultaneous update, even though it looks like sequential update. Since the values of partial ...
Mayank Gupta's user avatar
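As a point of reference for the simultaneous-versus-sequential distinction raised in the excerpt above, here is a minimal sketch of a simultaneous update for a one-variable linear model. The data, the learning rate, and the names (`xs`, `ys`, `gradients`) are assumptions for illustration, not the asker's code.

```python
# Sketch: simultaneous update for y ≈ w*x + b under mean squared error.
# Data, learning rate, and iteration count are illustrative assumptions.

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1


def gradients(w, b):
    """Partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    dw = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    db = 2.0 / n * sum(errors)
    return dw, db


w, b = 0.0, 0.0
lr = 0.05
for _ in range(5000):
    # Both partials are evaluated at the same (w, b) before either is changed.
    dw, db = gradients(w, b)
    w, b = w - lr * dw, b - lr * db

print(w, b)
```

A sequential variant would update w first and then compute the partial for b using the new w; the two schemes generally produce different iterates.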
0 votes
0 answers
25 views

I'm training a deep RL model with TensorFlow, but my model doesn't have a single correct action. The output of the network is a vector [x1, x2], and both are actions that need to be optimized. def ...
gustavo lobos astorquiza's user avatar
1 vote
0 answers
53 views

I have two files. PreProcess.java: /* * 4/28/24 * Final */ package Final; import java.io.DataInputStream; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io....
Mark Agib's user avatar
0 votes
1 answer
120 views

To preface, I am a complete Julia newbie... I am trying to implement PPO for the first time and I've been having issues updating the actor (and by extension critic) network parameters using the ...
Max Kim's user avatar
0 votes
1 answer
40 views

I am following a tutorial from this youtube video (https://www.youtube.com/watch?v=lCOHri09YmM), but I am getting an error "invalid value encountered in subtract coeff = coeff - der", and ...
butters149's user avatar
1 vote
0 answers
145 views

In the SGD class of pytorch, the step() method has the decorator _use_grad_for_differentiable: @_use_grad_for_differentiable def step(self, closure=None): ... Usually I would expect the no_grad ...
soap's user avatar
  • 771
0 votes
1 answer
45 views

I am unable to achieve good results unless I choose a batch size of 1. By good, I mean error decreases significantly through the epochs. When I do a full batch of 30 the results are poor, error ...
debo's user avatar
  • 380
2 votes
2 answers
267 views

I have the following input matrix inp_tensor = torch.tensor( [[0.7860, 0.1115, 0.0000, 0.6524, 0.6057, 0.3725, 0.7980, 0.0000], [1.0000, 0.1115, 0.0000, 0.6524, 0.6057, 0.3725, 0.0000, ...
Penguin's user avatar
  • 2,651
1 vote
0 answers
151 views

I am using pytorch lightning for distributed training. I am using all_gather to gather all the gradients from the gpus in order to calculate the loss function. I am unsure of what I should set the ...
JobHunter69's user avatar
  • 2,376
0 votes
1 answer
241 views

I attempted to calculate the loss between a tensor with dtype float32 and another with dtype uint8. Since the loss function performs automatic type promotion, I didn't make a type conversion ...
Aria Lovelace's user avatar
1 vote
0 answers
26 views

I'm taking an AI class and we're using hidden layers to write a descent function to predict XOR gates in Python. For this assignment specifically it only needs to have 1 hidden layer of 3 hidden nodes....
Oreo's user avatar
  • 11
1 vote
1 answer
66 views

I've tried to implement a gradient descent algorithm in Python for a machine learning problem. The dataset I'm working with has been preprocessed, and I observed an unexpected behavior in the runtime ...
H.S's user avatar
  • 23
0 votes
0 answers
87 views

I'm trying to experiment with Projected Gradient Descent on some objective functions constrained by a hypercube. "Projected" here simply means that if the next step falls outside the ...
ufghd34's user avatar
  • 168
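A minimal sketch of projected gradient descent on a box constraint, assuming the hypercube is [-1, 1]^n and a simple quadratic objective; both are assumptions for illustration, as are the step size and the names `target`, `grad_f`, and `project`. For a hypercube, the Euclidean projection reduces to clipping each coordinate to its bounds.

```python
# Sketch: projected gradient descent on the box [-1, 1]^n.
# Objective, bounds, step size, and iteration count are illustrative assumptions.

target = [2.0, -0.5, -3.0]  # unconstrained minimiser of f, partly outside the box


def grad_f(x):
    """Gradient of f(x) = sum((x_i - target_i)^2)."""
    return [2.0 * (x_i - t_i) for x_i, t_i in zip(x, target)]


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box: clip each coordinate."""
    return [min(max(x_i, lo), hi) for x_i in x]


x = [0.0, 0.0, 0.0]
lr = 0.1
for _ in range(200):
    # Take an ordinary gradient step, then project back into the feasible set.
    step = [x_i - lr * g_i for x_i, g_i in zip(x, grad_f(x))]
    x = project(step)

print(x)
```

Coordinates whose unconstrained optimum lies outside the box end up pinned to the nearest face, while the remaining coordinates converge as in ordinary gradient descent.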
0 votes
1 answer
688 views

In gradient boosting different loss functions can be used. For example, in sklearn's GradientBoostingRegressor possible loss functions are: ‘squared_error’, ‘absolute_error’, ‘huber’, and ‘quantile’ ...
Sanyo Mn's user avatar
  • 441
1 vote
1 answer
115 views

I'm experimenting with running Gradient Descent (GD) on polynomials of some low degree - say 3 - with n variables, on a domain constrained by a hypercube [-1,1]^n. I want to compare the termination ...
ufghd34's user avatar
  • 168
0 votes
2 answers
59 views

Part of a gradient descent algorithm: this.updateWeights = function() { let wx; let w_deriv = 0; let b_deriv = 0; for (let i = 0; i < this.points; i++) { wx = this.yArr[i] - (this....
chang dae Kim's user avatar
1 vote
0 answers
34 views

I am interested in implementing a somewhat complex custom tensorflow operation. Let's say (for the purpose of this question) that the operation is similar to performing convolution with stride=2, ...
Aviraj Bevli's user avatar
-1 votes
1 answer
272 views

To get familiar with the Gradient Descent algorithm, I tried to create my own Linear Regression model. It works fine for a few data points. But when I try to fit it using more data, w0 and w1 are always ...
LNTR's user avatar
  • 84
-1 votes
1 answer
134 views

I'm trying to use gradient descent on a data set. What I have written is import numpy import pandas as pd import numpy as np import matplotlib.pyplot as plt data = pd.read_csv('C:/Users/Teacher/...
user124910's user avatar
0 votes
1 answer
1k views

I'm working on a machine learning project in PyTorch where I need to optimize a model using the full batch gradient descent method. The key requirement is that the optimizer should use all the data ...
Maxou's user avatar
  • 1
1 vote
0 answers
23 views

I am trying to implement a co-ordinate descent algorithm for logistic regression. My gradients are not changing, as a result I end up updating a single co-ordinate for each epoch. Here is the code: ...
Necessary_title's user avatar
