I'm trying to implement a simple linear regression algorithm, and for that I've written two functions:
- Cost function
- Gradient descent
The cost function appears to work normally: it isn't giving me unusually large or unusually small numbers.
The gradient descent function, however, makes the parameters suddenly jump to ridiculous numbers.
Cost function code:
    m = len(x)  # Training examples

    def cost(w, b, x, y):
        j_wb = 0.
        for i in range(m):
            f_wb = w * x[i] + b
            err = f_wb - y[i]
            j_wb += err**2
        j_wb = j_wb / (m * 2)
        return j_wb
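Written out, the loop above appears to compute the usual halved mean squared error. This assumes the division by 2m happens once, after the loop, and that the indices follow Python's 0-based convention:

```latex
J(w, b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left( w x_i + b - y_i \right)^2
```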
Here's a test of it:
at iteration 0: COST = 0.0
at iteration 1: COST = 11.906281776347848
at iteration 2: COST = 24.406303301163156
at iteration 3: COST = 25.89142657618535
at iteration 4: COST = 31.71690104577324
at iteration 5: COST = 32.222444954452776
at iteration 6: COST = 52.79887560513525
at iteration 7: COST = 57.723294484239304
at iteration 8: COST = 59.252477506721256
at iteration 9: COST = 61.178601048944415
Final cost: 4.5292025547813575
Gradient descent function code:
    def gradient(w, b, x, y, iterations, alphar):
        # Initialize
        dj_dw = 0
        dj_db = 0
        # Gradient descent
        for i in range(iterations):
            j_wb = cost(w, b, x, y)
            dj_dw = j_wb * x[i]
            dj_db = j_wb
            w = w - alphar * dj_dw
            b = b - alphar * dj_db
        return dj_dw, dj_db
And here is the output that shows the main problem:
> Iteration 0 || Cost = 4.5292025547813575 || w = -0.08700861314752584
> Iteration 1 || Cost = 1919.5314293706836 || w = -959.8527232984893
> Iteration 2 || Cost = 1540639463.935084 || w = -231096879.44298592
> Iteration 3 || Cost = 8.924767986122691e+19 || w = -3.3914118347497325e+19
> Iteration 4 || Cost = 1.9197504655865147e+42 || w = -1.6701829050602676e+42
> Iteration 5 || Cost = 4.653919574489622e+87 || w = -1.675411046816264e+87
> Iteration 6 || Cost = 4.685387293902403e+177 || w = -5.622464752682884e+176
> Iteration 7 || Cost = inf || w = -inf
> Iteration 8 || Cost = nan || w = nan
> Iteration 9 || Cost = nan || w = nan
> Iteration 10 || Cost = nan || w = nan
That is the output for iterations 0 through 10. At iteration 1 the cost suddenly shoots up to ~1900, and by iteration 2 the parameter w has already dropped to about -2.3e8.
It peaks at iteration 7, where the cost is inf and w is -inf, and everything is nan after that.
Can someone please point out where I'm going wrong? Any help or advice would be greatly appreciated.
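For comparison, here is a minimal sketch of textbook batch gradient descent for this squared-error cost. It differs from the posted `gradient` function in two ways. First, the update uses the partial derivatives of the cost (the error averaged over all m examples, and the error times x averaged over all m examples) rather than the cost value multiplied by a single `x[i]`. Second, it returns the updated `w` and `b` rather than the last gradient values. The names `compute_gradient`, `gradient_descent`, `alpha`, `x_demo`, and `y_demo` are hypothetical and chosen only for illustration, and the toy data is made up, not taken from the post.

```python
def cost(w, b, x, y):
    """Halved mean squared error of the line f(x) = w * x + b."""
    m = len(x)
    total = 0.0
    for xi, yi in zip(x, y):
        total += (w * xi + b - yi) ** 2
    return total / (2 * m)


def compute_gradient(w, b, x, y):
    """Partial derivatives of cost() with respect to w and b."""
    m = len(x)
    dj_dw = 0.0
    dj_db = 0.0
    for xi, yi in zip(x, y):
        err = w * xi + b - yi
        dj_dw += err * xi  # d/dw of (err**2 / 2) is err * x
        dj_db += err       # d/db of (err**2 / 2) is err
    return dj_dw / m, dj_db / m


def gradient_descent(w, b, x, y, iterations, alpha):
    """Repeatedly step w and b against the gradient; returns the fitted w, b."""
    for _ in range(iterations):
        dj_dw, dj_db = compute_gradient(w, b, x, y)
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b


# Hypothetical toy data lying exactly on y = 2x + 1 (an assumption for the demo)
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.0, 3.0, 5.0, 7.0, 9.0]

w_fit, b_fit = gradient_descent(0.0, 0.0, x_demo, y_demo, iterations=5000, alpha=0.05)
print(w_fit, b_fit, cost(w_fit, b_fit, x_demo, y_demo))
```

Even with the correct gradient, a learning rate that is too large for the scale of `x` can still make the cost blow up, so if the values diverge it may be worth trying a smaller `alpha` or scaling the inputs first.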