My code:

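import numpy as np
import plotly.graph_objects as go

# bigdata (used below) is a pandas DataFrame with columns X and Y.


def calc_cost_function(w, b, data):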
    m = len(data)
    cost = 0
    for i in range(m):
        x = data.iloc[i].X
        y = data.iloc[i].Y
        cost += ((x * w + b) - y)**2
    return cost / (2 * m)


def gradient_descent(w_current, b_current, data, LR):
    w_gradient = 0
    b_gradient = 0
    m = len(data)

    for i in range(m):
        x = data.iloc[i].X
        y = data.iloc[i].Y
        w_gradient += (1 / m) * x * ((w_current * x + b_current) - y)
        b_gradient += (1 / m) * ((w_current * x + b_current) - y)
    
    w = w_current - LR * w_gradient
    b = b_current - LR * b_gradient
    
    return w, b


# Run gradient descent
w = 0
b = 0
LR = 0.0001
epochs = 1000
for i in range(epochs):
    w, b = gradient_descent(w, b, bigdata, LR)

print(f"final w = {w},  b = {b}")
# Then I try to plot the cost surface:

w_data = np.linspace(2, 6, 100) 
b_data = np.linspace(0, 1, 100)
w_grid, b_grid = np.meshgrid(w_data, b_data)
j_data = np.zeros_like(w_grid)

for i in range(w_grid.shape[0]):
    for j in range(w_grid.shape[1]):
        j_data[i, j] = calc_cost_function(w_grid[i, j], b_grid[i, j], bigdata)

fig = go.Figure(data=[go.Surface(
    x=w_grid,
    y=b_grid,
    z=j_data,
    colorscale="Plasma"
)])
fig.show()
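As an aside, the double loop above calls calc_cost_function 10,000 times, and each call walks the DataFrame row by row, so it can be slow. A minimal vectorized sketch of the same cost grid follows. It assumes the X and Y columns can be pulled out as NumPy arrays; `cost_grid`, `x_demo` and `y_demo` are hypothetical names, and the synthetic data only stands in for bigdata, which is not shown here.

```python
import numpy as np


def cost_grid(x, y, w_grid, b_grid):
    """Half-MSE cost J(w, b) evaluated at every point of a meshgrid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Broadcasting gives residuals of shape (rows, cols, m).
    residuals = w_grid[..., None] * x + b_grid[..., None] - y
    return (residuals ** 2).sum(axis=-1) / (2 * len(x))


# Synthetic stand-in for bigdata (assumption: the real data is not shown).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 10, size=50)
y_demo = 4.0 * x_demo + 0.5 + rng.normal(0, 1, size=50)

w_grid, b_grid = np.meshgrid(np.linspace(2, 6, 100), np.linspace(0, 1, 100))
j_data = cost_grid(x_demo, y_demo, w_grid, b_grid)
print(j_data.shape)
```

With the real data this would presumably be called as cost_grid(bigdata.X.to_numpy(), bigdata.Y.to_numpy(), w_grid, b_grid), and the result passed to go.Surface as before.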

What I get:

The surface looks like a 3D U shape (curved in w, almost flat in b) instead of a symmetric bowl. I thought the cost function for linear regression should always be a paraboloid bowl.
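One way to check whether a stretched valley is expected is to look at the curvature of the cost directly. For J(w, b) = (1 / 2m) Σ (w·x + b − y)², the matrix of second derivatives is [[mean(x²), mean(x)], [mean(x), 1]], which depends only on X. The ratio of its eigenvalues says how much steeper the surface is in one direction than in the other. The sketch below uses synthetic X values as a stand-in for the real data; `cost_hessian` and `x_demo` are hypothetical names.

```python
import numpy as np


def cost_hessian(x):
    """Second-derivative matrix of the half-MSE cost with respect to (w, b)."""
    x = np.asarray(x, dtype=float)
    return np.array([[np.mean(x ** 2), np.mean(x)],
                     [np.mean(x), 1.0]])


# Synthetic stand-in for the X column (assumption: real data is not shown).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 10, size=200)

H = cost_hessian(x_demo)
eigvals = np.linalg.eigvalsh(H)      # eigenvalues in ascending order
condition = eigvals[-1] / eigvals[0]  # steepest curvature / flattest curvature
print("eigenvalues:", eigvals)
print("curvature ratio:", condition)
```

A ratio far above 1 means the surface is still a bowl mathematically, but one direction is so much flatter than the other that it looks like a trough on equal-range axes.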

What I tried:

Narrowing/expanding the ranges of w_data and b_data.

Verified gradient descent finds reasonable parameters: w ≈ 3.93, b ≈ 0.41.

My question:

Is my plotting code wrong (meshgrid ranges, cost calculation), or is it expected that the cost surface looks like a stretched valley rather than a bowl when there’s only one feature X and one bias term b?

If it’s expected, how can I make the "bowl shape" more visible in 3D?
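A possible way to get a rounder bowl is to standardize X (subtract the mean, divide by the standard deviation) before computing the cost. With standardized X the second-derivative matrix of the cost becomes the identity, so the surface curves equally in w and b. The sketch below assumes synthetic data in place of bigdata; `x_demo`, `y_demo`, `half_width` and the other names are hypothetical, and the grid is centered on the least-squares minimum so the bottom of the bowl sits in the middle of the plot.

```python
import numpy as np

# Synthetic stand-in for bigdata (assumption: the real data is not shown).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 10, size=200)
y_demo = 4.0 * x_demo + 0.5 + rng.normal(0, 1, size=200)

# Standardize X so it has mean 0 and variance 1.
mu, sigma = x_demo.mean(), x_demo.std()
x_std = (x_demo - mu) / sigma

# Curvature matrix of the cost in the standardized coordinates.
H_std = np.array([[np.mean(x_std ** 2), np.mean(x_std)],
                  [np.mean(x_std), 1.0]])

# Least-squares minimum in standardized coordinates (slope, intercept).
w_s, b_s = np.polyfit(x_std, y_demo, 1)

# Grid centered on the minimum, with the same half-width on both axes.
half_width = 2.0
w_grid, b_grid = np.meshgrid(
    np.linspace(w_s - half_width, w_s + half_width, 60),
    np.linspace(b_s - half_width, b_s + half_width, 60),
)
residuals = w_grid[..., None] * x_std + b_grid[..., None] - y_demo
j_std = (residuals ** 2).sum(axis=-1) / (2 * len(x_std))

# Map the standardized parameters back to the original X scale.
w_orig = w_s / sigma
b_orig = b_s - w_s * mu / sigma
print("curvature matrix:\n", H_std)
print("w, b on original scale:", w_orig, b_orig)
```

The arrays w_grid, b_grid and j_std can be passed to go.Surface in the same way as in the question. The trade-off is that the axes now show parameters for the standardized feature, so they have to be mapped back as in the last two lines to compare against the gradient descent result.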

  • Well, looks like your cost function is MSE, and for linear regression that should be a quadratic form. Usually that's an elliptic paraboloid (a bowl with elliptical contours), but if you take one axis and stretch it out, it becomes a U-shaped valley. The shape of the cost function is, if I remember correctly, a function of the covariance matrix of the data. Try generating some different random data samples -- do you get different-looking cost function plots? Commented Sep 6 at 4:57
  • @RobertDodier Yes. When I created a different sample I did get a bulge in the bottom of the plot, and once I centered the plot on the lowest z value I ended up with something like a bowl shape. Commented Sep 6 at 5:10
