0

I'm trying to implement a gradient descent algorithm that was previously written in matlab in python with numpy, but I'm getting a set of similar but different results.

Here's the matlab code

function [theta] = gradientDescentMulti(X, y, theta, alpha, num_iters)

m = length(y);
num_features = size(X,2);
for iter = 1:num_iters;
    temp_theta = theta;
    for i = 1:num_features
        temp_theta(i) = theta(i)-((alpha/m)*(X * theta - y)'*X(:,i));
    end
    theta = temp_theta;
end


end

and my python version

def gradient_descent(X,y, alpha, trials):

    m = X.shape[0]
    n = X.shape[1]
    theta = np.zeros((n, 1))

    for i in range(trials):

        temp_theta = theta
        for p in range(n):
            thetaX = np.dot(X, theta)
            tMinY = thetaX-y
            temp_theta[p] = temp_theta[p]-(alpha/m)*np.dot(tMinY.T, X[:,p:p+1])

        theta = temp_theta

    return theta

Test case and results in matlab

X = [1 2 1 3; 1 7 1 9; 1 1 8 1; 1 3 7 4]
y = [2 ; 5 ; 5 ; 6];
[theta] = gradientDescentMulti(X, y, zeros(4,1), 0.01, 1);

theta =

    0.0450
    0.1550
    0.2225
    0.2000

test case and result in python

test_X = np.array([[1,2,1,3],[1,7,1,9],[1,1,8,1],[1,3,7,4]])
test_y = np.array([[2], [5], [5], [6]])
theta, cost = gradient_descent(test_X, test_y, 0.01, 1)
print theta
>>[[ 0.045     ]
  [ 0.1535375 ]
  [ 0.20600144]
  [ 0.14189214]]
2
  • @Kartik "MATLAB results might be wrong" is really not a helpful suggestion, without a detailed reason. Commented Aug 13, 2016 at 3:35
  • My comment was misunderstood. I was sharing my experience with you, and I was suggesting that you could use another software to resolve this, if you can. (I am aware that you may not have access to another software.) Explaining why MATLAB results were wrong when I tried a simple histogram is something I didn't have the ken to research or figure out at that time. I blamed it on the closed source nature of MATLAB, and presumed something slipped through their tests, and continued on with Python, which I felt much more "at home" using. Commented Aug 13, 2016 at 3:55

1 Answer 1

8

This line in your Python:

    temp_theta = theta

doesn't do what you think it does. It doesn't make a copy of theta and "assign" it to the "variable" temp_theta -- it just says "temp_theta is now a new name for the object currently named by theta".

So when you modify temp_theta here:

        temp_theta[p] = temp_theta[p]-(alpha/m)*np.dot(tMinY.T, X[:,p:p+1])

You're actually modifying theta -- because there's only the one array, now with two names.

If you instead write

    temp_theta = theta.copy()

you'll get something like

(3.5) dsm@notebook:~/coding$ python peter.py
[[ 0.045 ]
 [ 0.155 ]
 [ 0.2225]
 [ 0.2   ]]

which matches your Matlab results.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.