
It seems that the following code performs gradient descent correctly:

import numpy as np

def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        # cost is printed only for monitoring; it is not used in the update
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # average gradient over all examples
        gradient = np.dot(xTrans, loss) / m
        # simultaneous update of all parameters
        theta = theta - alpha * gradient
    return theta

Now suppose we have the following sample data:

[Image: table of sample training data; its first row has features x = [2104, 5, 1, 45] and target y = 460.]

For the 1st row of the sample data, we will have x = [2104, 5, 1, 45], theta = [1, 1, 1, 1], and y = 460. However, we never specify in the lines:

hypothesis = np.dot(x, theta)
loss = hypothesis - y

which row of the sample data to consider. How, then, does this code work correctly?

3 Answers


First: Congrats on taking the course on Machine Learning on Coursera! :)

hypothesis = np.dot(x, theta) will compute the hypothesis for all x(i) at the same time, storing each h_theta(x(i)) as the i-th entry of hypothesis. So there is no need to reference a single row.

The same is true for loss = hypothesis - y.
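A quick sketch of this, with made-up numbers (not the data from the question), showing that one dot product covers every row at once:

```python
import numpy as np

# hypothetical 3x2 design matrix: 3 examples, 2 features
x = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
theta = np.array([0.5, 1.0])

# one call computes h_theta(x(i)) for every example i
hypothesis = np.dot(x, theta)   # array([2.5, 3.5, 4.5])

y = np.array([2.0, 4.0, 5.0])
loss = hypothesis - y           # array([ 0.5, -0.5, -0.5])
```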


2 Comments

Does this mean that x is an m*n matrix (m = number of samples and n = number of features) and y an m*1 matrix?
With some caution: yes! Is it possible to debug your code and take a closer look at x and y? If so, try it and see for yourself. I presume it is, because if x and y are not m*n and m*1, then gradient descent, as defined in this function, would not make any sense.
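One way to check the shapes yourself, assuming numpy arrays and made-up dimensions:

```python
import numpy as np

# hypothetical data: m = 4 examples, n = 3 features
m, n = 4, 3
x = np.random.rand(m, n)
y = np.random.rand(m)
theta = np.ones(n)

print(x.shape)                 # (4, 3)  -> m*n
print(y.shape)                 # (4,)    -> a length-m vector (m*1)
print(np.dot(x, theta).shape)  # (4,)    -> one hypothesis per example
```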

This looks like a slide from Andrew Ng's excellent Machine Learning course!

The code works because you're using array types (from the numpy library?), whose basic operators (+, -, *, /) have been overloaded to perform elementwise and matrix arithmetic, so you don't need to iterate over each row.
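A small illustration of that operator overloading, assuming numpy arrays (the values here are arbitrary):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# numpy overloads the arithmetic operators to work elementwise,
# so no explicit loop over the entries is needed
print(a + b)   # [5. 7. 9.]
print(a - b)   # [-3. -3. -3.]
print(a * b)   # [ 4. 10. 18.]
```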



A hypothesis y is represented by y = w0 + w1*x1 + w2*x2 + w3*x3 + ... + wn*xn, where w0 is the intercept. How is the intercept figured out in the hypothesis formula above, in np.dot(x, theta)?

I am assuming X = data representing features, and theta can be an array like [1, 1, 1, ...] of rowSize(data).
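One common convention (and the one used in the course) is to prepend a constant column of ones to x, so that theta[0] multiplies x0 = 1 and the dot product absorbs the intercept w0. A sketch under that assumption, with made-up numbers:

```python
import numpy as np

# hypothetical raw features for 3 examples, 2 features each
features = np.array([[2104.0, 5.0],
                     [1416.0, 3.0],
                     [1534.0, 3.0]])

# prepend x0 = 1 so theta[0] acts as the intercept w0
x = np.column_stack([np.ones(len(features)), features])

theta = np.array([10.0, 0.1, 2.0])   # [w0, w1, w2]

# np.dot(x, theta) = w0*1 + w1*x1 + w2*x2 for each row
hypothesis = np.dot(x, theta)
print(hypothesis)   # [230.4 157.6 169.4]
```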

