
It seems that the following code performs gradient descent correctly:

import numpy as np

def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        # cost is printed only for monitoring; it is not used in the update
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # average gradient over all examples
        gradient = np.dot(xTrans, loss) / m
        # simultaneous update of all parameters
        theta = theta - alpha * gradient
    return theta

Now suppose we have the following sample data:

[Image: table of sample training data; its first row has features x = [2104, 5, 1, 45] and target y = 460.]

For the 1st row of the sample data, we will have x = [2104, 5, 1, 45], theta = [1, 1, 1, 1], and y = 460. However, we never specify in the lines:

hypothesis = np.dot(x, theta)
loss = hypothesis - y

which row of the sample data to consider. How, then, does this code work correctly?

3 Answers


First: Congrats on taking the course on Machine Learning on Coursera! :)

hypothesis = np.dot(x, theta) will compute the hypothesis for all x(i) at the same time, storing each h_theta(x(i)) as the i-th entry of hypothesis. So there is no need to reference a single row.

The same is true for loss = hypothesis - y.
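A quick sketch of this, with made-up numbers (not the data from the question), showing that one dot product covers every row at once:

```python
import numpy as np

# hypothetical 3x2 design matrix: 3 examples, 2 features
x = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
theta = np.array([0.5, 1.0])

# one call computes h_theta(x(i)) for every example i
hypothesis = np.dot(x, theta)   # array([2.5, 3.5, 4.5])

y = np.array([2.0, 4.0, 5.0])
loss = hypothesis - y           # array([ 0.5, -0.5, -0.5])
```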


2 Comments

Does this mean that x is an m*n matrix (m = number of samples and n = number of features) and y an m*1 matrix?
With some caution: yes! Is it possible to debug your code and take a closer look at x and y? If so, try it and see for yourself. I presume it is, because if x and y are not m*n and m*1, then gradient descent, as defined in this function, would not make any sense.
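One way to check the shapes yourself, assuming numpy arrays and made-up dimensions:

```python
import numpy as np

# hypothetical data: m = 4 examples, n = 3 features
m, n = 4, 3
x = np.random.rand(m, n)
y = np.random.rand(m)
theta = np.ones(n)

print(x.shape)                 # (4, 3)  -> m*n
print(y.shape)                 # (4,)    -> a length-m vector (m*1)
print(np.dot(x, theta).shape)  # (4,)    -> one hypothesis per example
```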

This looks like a slide from Andrew Ng's excellent Machine Learning course!

The code works because you're using array types (from the numpy library?), whose basic operators (+, -, *, /) have been overloaded to perform elementwise and matrix arithmetic, so you don't need to iterate over each row.
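A small illustration of that operator overloading, assuming numpy arrays (the values here are arbitrary):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# numpy overloads the arithmetic operators to work elementwise,
# so no explicit loop over the entries is needed
print(a + b)   # [5. 7. 9.]
print(a - b)   # [-3. -3. -3.]
print(a * b)   # [ 4. 10. 18.]
```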



A hypothesis y is represented by y = w0 + w1*x1 + w2*x2 + w3*x3 + ... + wn*xn, where w0 is the intercept. How is the intercept figured out in the hypothesis formula above, in np.dot(x, theta)?

I am assuming X = data representing features, and theta can be an array like [1, 1, 1, ...] of rowSize(data).
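One common convention (and the one used in the course) is to prepend a constant column of ones to x, so that theta[0] multiplies x0 = 1 and the dot product absorbs the intercept w0. A sketch under that assumption, with made-up numbers:

```python
import numpy as np

# hypothetical raw features for 3 examples, 2 features each
features = np.array([[2104.0, 5.0],
                     [1416.0, 3.0],
                     [1534.0, 3.0]])

# prepend x0 = 1 so theta[0] acts as the intercept w0
x = np.column_stack([np.ones(len(features)), features])

theta = np.array([10.0, 0.1, 2.0])   # [w0, w1, w2]

# np.dot(x, theta) = w0*1 + w1*x1 + w2*x2 for each row
hypothesis = np.dot(x, theta)
print(hypothesis)   # [230.4 157.6 169.4]
```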

