
I'm trying to fit a parabola to a simple generated dataset using linear regression, but no matter what I do, the curve I get straight out of the model turns out to be an incomprehensible mess.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

#xtrain, ytrain datasets have been generated earlier

model = LinearRegression(fit_intercept = True)
model.fit(np.hstack([xtrain, xtrain**2]), ytrain)  
xfit = np.linspace(-3,3,20)  
yfit = model.predict(np.hstack([xtrain, xtrain**2]))
plt.plot(xfit, yfit)
plt.scatter(xtrain, ytrain, color="black")

This code outputs the following graph:

Code output

However, when I manually generate the plot from the coefficients that the model produces, by simply changing the following line of code, I get exactly the result I want:

yfit = model.coef_[0]*xfit + model.coef_[1]*xfit**2 + model.intercept_

Manual output

This seems like a bit of a clunky way of going about things, so I'd like to learn how to generate the curve properly. I think the issue must be the discrete nature of my data, but I haven't been able to figure it out on my own.
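As a side note, the manual expression is just evaluating the fitted quadratic, which can also be written with `np.polyval`. A minimal sketch (the coefficients here are placeholders, since the actual fitted values aren't shown in the question):

```python
import numpy as np

# Placeholder coefficients standing in for model.coef_ and model.intercept_;
# the real fitted values were not shown in the question.
coef = np.array([-1.0, 2.0])   # [linear term, quadratic term]
intercept = 0.5

xfit = np.linspace(-3, 3, 20)

# np.polyval expects coefficients from highest degree down:
# [quadratic, linear, intercept]
yfit = np.polyval([coef[1], coef[0], intercept], xfit)

# Identical to the manual expression from the question:
manual = coef[0]*xfit + coef[1]*xfit**2 + intercept
```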

  • there's a typo, it should be model.predict(np.hstack([xfit, xfit**2])) Commented Nov 8, 2020 at 15:04

1 Answer


Here is your bug fixed. Note that `xfit` from `np.linspace` is one-dimensional, so it also needs a column axis to match the shape that `fit()` was given:

xfit = np.linspace(-3, 3, 20)[:, np.newaxis]
yfit = model.predict(np.hstack([xfit, xfit**2]))

In your code you were calling `predict` on the training features, so you plotted xfit values on the X-axis against f(xtrain) on the Y-axis; the two arrays don't correspond, which is why the curve comes out scrambled.
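For completeness, here is a self-contained sketch of the whole fit. The data generation is an assumption, since the original `xtrain`/`ytrain` code wasn't shown; the point is that the prediction grid must have the same feature layout (a column of x and a column of x²) as the training features:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data; the question's generation code was not shown.
rng = np.random.default_rng(0)
xtrain = rng.uniform(-3, 3, size=(50, 1))   # shape (50, 1), as hstack/fit expect
ytrain = 2 * xtrain[:, 0]**2 - xtrain[:, 0] + rng.normal(0, 1, 50)

model = LinearRegression(fit_intercept=True)
model.fit(np.hstack([xtrain, xtrain**2]), ytrain)

# Build the prediction grid with the SAME feature layout used for fitting.
xfit = np.linspace(-3, 3, 20)[:, np.newaxis]   # column vector, shape (20, 1)
yfit = model.predict(np.hstack([xfit, xfit**2]))  # predict on xfit, not xtrain
```

After this, `plt.plot(xfit, yfit)` from the question draws a smooth parabola, because each predicted value now corresponds to the x position it is plotted at.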
