How to apply linear regression with a fixed x-intercept in Python?

Question

I've found quite a few examples of fitting a linear regression with zero intercept.

However, I would like to fit a linear regression with a fixed x-intercept. In other words, the regression will start at a specific x.

I have the following code for plotting.

import numpy as np
import matplotlib.pyplot as plt

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0])


ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])


def best_fit_slope_and_intercept(xs, ys):
    # m = xs.dot(ys)/xs.dot(xs)
    m = (((np.average(xs)*np.average(ys)) - np.average(xs*ys)) /
         ((np.average(xs)*np.average(xs)) - np.average(xs*xs)))
    b = np.average(ys) - m*np.average(xs)
    return m, b


def rSquaredValue(ys_orig, ys_line):
    def sqrdError(ys_orig, ys_line):
        return np.sum((ys_line - ys_orig) * (ys_line - ys_orig))
    yMeanLine = np.average(ys_orig)
    sqrtErrorRegr = sqrdError(ys_orig, ys_line)
    sqrtErrorYMean = sqrdError(ys_orig, yMeanLine)
    return 1 - (sqrtErrorRegr/sqrtErrorYMean)


m, b = best_fit_slope_and_intercept(xs, ys)
regression_line = m*xs+b

r_squared = rSquaredValue(ys, regression_line)
print(r_squared)

plt.plot(xs, ys, 'bo')
# Normal best fit
plt.plot(xs, m*xs+b, 'r-')
# Zero intercept
plt.plot(xs, m*xs, 'g-')
plt.show()

And I want something like the following, where the regression line starts at (5, 0).

Thank You. Any and all help is appreciated.

Jordi Fuentes · Accepted Answer · 2019-12-26 23:02:06Z

1

I've been thinking about this for some time and I've found a possible workaround to the problem.

If I understood well, you want to find slope and intercept of the linear regression model with a fixed x-axis intercept.

Assuming that's the case (say you want the x-axis intercept to take the value forced_intercept), it's as if you shifted all the points by -forced_intercept along the x-axis and then forced scikit-learn to use a y-axis intercept of 0. That gives you the slope. To find the intercept, solve y=ax+b for b at the point (forced_intercept,0), which gives b=-a*forced_intercept (where a is the slope). In code (notice the xs reshaping):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0]).reshape((-1,1)) #scikit-learn expects a 2-D array of shape (n_samples, n_features); without the reshape, fit() raises a ValueError.


ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])

forced_intercept = 5 #as you provided in your example of (5,0)

new_xs = xs - forced_intercept #here we "move" all the points
model = LinearRegression(fit_intercept=False).fit(new_xs, ys) #force an intercept of 0
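r = model.score(new_xs,ys) #coefficient of determination (R^2) of the constrained fit
a = model.coef_ #slope, returned as a length-1 array

b = -1 * a * forced_intercept #here we find the y-intercept so that the line contains (forced_intercept,0)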

print(r,a,b)
plt.plot(xs,ys,'o')
plt.plot(xs,a*xs+b)
plt.show()
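
For comparison, the same shift-and-fit idea can be sketched with NumPy alone. This is only a minimal sketch, assuming an ordinary least-squares fit of y = a*(x - x0); the helper name fit_through_x_intercept is hypothetical and not part of any library. For a line forced through (x0, 0), the least-squares slope has the closed form a = sum((x - x0)*y) / sum((x - x0)^2).

```python
import numpy as np

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
               20.0, 40.0, 60.0, 80.0])
ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
               3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
               11.073788414382639, 23.248479770546009, 32.120462301367183,
               44.036117671229206, 54.009003143831116, 102.7077685684846,
               185.72880217806673, 256.12183145545811, 301.97120103079675])


def fit_through_x_intercept(xs, ys, x0):
    """Hypothetical helper: least-squares line y = a*(x - x0), forced through (x0, 0)."""
    shifted = xs - x0                            # move the forced intercept to the origin
    a = shifted.dot(ys) / shifted.dot(shifted)   # zero-intercept slope on the shifted data
    b = -a * x0                                  # y-intercept of the line in the original coordinates
    return a, b


a, b = fit_through_x_intercept(xs, ys, 5.0)
print(a, b)
```

Here xs stays one-dimensional, so no reshape is needed, and a and b come back as plain scalars.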

Hope this is what you were looking for.

edited Dec 26, 2019 at 23:02

answered Dec 26, 2019 at 22:57

Jordi Fuentes


5 Comments

Abrian Abir Over a year ago

Thank you for the response. Just for my understanding: if we start at (5,0), shouldn't the line lean more to the left, because there are more data points on the left? The code above only seems to have shifted the linear regression to the right.

Abrian Abir Over a year ago

Also the linear regression fails with the following dataset: xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 11.0, 12.0, 13.0, 30.0]).reshape((-1, 1)) and ys = np.array([20., 25., 10., 3., 300., 6., 200., 210., 220., 230., 240., 250., 300., 310., 320.])

Jordi Fuentes Over a year ago

It's not that we just shifted, we shifted and forced the regression to go over the origin (which was (forced_intercept,0)), and then "unshifted". Just shifting would be what we've done without forcing it to go through the origin (fit_intercept = True).

Abrian Abir Over a year ago

I understand and that's not exactly what I meant, but please have a look at the dataset of the previous comment - The line starts at the right place but isn't affected by the weights of the data.

Jordi Fuentes Over a year ago

Yes, I see the problem. Sorry I didn't answer; I was thinking about it. I understand what you mean and you're absolutely right. If I find what fails, I'll tell you.

Alexandr Abramov · Accepted Answer · 2019-12-26 23:25:32Z

1

Maybe this approach will be useful.

import numpy as np
import matplotlib.pyplot as plt

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0])

ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])

# First we add the anchor point to the set of points.
xs = np.append(xs, [5.])
ys = np.append(ys, [0.])

# Then we prepare the coefficient matrix according to the docs
# https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html
A = np.vstack([xs, np.ones(len(xs))]).T

# Then we prepare weights for these points. And we put all weights
# equal except the last one (for added anchor point).
# In this example its weight is 1000 times larger than the others.
W = np.diag(np.ones([len(xs)]))
W[-1,-1] = 1000.

# And we find least-squares solution.
m, c = np.linalg.lstsq(np.dot(W, A), np.dot(W, ys), rcond=None)[0]

plt.plot(xs, ys, 'o', label='Original data', markersize=10)
plt.plot(xs, m * xs + c, 'r', label='Fitted line')
plt.show()
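
As a possible shorthand for the weighting step above, np.polyfit accepts a w argument whose weights multiply the unsquared residuals, which is the same effect as multiplying the rows of A and ys by W. The sketch below is an illustration under that assumption rather than a replacement for the code above; the names x0, anchor_weight, xs_aug and ys_aug are introduced here only for the example. It also prints the resulting x-intercept, which with a large but finite weight is only approximately equal to the anchor.

```python
import numpy as np

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
               20.0, 40.0, 60.0, 80.0])
ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
               3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
               11.073788414382639, 23.248479770546009, 32.120462301367183,
               44.036117671229206, 54.009003143831116, 102.7077685684846,
               185.72880217806673, 256.12183145545811, 301.97120103079675])

x0 = 5.0                # assumed anchor: the line should pass close to (x0, 0)
anchor_weight = 1000.0  # same relative weight as in the code above

# Append the anchor point and give it a much larger weight than the data
xs_aug = np.append(xs, x0)
ys_aug = np.append(ys, 0.0)
w = np.ones(len(xs_aug))
w[-1] = anchor_weight

# Degree-1 weighted fit; polyfit returns the slope first, then the intercept
m, c = np.polyfit(xs_aug, ys_aug, 1, w=w)

# Where the fitted line actually crosses the x-axis
x_intercept = -c / m
print(m, c, x_intercept)
```

Because the anchor is enforced by a penalty rather than as a hard constraint, increasing anchor_weight pulls x_intercept closer to x0.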

answered Dec 26, 2019 at 23:25

Alexandr Abramov


3 Comments

Abrian Abir Over a year ago

Thank you for the response. Just for my understanding: if we start at (5,0), shouldn't the line lean more to the left, because there are more data points on the left?

Abrian Abir Over a year ago

The code above only seems to have shifted the linear regression to the right.

Abrian Abir Over a year ago

Also the linear regression fails with the following dataset: xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 11.0, 12.0, 13.0, 30.0]) and ys = np.array([20., 25., 10., 3., 300., 6., 200., 210., 220., 230., 240., 250., 300., 310., 320.])

NotAName · Accepted Answer · 2019-12-26 22:16:18Z

0

If you use scikit-learn for the linear regression task, it's possible to define the intercept(s) using the intercept_ attribute.

answered Dec 26, 2019 at 22:16

NotAName


Suuuehgi · Accepted Answer · 2023-04-02 00:31:04Z

0

from matplotlib import pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

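# Synthetic data: slope 1, y-intercept 3.5, plus Gaussian noise
X = np.linspace(0,10, 100)
Y = X + np.random.randn(100) + 3.5

# Model with the y-intercept fixed at 3.5; only the slope a is fitted
lin = lambda x, a: a * x + 3.5
slope = curve_fit(lin, X, Y)[0][0]

# Plot the data together with the fitted line
plt.plot(X, Y, X, [slope * x + 3.5 for x in X])
plt.show()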
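
The snippet above fixes the y-intercept. A possible adaptation to the fixed x-intercept asked about in the question is sketched below, assuming the model y = a*(x - X0) so that the line crosses the x-axis at X0. The names X0 and line_through_x0, the seeded random generator, and the synthetic data with a true slope of 2 are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

X0 = 5.0  # assumed fixed x-intercept, i.e. the line must pass through (X0, 0)


def line_through_x0(x, a):
    """Line with slope a that is zero at x = X0; only a is fitted."""
    return a * (x - X0)


# Synthetic data for illustration: true slope 2, x-intercept X0, Gaussian noise
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100)
Y = 2.0 * (X - X0) + rng.normal(size=100)

# curve_fit returns the optimal parameters and their covariance matrix
popt, pcov = curve_fit(line_through_x0, X, Y)
slope = popt[0]
print(slope)
```

Since the model is linear in a, a nonlinear solver is not strictly needed here, but it keeps the same pattern as the snippet above and extends to models where the constraint is harder to write in closed form.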

answered Apr 2, 2023 at 0:31

Suuuehgi
