
Here is my scenario.

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

The data above is my data model.

The x-coordinate is "Salary" and the y-coordinate is "Expenses".

I want to predict the expenses when I give a "Salary", i.e. the x-coordinate.

Here is my sample code. Please help me out.

from sklearn.linear_model import LinearRegression

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

salary=[]
expenses=[]

for dataset in data:
    # import pdb; pdb.set_trace()
    salary.append(dataset[0])
    expenses.append(dataset[1])

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict([10200.00])
print(prediction)

The error I got:

ValueError: Expected 2D array, got 1D array instead:
array=[ 25593.14  98411.    71498.8   38068.    58188.    10220.  ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample


  • and what is the problem? Commented Feb 20, 2018 at 17:29
  • Edited my question Commented Feb 20, 2018 at 17:31
  • The error tells you what to do and what the problem is (and there are probably 5 other questions here in regards to this error). I highly recommend reading sklearn's docs to see what shapes are expected. After that, read some numpy docs to not do that list-append stuff you are doing! Commented Feb 20, 2018 at 17:33
  • To add to the above, the error occurs in the first argument of this line: model.fit(salary, expenses). It expects a matrix of training data for the first argument, "X". This may help Commented Feb 20, 2018 at 17:36

2 Answers


As suggested in the comments, something like this is a better way to prepare data you want to feed into a scikit-learn model: build a NumPy array, then reshape the feature into a 2D column. Another example can be seen here.

from sklearn.linear_model import LinearRegression
import numpy as np

data = np.array(
        [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]
).T

salary = data[0].reshape(-1, 1)
expenses = data[1]

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict(np.array([10200.00]).reshape(-1, 1))
print(prediction)
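As a variation on the code above, here is a minimal sketch that skips the transpose and slices the columns directly. The names X and y are my own choice, not from the question. Indexing with a list, data[:, [0]], keeps the result 2D, so no reshape is needed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

data = np.array(
    [[25593.14, 39426.66],
     [98411.00, 81869.75],
     [71498.80, 62495.80],
     [38068.00, 54774.00],
     [58188.00, 43453.65],
     [10220.00, 18465.25]]
)

# Indexing with a list of columns keeps two dimensions: shape (6, 1)
X = data[:, [0]]
# The target can stay 1D: shape (6,)
y = data[:, 1]

model = LinearRegression()
model.fit(X, y)

# predict() also expects 2D input: one row per sample, one column per feature
prediction = model.predict([[10200.00]])

print(X.shape, y.shape)
print(model.coef_, model.intercept_)
print(prediction)
```

The fitted slope and intercept are available as model.coef_ and model.intercept_, which is a quick way to sanity-check the prediction by hand.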

1 Comment

Thanks for your help

Quick fix: replace the model.fit line with this (it needs import numpy as np at the top):

model.fit(np.array(salary).reshape(-1, 1), np.array(expenses))

X is expected to be an array of arrays, array([arr1, arr2, arr3, ...]), where each of arr1, arr2, ... is an array holding at least one feature. y should be an array of target values, array([label1, label2, label3, ...]). The same applies to the input of model.predict, e.g. model.predict([[10200.00]]).
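To make the expected shapes concrete, here is a small NumPy-only sketch comparing three ways of turning the salary list into an array. The variable names flat, wrapped and column are just for illustration. Only the last layout has one row per sample and one column per feature, which is what fit expects for X when there is a single feature.

```python
import numpy as np

salary = [25593.14, 98411.00, 71498.80, 38068.00, 58188.00, 10220.00]

# Plain conversion gives a 1D array; this is what triggers the ValueError
flat = np.array(salary)

# Wrapping the list in another list gives ONE sample with SIX features
wrapped = np.array([salary])

# reshape(-1, 1) gives SIX samples with ONE feature each
column = flat.reshape(-1, 1)

print(flat.shape, wrapped.shape, column.shape)
# (6,) (1, 6) (6, 1)
```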

1 Comment

Thanks for your help.
