
I am working on a LinearRegression model to fill the null values for the feature Rupeepersqft. When I run the code, I am receiving this error:

IndexError                                Traceback (most recent call last)
<ipython-input-20-33d4e6d2998e> in <module>()
      1 test_data = data_with_null.iloc[:,:3]
----> 2 Rupeepersqft_predicted['Rupeepersqft'] = pd.DataFrame(linreg.predict(test_data))

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

This is the code which gives me the error:

from sklearn.linear_model import LinearRegression
linreg = LinearRegression()

data_with_null = data2[['Price (Lakhs)','Area','Area Type','Rupeepersqft','Condition','Purchase Type','Real Estate Regulation Act']].dropna()
data_without_null =  data_with_null.dropna()

train_data_x = data_without_null.iloc[:,:3]
train_data_y = data_without_null.iloc[:,3]

linreg.fit(train_data_x, train_data_y)

test_data = data_with_null.iloc[:,:3]
Rupeepersqft_predicted['Rupeepersqft'] = pd.DataFrame(linreg.predict(test_data))

data_with_null.Rupeepersqft.fillna(Rupeepersqft_predicted, inplace=True)

This is what the data looks like:

Data2

Can anyone help me out with this?

  • Rupeepersqft_predicted['Rupeepersqft'] would be fine if it were selecting a column from a DataFrame (or a value from a dict). But apparently it is an array, which doesn't accept string indexing (unless it's a structured array). Commented Nov 18, 2021 at 19:50
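The point made in this comment can be reproduced in isolation. The sketch below is only an illustration: predicted_array and predicted_frame are hypothetical names, and the numbers are made up to stand in for the output of linreg.predict, which is a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for linreg.predict(...), which returns a NumPy array.
predicted_array = np.array([4500.0, 5200.0, 6100.0])

# A plain (non-structured) NumPy array rejects string keys.
try:
    predicted_array['Rupeepersqft'] = [1.0, 2.0, 3.0]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)

# A DataFrame does accept a string key: it creates or overwrites that column.
predicted_frame = pd.DataFrame(index=range(len(predicted_array)))
predicted_frame['Rupeepersqft'] = predicted_array
print(predicted_frame)
```

If Rupeepersqft_predicted in the question was created earlier as an array, this is likely the source of the IndexError.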

1 Answer


To assign values to a column of a pandas DataFrame, you can use the indexers loc (label-based) and iloc (position-based). Assuming Rupeepersqft_predicted is a DataFrame, try changing

Rupeepersqft_predicted['Rupeepersqft'] = pd.DataFrame(linreg.predict(test_data))

to:

Rupeepersqft_predicted.loc[:, 'Rupeepersqft'] = pd.DataFrame(linreg.predict(test_data))

which will choose all the rows (the :) and the column Rupeepersqft, and assign the values on the right-hand side.

Or, using iloc:

Rupeepersqft_predicted.iloc[:, 1] = pd.DataFrame(linreg.predict(test_data))

to assign it to all rows (again via :) of the column at position 1, which is the second column of the DataFrame because positions are zero-based.
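As a small illustration of the two indexers, here is a sketch on a made-up DataFrame. The name frame, its columns, and all numbers are hypothetical. The right-hand side is a plain NumPy array, so pandas has no index to align on and assigns the values by position.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the target column starts out empty (NaN).
frame = pd.DataFrame({
    'Area': [900.0, 1200.0, 1500.0],
    'Rupeepersqft': [np.nan, np.nan, np.nan],
})
values = np.array([4500.0, 5200.0, 6100.0])

# Label-based: all rows, the column named 'Rupeepersqft'.
frame.loc[:, 'Rupeepersqft'] = values
by_label = frame['Rupeepersqft'].tolist()

# Position-based: all rows, the column at position 1 (zero-based).
frame.iloc[:, 1] = values * 2
by_position = frame['Rupeepersqft'].tolist()
second_column_name = frame.columns[1]

print(by_label, by_position, second_column_name)
```

Both assignments write to the same column here, because 'Rupeepersqft' happens to sit at position 1 in this toy frame.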

Just make sure the values on the right are of the same length as the column you try to assign it to.
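One way to keep the lengths matched is to predict only for the rows that are missing and assign through a boolean mask. The following is a minimal sketch under stated assumptions, not the asker's actual pipeline: the data values are invented, only two numeric feature columns are used, and the names data, feature_columns, missing, and filled are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data; two of the target values are missing.
data = pd.DataFrame({
    'Price (Lakhs)': [40.0, 55.0, 70.0, 85.0, 60.0, 90.0],
    'Area': [800.0, 1000.0, 1300.0, 1500.0, 1100.0, 1600.0],
    'Rupeepersqft': [5000.0, 5500.0, np.nan, 5667.0, np.nan, 5625.0],
})

feature_columns = ['Price (Lakhs)', 'Area']  # assumed to be numeric
missing = data['Rupeepersqft'].isna()        # True where the target is null

# Fit on the rows where the target is known.
linreg = LinearRegression()
linreg.fit(data.loc[~missing, feature_columns], data.loc[~missing, 'Rupeepersqft'])

# Predict only for the rows where the target is missing, so the number of
# predictions equals the number of cells being filled.
predictions = linreg.predict(data.loc[missing, feature_columns])

filled = data.copy()
filled.loc[missing, 'Rupeepersqft'] = predictions
print(filled)
```

Because predictions is a NumPy array with one entry per masked row, the assignment is positional and no index alignment is involved.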

You can find more on Pandas in this book.

Cheers


1 Comment

You are welcome, and I'm really happy it helped you. Please consider accepting this answer (by clicking the "v" next to the answer) to let other people who encounter the same issue know that it solves it. Cheers mate.
