$\begingroup$

I have a Pandas dataframe with 10 columns, 9 of which are features to be used to predict the 10th column.

How can I convert this Pandas dataframe into X and y to use in a linear regression problem?

$\endgroup$

2 Answers

$\begingroup$

If your dataframe is loaded as the variable df, you can select the columns directly:

X = df[['A','B','C']]
y = df['Z']

where A, B and C are your independent variables and Z is your dependent variable.
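With nine feature columns, listing every name by hand gets tedious. The following is a minimal sketch of two alternatives, assuming the target is the last column. The column names `f0` to `f8` and `target`, and the random data, are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-column frame: f0..f8 are features, "target" is the label.
rng = np.random.default_rng(0)
cols = [f"f{i}" for i in range(9)] + ["target"]
df = pd.DataFrame(rng.normal(size=(20, 10)), columns=cols)

# Option 1: drop the target column by name, keep everything else as features.
X = df.drop(columns="target")
y = df["target"]

# Option 2: select by position (all columns but the last, then the last one).
X_pos = df.iloc[:, :-1]
y_pos = df.iloc[:, -1]

print(X.shape, y.shape)
```

Both options give a DataFrame of features and a Series for the target. Selecting by name is usually safer, because it does not depend on the column order.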

$\endgroup$
  • $\begingroup$ Is it possible to pass X and y to train_test_split afterwards? $\endgroup$ Commented Sep 2, 2019 at 7:39
  • $\begingroup$ Yes, it is possible. Please see sklearn's documentation for more details. Have a look here: scikit-learn.org/stable/modules/generated/… $\endgroup$ Commented Sep 2, 2019 at 7:41
$\begingroup$

Are you looking for this?

# Format the data as NumPy arrays to feed into the algorithm
import numpy as np
# One row per sample, one column per feature: shape (n_samples, 3)
X = df[['Ind1', 'Ind2', 'Ind3']].to_numpy()
y = df['Dep'].to_numpy()  # shape (n_samples,)

Or, more simply, call to_numpy() on the index or on a single column:

# array(['a', 'b', 'c'], dtype=object)
arr = df.index.to_numpy()

#  array([1, 2, 3])
arr = df['A'].to_numpy()
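To show how the converted arrays feed into a linear regression, here is a minimal sketch that fits ordinary least squares with `np.linalg.lstsq`. This is a swapped-in technique rather than anything from the answer above, and the data is synthetic: `Dep` is built as an exact linear function of the three hypothetical columns so the fit is easy to check.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration):
# Dep = 2*Ind1 - 1*Ind2 + 0.5*Ind3 + 3, with no noise.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["Ind1", "Ind2", "Ind3"])
df["Dep"] = 2.0 * df["Ind1"] - 1.0 * df["Ind2"] + 0.5 * df["Ind3"] + 3.0

# Features as an (n_samples, n_features) array, target as a 1-D array.
X = df[["Ind1", "Ind2", "Ind3"]].to_numpy()
y = df["Dep"].to_numpy()

# Append a column of ones so the last coefficient acts as the intercept.
A = np.column_stack([X, np.ones(len(X))])

# Solve the least-squares problem A @ coef ~= y.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)
```

The first three entries of `coef` are the slopes for `Ind1`, `Ind2` and `Ind3`, and the last entry is the intercept. An estimator such as scikit-learn's LinearRegression expects X and y in the same shapes.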
$\endgroup$
