I am trying to implement logistic regression, but I am getting the wrong plot.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
sns.set()
# 400 random integer features in [0, 2000) as a single column
x = (np.random.randint(2000, size=400)).reshape((400,1))
# 400 random binary labels (0 or 1) as a flat array
y = (np.random.randint(2, size=400)).reshape((400,1)).ravel()
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
logistic_regr = LogisticRegression()
logistic_regr.fit(x_train, y_train)
fig, ax = plt.subplots()
ax.set(xlabel='x', ylabel='y')
ax.plot(x_test, logistic_regr.predict_proba(x_test), label='Logistic regr')
#ax.plot(x_test,logistic_regr.predict(x_test), label='Logistic regr')
ax.legend()
This gives me the following plot:
If I instead use:
ax.plot(x_test, logistic_regr.predict(x_test), label='Logistic regr')
I get:



0, that's why you are having this plot. Your training data is completely random and your target is only made of 0 and 1 and you want it to be a linear regression. So the regression is a line and it predicts either always 0 or always 1.

np.linspace(0,1,400).ravel() it throws Unknown label type

0 or 1. Not values in between.

np.random.randint( ) returns only integers

logistic_regr.predict_proba shouldn't it find a probability between [0,1]? Regardless of my target?
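
The points raised in these comments can be checked with a small sketch. It is not a definitive fix, and it assumes a recent scikit-learn where train_test_split is imported from sklearn.model_selection. The names rng, model, order, x_sorted, p1_sorted and continuous_error are introduced here for illustration only. The sketch shows three things: predict_proba returns one column per class with values in [0, 1]; sorting the test points by x before plotting avoids the zigzag lines that unsorted x values produce; and a continuous target such as np.linspace(0, 1, 400) is rejected by the classifier.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same kind of data as in the question: random x, random 0/1 labels.
rng = np.random.RandomState(0)
x = rng.randint(2000, size=400).reshape(-1, 1)
y = rng.randint(2, size=400)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=0)
model = LogisticRegression().fit(x_train, y_train)

# predict_proba returns one column per class: shape (n_samples, 2).
# Column 0 is P(y = 0 | x), column 1 is P(y = 1 | x).
proba = model.predict_proba(x_test)

# Sort the test points by x so the line is drawn from left to right.
order = np.argsort(x_test.ravel())
x_sorted = x_test.ravel()[order]
p1_sorted = proba[order, 1]

fig, ax = plt.subplots()
ax.plot(x_sorted, p1_sorted, label='P(y = 1 | x)')
ax.set(xlabel='x', ylabel='probability', ylim=(0, 1))
ax.legend()
# Call plt.show() here to display the figure interactively.

# A continuous target is not a valid class label for a classifier.
try:
    LogisticRegression().fit(x, np.linspace(0, 1, 400))
    continuous_error = None
except ValueError as err:
    continuous_error = str(err)
```

Because the labels here are random and unrelated to x, one would expect the plotted probability to be a fairly flat curve rather than a pronounced S-shape. The probabilities are still valid values between 0 and 1; they simply carry little information when the target does not depend on the feature.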