I am trying to build a data frame that stores the coefficient value of each variable after each iteration. I am able to plot the graph after each iteration, but when I try to insert the values into the data frame after each iteration,
I get this error:
None of [Int64Index([ 3169,  3170,  3171,  3172,  3173,  3174,  3175,  3176,  3177,
             3178,
            ...
            31671, 31672, 31673, 31674, 31675, 31676, 31677, 31678, 31679,
            31680],
           dtype='int64', length=28512)] are in the [columns]
This is the code I use:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

# X (a DataFrame of features) and Y (the target) are loaded earlier.
kf = KFold(n_splits=10)
cvlasso = Lasso(alpha=0.001)
count = 1
var = pd.DataFrame()
for train, _ in kf.split(X, Y):
    # Fit on the training rows of this fold.
    cvlasso.fit(X.iloc[train, :], Y.iloc[train])
    importances_index_desc = cvlasso.coef_
    feature_labels = list(X.columns.values)
    importance = pd.Series(importances_index_desc, feature_labels)
    # Plot the coefficients for this fold.
    plt.figure()
    plt.bar(feature_labels, importances_index_desc)
    plt.xticks(feature_labels, rotation='vertical')
    plt.ylabel('Importance')
    plt.xlabel('Features')
    plt.title('Fold {}'.format(count))
    count = count + 1
    var[train] = importances_index_desc  # this line raises the error above
    plt.show()
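For reference, `train` is an array of row positions, so `var[train]` asks pandas for thousands of column labels at once, which matches the "are in the [columns]" wording of the error. A minimal sketch of one possible way to store the coefficients instead is to use a single label per fold. The synthetic `X` and `Y`, the feature names `a` to `d`, and the `fold_N` column names below are assumptions for illustration only, not the real data:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

# Synthetic stand-in data (assumption): 100 rows, 4 features.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 4)), columns=['a', 'b', 'c', 'd'])
Y = pd.Series(2.0 * X['a'] - 1.0 * X['c'] + rng.normal(scale=0.1, size=100))

kf = KFold(n_splits=10)
cvlasso = Lasso(alpha=0.001)

coefs = {}
for fold, (train, _) in enumerate(kf.split(X, Y), start=1):
    cvlasso.fit(X.iloc[train, :], Y.iloc[train])
    # Key by one label per fold, not by the array of training row positions.
    coefs['fold_{}'.format(fold)] = cvlasso.coef_.copy()

# Rows are features, columns are folds.
var = pd.DataFrame(coefs, index=X.columns)
print(var.shape)
```

Collecting the arrays in a dict and building the data frame once after the loop also avoids growing the frame column by column.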
One more thing: there are 33000 observations in total in my dataset, but at the end of the loop `train` has only 28512 values. Does anyone know why the `train` value is not 33000?
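As a hedged note on the sizes: with `KFold(n_splits=10)`, each fold holds out roughly one tenth of the rows for testing, so `train` is never the full dataset. The scikit-learn documentation states that the first `n_samples % n_splits` folds have `n_samples // n_splits + 1` test samples and the rest have `n_samples // n_splits`. The helper `kfold_train_sizes` below is a hypothetical name that only reproduces this arithmetic in plain Python. The row count 31681 is an inference from the indices 3169 to 31680 in the error message, not a confirmed fact about the data:

```python
def kfold_train_sizes(n_samples, n_splits):
    """Training-set size of each fold, following KFold's documented split rule."""
    base, extra = divmod(n_samples, n_splits)
    # The first `extra` folds get one additional test sample.
    test_sizes = [base + 1 if i < extra else base for i in range(n_splits)]
    return [n_samples - t for t in test_sizes]

# With 33000 rows, every fold would train on 29700 rows.
print(kfold_train_sizes(33000, 10)[0])

# With 31681 rows (inferred from the error message), the first fold trains on 28512 rows.
print(kfold_train_sizes(31681, 10)[0])
```

If this inference holds, `X` may contain fewer than 33000 rows by the time it reaches the loop, for example after dropping rows earlier in the pipeline; checking `len(X)` would confirm it.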