How to update values in pandas dataframe in a for loop?

Question

I am trying to build a DataFrame that stores the coefficient of each variable after every iteration. I can plot the graph after each iteration, but when I try to insert the values into the DataFrame after each iteration, I get an error.

This is the error:

None of [Int64Index([ 3169,  3170,  3171,  3172,  3173,  3174,  3175,  3176,  3177,
             3178,
            ...
            31671, 31672, 31673, 31674, 31675, 31676, 31677, 31678, 31679,
            31680],
           dtype='int64', length=28512)] are in the [columns]

This is the code I use:

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

kf = KFold(n_splits=10)
cvlasso = Lasso(alpha=0.001)
count = 1

var = pd.DataFrame()

for train, _ in kf.split(X, Y):
    cvlasso.fit(X.iloc[train, :], Y.iloc[train])
    importances_index_desc = cvlasso.coef_
    feature_labels = list(X.columns.values)
    importance = pd.Series(importances_index_desc, feature_labels)
    plt.figure()
    plt.bar(feature_labels, importances_index_desc)
    plt.xticks(feature_labels, rotation='vertical')
    plt.ylabel('Importance')
    plt.xlabel('Features')
    plt.title('Fold {}'.format(count))
    count = count + 1
    var[train] = importances_index_desc  # this line raises the error shown above

plt.show()

One more thing: my dataset has 33000 observations in total, but at the end of the loop train holds only 28512 values. Does anyone know why train does not hold 33000 values?

loginmind · Accepted Answer · 2020-02-09 06:03:50Z

1

train is the array of row indices of the training data returned by KFold. Using train as a column key in var[train] causes the error because none of those index values is a column of the DataFrame.

IMO, using a complicated value as the index is not a good idea. Just use a simple value as the index, for example:

var.loc[count] = importances_index_desc
count += 1
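
A note on the snippet above: when var is created as an empty pd.DataFrame() with no columns, assigning a whole coefficient array to var.loc[count] can fail with "cannot set a row with mismatched columns" (the error reported in the comments further down). Below is a minimal sketch that declares one column per feature first. The synthetic X and Y, the feature names f1, f2, f3 and the 100-row size are assumptions made only so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

# Synthetic stand-in for the asker's X and Y (assumption: 100 rows, 3 features).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
Y = X["f1"] * 2.0 - X["f3"] + rng.normal(scale=0.1, size=100)

kf = KFold(n_splits=10)
cvlasso = Lasso(alpha=0.001)

# One column per feature, so each row can hold one fold's coefficients.
var = pd.DataFrame(columns=X.columns, dtype=float)

for count, (train, _) in enumerate(kf.split(X, Y), start=1):
    cvlasso.fit(X.iloc[train, :], Y.iloc[train])
    # Row label is the fold number; values are that fold's coefficients.
    var.loc[count] = cvlasso.coef_

print(var.shape)  # (10, 3)
```

Each row of var is then one fold and each column one feature, so var.mean() or var.std() summarizes how stable a coefficient is across folds.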

edited Feb 9, 2020 at 6:03

answered Feb 9, 2020 at 5:49

loginmind

Ramsha Siddiqui · Accepted Answer · 2020-02-09 06:51:45Z

0

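Another solution could be to append each fold's coefficients as a new row. This answer originally used pandas.DataFrame.append(pandas.DataFrame), which was deprecated in pandas 1.4 and removed in pandas 2.0; pandas.concat does the same job:

importances_df = pd.DataFrame([importances_index_desc], columns=feature_labels)
var = pd.concat([var, importances_df], ignore_index=True)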

Let me know if this helps!
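
Growing a DataFrame inside the loop copies it on every iteration. A common alternative, sketched here under stated assumptions, is to collect each fold's coefficient array in a plain list and build the DataFrame once after the loop. The feature names and the random arrays standing in for cvlasso.coef_ are hypothetical and exist only to keep the example self-contained.

```python
import numpy as np
import pandas as pd

feature_labels = ["f1", "f2", "f3"]  # hypothetical feature names
rng = np.random.default_rng(0)

rows = []
for fold in range(1, 11):
    # Stand-in for cvlasso.coef_ after fitting on this fold (assumption).
    importances_index_desc = rng.normal(size=len(feature_labels))
    rows.append(importances_index_desc)

# Build the frame once: one row per fold, one column per feature.
var = pd.DataFrame(rows, columns=feature_labels, index=range(1, 11))
var.index.name = "fold"

print(var.shape)  # (10, 3)
```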

answered Feb 9, 2020 at 6:51

Ramsha Siddiqui


4 Comments

Warzone Over a year ago

One more thing: my dataset has 33000 observations in total, but at the end of the loop train holds only 28512 values. Does anyone know why train does not hold 33000 values?

Ramsha Siddiqui Over a year ago

It's possible that duplicate rows are removed on .append(); try .append(ignore_index=True).

Warzone Over a year ago

Yes, but the train values start from 0, right, and run all the way up to the number of observations in the dataset, i.e. 31681. My question is why it is 28512.

Ramsha Siddiqui Over a year ago

From the error cited above in your question, it looks like it starts from 3169. It should start from zero. I think kf.split() gives a single value, like for train in kf.split(X, Y).
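
On the 28512 question raised in this thread: KFold.split yields one (train, test) pair of positional index arrays per fold, and with n_splits=10 each train array covers roughly nine tenths of the rows, with the remaining tenth held out as test. The sketch below assumes the dataset has 31681 rows (labels 0 to 31680, inferred from the last label in the error message and from the comment above); that row count is an assumption, not something the question states directly.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 31681  # assumption: row labels 0..31680, as in the error message
X_demo = np.zeros((n_samples, 1))

kf = KFold(n_splits=10)
fold_info = []
for fold, (train, test) in enumerate(kf.split(X_demo), start=1):
    # Record fold number, train size, test size and the first train index.
    fold_info.append((fold, len(train), len(test), int(train[0])))
    print(fold_info[-1])
```

Under that assumption the first fold holds out rows 0 to 3168 for testing, so its train array runs from 3169 to 31680 and has 28512 entries, which matches the Int64Index in the error message. Every fold's train and test sizes add up to the full row count.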

hiranyajaya · Accepted Answer · 2020-02-10 00:31:42Z

0

Try the following.

Instead of

var = pd.DataFrame()

create a DataFrame with a column heading:

var = pd.DataFrame(columns=['impt_idx_desc'])

Then, in the loop, use the .loc indexer:

var.loc[count] = [importances_index_desc]

where count is incremented by 1 on each iteration.

edited Feb 10, 2020 at 0:31

answered Feb 9, 2020 at 5:44

hiranyajaya


4 Comments

loginmind Over a year ago

Hi, using train as the access index here will cause an error. You can check this for more detail: pandas.pydata.org/pandas-docs/stable/reference/api/…

hiranyajaya Over a year ago

Sorry, I overlooked the data type returned. Editing the answer now. Thanks!

Warzone Over a year ago

I am still getting an error: cannot set a row with mismatched columns

hiranyajaya Over a year ago

Try var.loc[count] = [importances_index_desc]
