I have a pandas DataFrame that records the monthly sales of items in shops. df.head():

    ID      month   sold
0   150983  0       1.0
1   56520   0       13.0
2   56520   1       7.0
3   56520   2       13.0
4   56520   3       8.0

I want to remove all IDs that had no sales in the last month, i.e. month == 33 and sold == 0. Doing the following

unwanted_df = df[((df['month'] == 33) & (df['sold'] == 0.0))]

I get just 46 rows, which is far too few. Never mind, though: I would like to have the data in a different format anyway. A pivoted version of the table above is just what I want:

pivoted_df = df.pivot(index='month', columns = 'ID', values = 'sold').fillna(0)
pivoted_df.head()

ID  0   2   3   5   6   7   8   10  11  12  ... 214182  214185  214187  214190  214191  214192  214193  214195  214197  214199
month                                                                                   
0   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     1.0     0.0
1   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
2   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 ... 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
3   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
4   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 ... 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0

Question: how do I remove the columns of pivoted_df whose value in the last row is 0?

2 Answers

You can do this in one line:

pivoted_df = pivoted_df.drop(pivoted_df.columns[pivoted_df.iloc[-1, :] == 0], axis=1)
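
An equivalent way to express this is to keep the columns whose last-row value is non-zero, using a Boolean mask with .loc. The following is a minimal sketch on a small made-up frame (the data and the names last_row and kept are hypothetical, not from the question). It relies on the last row of the pivot being the last month, which holds here because the pivoted index is sorted and that month is present.

```python
import pandas as pd

# Hypothetical long-format data: month 2 plays the role of "last month".
# ID 1 sold 3 items in month 2, ID 2 sold 0, ID 3 has no row for month 2.
df = pd.DataFrame({
    "ID":    [1, 1, 2, 2, 3],
    "month": [0, 2, 1, 2, 0],
    "sold":  [5.0, 3.0, 4.0, 0.0, 2.0],
})

pivoted_df = df.pivot(index="month", columns="ID", values="sold").fillna(0)

# Boolean Series indexed by ID: True where the last row is non-zero
last_row = pivoted_df.iloc[-1]
kept = pivoted_df.loc[:, last_row != 0]

print(kept.columns.tolist())
```

Because fillna(0) turns missing months into zeros, IDs with no record for the last month are dropped along with those that have an explicit zero.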

"I want to remove all IDs where there were no sales last month"

You can first calculate the IDs satisfying your condition:

id_selected = df.loc[(df['month'] == 33) & (df['sold'] == 0), 'ID']

Then filter these from your dataframe via a Boolean mask:

df = df[~df['ID'].isin(id_selected)]

Finally, use pd.pivot_table with your filtered dataframe.
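
Put together, the three steps might look like the sketch below. The sample data is made up, and taking the last month as df['month'].max() is an assumption for illustration (in the question it is 33); the names last_month, filtered and pivoted are likewise hypothetical.

```python
import pandas as pd

# Hypothetical long-format data: month 2 plays the role of "last month".
df = pd.DataFrame({
    "ID":    [1, 1, 2, 2, 3],
    "month": [0, 2, 1, 2, 0],
    "sold":  [5.0, 3.0, 4.0, 0.0, 2.0],
})

last_month = df["month"].max()  # assumption: the latest month in the data

# Step 1: IDs with an explicit zero-sales row in the last month
id_selected = df.loc[(df["month"] == last_month) & (df["sold"] == 0), "ID"]

# Step 2: drop every row belonging to those IDs
filtered = df[~df["ID"].isin(id_selected)]

# Step 3: pivot the filtered frame, filling missing months with 0
pivoted = pd.pivot_table(
    filtered, index="month", columns="ID", values="sold", fill_value=0
)

print(pivoted.columns.tolist())
```

Note that this filter only removes IDs that have an explicit row with sold == 0 in the last month. An ID with no row at all for that month (ID 3 in the sample) is kept.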

1 Comment

I also assume that if there is no sales data for an ID last month, it means that there weren't any sales. Thus it is most natural to work on the pivoted dataframe.
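
Under that assumption (a missing record means zero sales), the same result can also be reached on the long-format frame by keeping only the IDs that have a positive sale in the last month. This is a sketch on made-up data; last_month, ids_with_sales and kept_df are hypothetical names, and using df['month'].max() as the last month is an assumption.

```python
import pandas as pd

# Hypothetical long-format data: month 2 plays the role of "last month".
df = pd.DataFrame({
    "ID":    [1, 1, 2, 2, 3],
    "month": [0, 2, 1, 2, 0],
    "sold":  [5.0, 3.0, 4.0, 0.0, 2.0],
})

last_month = df["month"].max()  # assumption: the latest month in the data

# IDs with at least one positive sale recorded in the last month
ids_with_sales = df.loc[(df["month"] == last_month) & (df["sold"] > 0), "ID"]

# Keep only those IDs; explicit zeros and missing records are both dropped
kept_df = df[df["ID"].isin(ids_with_sales)]

print(kept_df["ID"].unique().tolist())
```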
