pandas drop last group element

Question

I have the following DataFrame:

df = pd.DataFrame({'col1': ["a","b","c","d","e","f","g","h"], 'col2': [1,1,1,2,2,3,3,3]})

Input:

  col1  col2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     3
6    g     3
7    h     3

I want to drop the last row of each group when grouping by "col2", which would look like...

Expected Output:

  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3

I wrote df.groupby('col2').tail(1), which gets me the rows I want to delete, but when I try df.drop(df.groupby('col2').tail(1)) I get an axis error. What would be a solution to this?

Quang Hoang · Accepted Answer · 2020-08-31 19:15:07Z

2

Looks like duplicated would work:

df[df.duplicated('col2', keep='last') | 
   (~df.duplicated('col2', keep=False))  # this is to keep all single-row groups
  ]
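To see what each half of that mask does, here is a minimal runnable sketch. It assumes the question's data plus one extra single-row group (the row "i" with col2 = 4 is added purely for illustration and is not in the original question); the names not_last, singleton and kept are likewise just for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
    # group 4 has a single row; it is an assumption added for illustration
    "col2": [1, 1, 1, 2, 2, 3, 3, 3, 4],
})

# True for every row that is followed by a later row with the same col2,
# i.e. every row except the last one of its group
not_last = df.duplicated("col2", keep="last")

# True for rows whose col2 value occurs exactly once
singleton = ~df.duplicated("col2", keep=False)

# Drop the last row of each multi-row group, but keep single-row groups
kept = df[not_last | singleton]
print(kept)
```

Without the second condition, the single-row group would be dropped as well, because its only row is also its last row.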

Or, with your approach, pass the index of those rows to drop, since drop expects index labels rather than a DataFrame:

# this would also drop all single-row groups
df.drop(df.groupby('col2').tail(1).index)

Output:

  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3
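The drop-based version can be checked end to end with the sketch below. It uses the question's data, and then repeats the operation on a copy with one extra single-row group (the row "i" with col2 = 4 is an assumption added for illustration) to show the behaviour noted in the comment above. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "col2": [1, 1, 1, 2, 2, 3, 3, 3],
})

# tail(1) returns the last row of each group and keeps the original index labels
last_rows = df.groupby("col2").tail(1)

# drop works on labels, so pass the index of those rows, not the rows themselves
result = df.drop(last_rows.index)
print(result)

# Same operation with an extra single-row group (illustrative assumption)
extra = pd.DataFrame({"col1": ["i"], "col2": [4]})
df2 = pd.concat([df, extra], ignore_index=True)
result2 = df2.drop(df2.groupby("col2").tail(1).index)
print(result2)  # the row "i" is gone: a group's only row is also its last row
```

Because drop matches on labels, this relies on the index being unique; with repeated index labels, every row sharing a dropped label would be removed.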

Reza · Accepted Answer · 2020-08-31 19:15:53Z

1

Try this, which keeps all but the last row of each group and then resets the index:

df.groupby('col2', as_index=False).apply(lambda x: x.iloc[:-1,:]).reset_index(drop=True)
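As an alternative sketch that gets the same rows without apply (this is a different technique, not the one in the answer above), cumcount can number the rows of each group from the end, so the last row of every group is numbered 0. The names from_end and result are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "col2": [1, 1, 1, 2, 2, 3, 3, 3],
})

# Number rows within each group counting down to 0, so the last row
# of every group gets 0
from_end = df.groupby("col2").cumcount(ascending=False)

# Keep everything except the last row per group, then renumber the index
result = df[from_end > 0].reset_index(drop=True)
print(result)
```

Like the iloc[:-1] version, this removes single-row groups entirely, since their only row is numbered 0.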
