I have a DataFrame, df = pd.DataFrame({'col1': ["a","b","c","d","e", "f","g","h"], 'col2': [1,1,1,2,2,3,3,3]}), that looks like this:

Input:

 col1 col2
0   a   1
1   b   1
2   c   1
3   d   2
4   e   2
5   f   3
6   g   3
7   h   3

I want to drop the last row of each group, grouping by "col2", which would look like this:

Expected Output:

 col1 col2
0   a   1
1   b   1
3   d   2
5   f   3
6   g   3

I wrote df.groupby('col2').tail(1), which gets me the rows I want to delete, but when I try df.drop(df.groupby('col2').tail(1)) I get an axis error. What would be a solution to this?

2 Answers

2

It looks like duplicated would work:

df[df.duplicated('col2', keep='last') | 
   (~df.duplicated('col2', keep=False))  # this is to keep all single-row groups
  ]

Or, with your approach, pass the index of those rows to drop rather than the rows themselves:

# this would also drop all single-row groups
df.drop(df.groupby('col2').tail(1).index)

Output:

  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3
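
For reference, here is a self-contained sketch of both approaches above, run against the DataFrame from the question. The variable names (last_rows, dropped, not_last, singleton, masked) are only illustrative and do not come from the original posts. It assumes the index labels are unique, since drop removes rows by label.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "col2": [1, 1, 1, 2, 2, 3, 3, 3],
})

# Rows to delete: the last row of every col2 group.
last_rows = df.groupby("col2").tail(1)

# drop() works on index labels, so pass .index rather than the DataFrame itself.
dropped = df.drop(last_rows.index)

# Boolean-mask alternative: keep every row that is not the last occurrence
# of its col2 value, plus rows whose col2 value occurs only once.
not_last = df.duplicated("col2", keep="last")
singleton = ~df.duplicated("col2", keep=False)
masked = df[not_last | singleton]

print(dropped)
```

On this data the two results are identical because no group has a single row. They differ only when such a group exists: the drop version removes it, while the mask version keeps it.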

1

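Try this (note that reset_index(drop=True) renumbers the rows 0 through 4, so the original index labels are not preserved):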

df.groupby('col2', as_index=False).apply(lambda x: x.iloc[:-1,:]).reset_index(drop=True)
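
If the original index labels need to survive, one possible alternative is a mask built with GroupBy.cumcount. This is not the method from the answer above, just a sketch of a different technique, and the names from_end and result are made up for illustration. Like the drop approach, it removes single-row groups entirely.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "col2": [1, 1, 1, 2, 2, 3, 3, 3],
})

# Number the rows of each group counting from the end:
# the last row of every group gets 0, the one before it 1, and so on.
from_end = df.groupby("col2").cumcount(ascending=False)

# Keep everything except each group's last row; index labels are unchanged.
result = df[from_end > 0]

print(result)
```

Because this is plain boolean indexing, it avoids calling a Python function once per group, which may matter on large frames.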
