I have a DataFrame df with two columns, name and score. For each name, I'm trying to keep only the rows from the point where score > 1 first occurs onward, and to drop names that never reach it.

    df

          name  score
    0    bruno      0
    1    bruno      0
    2    bruno     15
    3    bruno      0
    4     paul      0
    5     paul      0
    6     paul     25
    7     paul      0
    8     paul     10
    9   marcus      5
    10   mason      0

final df

         name  score
    2   bruno     15
    3   bruno      0
    6    paul     25
    7    paul      0
    8    paul     10
    9  marcus      5

1 Answer

    # Keep rows once the running total of score within each name becomes positive
    x = df[df.groupby("name")["score"].cumsum().gt(0)]
    print(x)

Prints:

         name  score
    2   bruno     15
    3   bruno      0
    6    paul     25
    7    paul      0
    8    paul     10
    9  marcus      5
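
For anyone who wants the one-liner unpacked, here is a minimal sketch of the intermediate steps. The sample frame is rebuilt from the question, and the names running and mask are illustrative only. It assumes scores are never negative; with negative values the running total could drop back to zero or below after a group has already started.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "name": ["bruno"] * 4 + ["paul"] * 5 + ["marcus", "mason"],
    "score": [0, 0, 15, 0, 0, 0, 25, 0, 10, 5, 0],
})

# Running total of score, restarted for each name.
running = df.groupby("name")["score"].cumsum()

# True from the first non-zero score onward (assumes non-negative scores).
mask = running.gt(0)

# Boolean indexing keeps only the rows where the mask is True.
x = df[mask]

# Show the intermediate columns next to the original data.
print(df.assign(running=running, keep=mask))
```

Leading zeros in a group leave the running total at 0, so those rows are filtered out. A group whose scores are all 0, such as mason, disappears entirely.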

6 Comments

Very lean solution. I guess a little explanation would be great. Cheers!
Question: what happens if the first entry for a name is 0? Isn't that entry erased?
@FloLie It's doing cumsum() within groups, so if the first entry in a group is 0, that entry is dropped, along with every following row up to the first non-zero one.
@FloLie I think yes, based on "I'm trying to keep only the rows for each user where score > 1 starts"
I should have read more closely. You are right.
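
A possible variant, if the "score > 1" wording in the question is meant literally: apply the same cumsum idea to a boolean flag instead of to the raw scores. This is only a sketch under that assumption. The threshold variable and the small frame below are hypothetical and were chosen so that the two approaches differ (a score of exactly 1, and a negative score).

```python
import pandas as pd

# Hypothetical data, not from the question.
df = pd.DataFrame({
    "name": ["a", "a", "a", "b", "b", "b"],
    "score": [0, 1, 3, -2, 2, 5],
})

threshold = 1  # assumed reading of "score > 1"

# 1 where the score exceeds the threshold, 0 elsewhere.
flag = df["score"].gt(threshold).astype(int)

# Count flagged rows seen so far within each name; positive once a group has started.
started = flag.groupby(df["name"]).cumsum().gt(0)

y = df[started]
print(y)
```

Because the running count of flags never decreases, a later zero or negative score cannot switch a group off again.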