Given a DataFrame with two columns, User and Code, how can I filter out the entries for a user that don't have at least x rows with a given Code?

E.g. I'd like to filter out a user's rows of a given type when that user doesn't have at least 5 occurrences of that type:

User    Type
A       Alpha
A       Alpha
A       Alpha
A       Alpha
A       Alpha
A       Beta
A       Beta
A       Beta
B       Alpha
B       Alpha
B       Alpha
B       Alpha
B       Alpha

Here I would like to filter out (remove) the 3 rows for A with the Beta code (it occurs only 3 times here), while keeping everything else.

Thanks!

2 Answers

You can groupby on 'User' and 'Type' and filter:

In [91]:
df.groupby(['User', 'Type']).filter(lambda x: len(x) > 4)

Out[91]:
   User   Type
0     A  Alpha
1     A  Alpha
2     A  Alpha
3     A  Alpha
4     A  Alpha
8     B  Alpha
9     B  Alpha
10    B  Alpha
11    B  Alpha
12    B  Alpha
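
For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question, and `min_count` is a name introduced here purely for illustration. It also shows an alternative built on `transform("size")`, which yields the same rows through a boolean mask; this is not part of the original answer, just another common way to express the same filter.

```python
import pandas as pd

# Sample data rebuilt from the question: A has 5 Alpha and 3 Beta, B has 5 Alpha.
df = pd.DataFrame({
    "User": ["A"] * 8 + ["B"] * 5,
    "Type": ["Alpha"] * 5 + ["Beta"] * 3 + ["Alpha"] * 5,
})

min_count = 5  # illustrative threshold ("at least 5 occurrences")

# Approach from the answer: keep only (User, Type) groups with enough rows.
kept = df.groupby(["User", "Type"]).filter(lambda g: len(g) >= min_count)

# Alternative: broadcast each group's size back onto its rows and mask.
group_sizes = df.groupby(["User", "Type"])["User"].transform("size")
kept_via_mask = df[group_sizes >= min_count]

print(kept)
```

Both forms keep the original row index, so the three A/Beta rows (index 5 to 7) simply disappear from the result.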
2 Comments

This is perfect indeed! Thanks so much! I tried altering the lambda function so that it does one more step: filtering out all the records where the sum of another field is smaller than 10. arr.groupby(['User','TYPE', 'APPDOMAIN']).filter(lambda x: sum(arr.Closed) > 10) didn't quite do it though. Any last advice? Thanks again so much!
The lambda is passed the group as x, so it should be (...).filter(lambda x: sum(x.Closed) > 10) instead of ... arr.Closed ...
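
To make the point in this comment concrete, here is a small sketch. The column names come from the comment above, but the values in `arr` are invented for illustration. Summing `arr.Closed` inside the lambda gives the same total for every group, so either all groups pass or none do; summing the group's own `Closed` column is what makes the condition per-group.

```python
import pandas as pd

# Hypothetical data: only the column names are taken from the comment.
arr = pd.DataFrame({
    "User": ["A", "A", "A", "B", "B"],
    "TYPE": ["Alpha"] * 5,
    "APPDOMAIN": ["x"] * 5,
    "Closed": [5, 4, 3, 2, 1],
})

keys = ["User", "TYPE", "APPDOMAIN"]

# Buggy form: arr.Closed is the whole column (total 15), so every group passes.
buggy = arr.groupby(keys).filter(lambda g: sum(arr.Closed) > 10)

# Fixed form: g is the sub-DataFrame for one group, so the sum is per group.
fixed = arr.groupby(keys).filter(lambda g: g["Closed"].sum() > 10)

print(len(buggy), len(fixed))
```

With this data, user A's group sums to 12 and is kept, while user B's group sums to 3 and is dropped by the fixed version.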
how can I filter out the user entries where they don't have at least x entries with a given Code?

If you want to know which ones were kept or removed:

# flag each (User, Type) group: True if it has at least 5 rows
grouped = df.groupby(['User', 'Type']).apply(lambda g: len(g) > 4)
grouped = grouped.reset_index(name='keep')
# merge the flag back onto the rows, then split into kept and removed
merged = df.merge(grouped)
data = merged.query('keep == True')
removed = merged.query('keep == False')
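
As a runnable variation on the same idea, the sketch below uses `size()` instead of `apply` to count rows per group, then merges the flag back. The sample data is rebuilt from the question, and the names `counts`, `n`, and `flagged` are introduced here for illustration only.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "User": ["A"] * 8 + ["B"] * 5,
    "Type": ["Alpha"] * 5 + ["Beta"] * 3 + ["Alpha"] * 5,
})

# One row per (User, Type) pair with its row count and a keep flag.
counts = df.groupby(["User", "Type"]).size().reset_index(name="n")
counts["keep"] = counts["n"] > 4

# Attach the flag to every original row, then split on it.
flagged = df.merge(counts, on=["User", "Type"], how="left")
data = flagged[flagged["keep"]]
removed = flagged[~flagged["keep"]]

print(removed)
```

Keeping the count column `n` next to the flag makes it easy to see why a given row ended up in `removed`.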
