I want to apply a custom function to a DataFrame, e.g. this DataFrame:

   index City  Age
0      1    A   50
1      2    A   24
2      3    B   65
3      4    A   40
4      5    B   68
5      6    B   48

Function to apply

def count_people_above_60(age):
    # I don't know whether age can be passed as a Series or a list
    # so that I can perform an operation on it later
    ...
    return count_people_above_60

I am expecting to do something like:

df.groupby(['City']).agg({"Age": ["mean", count_people_above_60]})

Expected output:

City   Mean  People_Above_60
A     38.00                0
B     60.33                2

1 Answer


If performance is important, create a new column holding the comparison result converted to integers, so that the count becomes a sum aggregation:

# Store the result under a new name so the original df stays available
result = (df.assign(new=df['Age'].gt(60).astype(int))
            .groupby(['City'])
            .agg(Mean=("Age", "mean"), People_Above_60=('new', "sum")))
print(result)
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
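
The same count can probably be obtained without a helper column by grouping the boolean mask directly. The following is a minimal sketch, assuming the sample data from the question; the names `above_60` and `result` are made up for illustration:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

# Group the boolean mask by city; True counts as 1 when summed,
# so each group's sum is the number of people above 60.
above_60 = df["Age"].gt(60).groupby(df["City"]).sum()

# Combine with the per-city mean; both pieces share the City index.
result = pd.DataFrame({
    "Mean": df.groupby("City")["Age"].mean(),
    "People_Above_60": above_60,
})
print(result)
```

This still relies only on built-in aggregations, so it should avoid the per-group Python call of a custom function.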

Your solution needs to compare the values and sum the result, but it is slow if there are many groups or the DataFrame is large:

def count_people_above_60(age):
    return (age > 60).sum()

# Store the result under a new name so the original df stays available
result = (df.groupby(['City']).agg(Mean=("Age", "mean"),
                                   People_Above_60=('Age', count_people_above_60)))
print(result)
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
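
On the question of whether `age` arrives as a Series or a list: with this kind of aggregation the function is called with each group's `Age` values as a pandas Series, which is why `(age > 60).sum()` works. Here is a small sketch that records the argument type to check this, assuming the sample data from the question; `seen_types` is a hypothetical helper list, not part of pandas:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

seen_types = []  # hypothetical: records what the function receives

def count_people_above_60(age):
    # Remember the type of the argument for each call.
    seen_types.append(type(age).__name__)
    # Comparing a Series gives a boolean Series; summing counts the True values.
    return (age > 60).sum()

result = df.groupby("City").agg(
    Mean=("Age", "mean"),
    People_Above_60=("Age", count_people_above_60),
)
print(result)
print(set(seen_types))
```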