Here is my DataFrame:

df = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate':0}, 
      {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate':0}]

I would like to group by id and name so that apple = apple.sum(), grape = grape.sum(), and rate = apple / grape * 100:

   id   name  apple  grape   rate
0   1    bob     90     30    300
1   2  smith     15     70   21.4

I have attempted this with the following:

df = pd.DataFrame(df)
def cal_rate(rate):
    return df['apple'] / df['grape']*100
agg_funcs = {'apple':'sum',
             'grape':'sum',
             'rate' : cal_rate}
df = df.groupby(['id', 'name']).agg(agg_funcs).reset_index()

But got this result:

   id   name  apple  grape   rate
0   1    bob     90     30    105
1   2  smith     15     70    105

Can you help me out? Thanks in advance.

3 Answers

1

Here you go:

import pandas as pd

df = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate':0},
      {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate':0}]
df = pd.DataFrame(df)


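def cal_rate(group):
    # agg passes the 'rate' values of one (id, name) group; their index
    # is used to look up the matching rows of the original frame
    frame = df.loc[group.index]
    # rounded to one decimal so the result matches the output shown
    return round(frame['apple'].sum() / frame['grape'].sum() * 100, 1)


agg_funcs = {'apple': 'sum',
             'grape': 'sum',
             'rate': cal_rate}
df = df.groupby(['id', 'name']).agg(agg_funcs).reset_index()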
print(df)

Output

   id   name  apple  grape   rate
0   1    bob     90     30  300.0
1   2  smith     15     70   21.4

2 Comments

But why did you pass group as a parameter to the cal_rate function? Can I do this in one line, something like 'apple': 'sum', 'grape': 'sum', 'rate': frame['apple'].sum() / frame['grape'].sum() * 100? @balaji
@ahmad The other answers show you that. I was showing how to use a custom function in an agg context.
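
On the question of why cal_rate takes a parameter: it may help to record what agg actually hands to a custom function. The sketch below assumes the same sample data; show_group and calls are hypothetical names used only for illustration. For a dict entry like 'rate': func, pandas should call func with the 'rate' values of one group at a time, as a Series that keeps the original row labels, which is why df.loc[group.index] can find the matching apple and grape rows.

```python
import pandas as pd

rows = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate': 0},
        {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate': 0}]
df = pd.DataFrame(rows)

calls = []  # hypothetical log of what agg passes in


def show_group(group):
    # Record the type and the row labels of the object agg passes in
    calls.append((type(group).__name__, list(group.index)))
    return len(group)  # any scalar will do for this demonstration


df.groupby(['id', 'name']).agg({'rate': show_group})
for kind, labels in calls:
    print(kind, labels)
```

Because the function only ever sees one column, an expression that mixes apple and grape cannot be written directly as a dict value; it has to go through the original frame as in the answer, or be computed after the aggregation.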
1

You can also do it this way:

df = df.groupby(['id', 'name']).agg({'apple':'sum', 'grape':'sum'}).reset_index()
df['rate'] = (df['apple'] / df['grape']) *100
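
If a single expression is preferred, the same two steps can probably be chained with assign. This is only a sketch on the sample data from the question; rows and result are names chosen here for illustration.

```python
import pandas as pd

rows = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate': 0},
        {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate': 0}]
df = pd.DataFrame(rows)

result = (
    df.groupby(['id', 'name'])
      .agg({'apple': 'sum', 'grape': 'sum'})
      # rate is derived from the summed columns, after the aggregation
      .assign(rate=lambda d: d['apple'] / d['grape'] * 100)
      .reset_index()
)
print(result)
```

The lambda receives the already aggregated frame, so the division runs once per group on the group totals rather than row by row.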

1 Comment

You could simply use df.groupby(['id', 'name'], as_index=False).sum() instead of agg.
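
A minimal sketch of that suggestion, assuming the sample data from the question (rows and out are illustrative names). Note that the placeholder rate column is summed along with the others, which is harmless here because it is all zeros and gets overwritten on the next line.

```python
import pandas as pd

rows = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate': 0},
        {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate': 0}]
df = pd.DataFrame(rows)

# as_index=False keeps id and name as ordinary columns, so no reset_index is needed
out = df.groupby(['id', 'name'], as_index=False).sum()

# Overwrite the summed placeholder with the rate computed from the group totals
out['rate'] = out['apple'] / out['grape'] * 100
print(out)
```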
1

Just another way to do this:

import pandas as pd
df = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate':0},
      {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate':0},
      {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate':0}]
df = pd.DataFrame(df)
df = df.groupby(['id', 'name']).sum().reset_index()
df['rate'] = round(df['apple'] / df['grape'] * 100, 1)
print(df)

Output

   id   name  apple  grape   rate
0   1    bob     90     30  300.0
1   2  smith     15     70   21.4

2 Comments

Why sum the id column?
Did you forget to reset_index()?
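
On both points, a small sketch may help, again assuming the question's sample data (rows, sums and flat are illustrative names). Columns used as grouping keys are not summed; they become the index of the result, and reset_index turns them back into ordinary columns. Selecting the value columns explicitly makes it clear which ones are being summed.

```python
import pandas as pd

rows = [{'id': 1, 'name': 'bob', 'apple': 45, 'grape': 10, 'rate': 0},
        {'id': 1, 'name': 'bob', 'apple': 45, 'grape': 20, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 5, 'grape': 30, 'rate': 0},
        {'id': 2, 'name': 'smith', 'apple': 10, 'grape': 40, 'rate': 0}]
df = pd.DataFrame(rows)

# Sum only the value columns; id and name act as grouping keys
sums = df.groupby(['id', 'name'])[['apple', 'grape']].sum()
print(list(sums.index.names))  # the grouping keys live in the index here

# reset_index moves id and name back into regular columns
flat = sums.reset_index()
flat['rate'] = round(flat['apple'] / flat['grape'] * 100, 1)
print(flat)
```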
