I need to apply different mathematical operations to different variables in a dataframe. My data looks like this:

 y    x1  x2 x3
 NB    1   4   2
 SK    2   5   3
 SK    3   6   6
 NB    4   7   9

I want to group the data by the y variable and calculate sum(x1) and max(x2). I also need to apply a user-defined function to x3.

I want the grouped output as a pandas dataframe with only the 4 variables y, x1, x2, x3, as shown below.

 y    x1  x2 x3
 NB    5   7   5
 SK    5   6   5  

I tried some code and searched different websites, but I didn't find a solution that works.

Can anyone please help me tackle this?

Thanks in advance.

2 Answers

When you use .groupby, you can aggregate with .agg. Several built-in aggregations can be named as strings (such as 'sum' and 'max'), but you can also apply any user-defined function, for example through a lambda; the function receives the values of that column for each group:

from io import StringIO

import pandas as pd


data = StringIO('''y    x1  x2 x3
NB    1   4   2
SK    2   5   3
SK    3   6   6
NB    4   7   9''')


def func(values):
    return sum(values)/50

df = pd.read_csv(data, sep=r'\s+')

summaries = df.groupby('y').agg({'x1': 'sum',
                                 'x2': 'max',
                                 'x3': lambda vals: func(vals)})

print(summaries)

This prints:

    x1  x2    x3
y               
NB   5   7  0.22
SK   5   6  0.18
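
The question asks for y as an ordinary column rather than as the index. A minimal sketch of one way to get that, assuming the same data and the same func as above (both repeated here so the block runs on its own): calling .reset_index() on the aggregated result moves y back into the columns.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question.
data = StringIO('''y    x1  x2 x3
NB    1   4   2
SK    2   5   3
SK    3   6   6
NB    4   7   9''')
df = pd.read_csv(data, sep=r'\s+')


def func(values):
    # Placeholder user-defined aggregation taken from the answer above.
    return sum(values) / 50


# Aggregate each column differently, then turn the group key 'y'
# from the index back into a regular column.
result = (
    df.groupby('y')
      .agg({'x1': 'sum', 'x2': 'max', 'x3': func})
      .reset_index()
)

print(result)
```

The result then has exactly the four columns y, x1, x2, x3. Note that func can be passed directly; wrapping it in a lambda is not required.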

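df.groupby(df.index)['x1'].agg(lambda x: sum(x.values))

This groups by the index, so it assumes y has been set as the index; if y is a regular column, use df.groupby('y') instead. You can change the lambda for whichever operation you are performing on a given column.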
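
A small sketch of this per-column form, assuming y is a regular column as in the question's data, so the grouping key is 'y' rather than the index. The max-minus-min lambda on x3 is only an illustrative stand-in for a user-defined operation and is not from the question.

```python
import pandas as pd

# Sample data from the question, built directly as a DataFrame.
df = pd.DataFrame({
    'y': ['NB', 'SK', 'SK', 'NB'],
    'x1': [1, 2, 3, 4],
    'x2': [4, 5, 6, 7],
    'x3': [2, 3, 6, 9],
})

# Selecting a single column after groupby gives a Series indexed by group.
x1_sum = df.groupby('y')['x1'].agg(lambda x: sum(x.values))

# Swap in a different lambda for a different column (illustrative only).
x3_range = df.groupby('y')['x3'].agg(lambda x: x.max() - x.min())

print(x1_sum)
print(x3_range)
```

Each call returns one Series per column, so combining several columns this way means joining the results afterwards; passing a dict to .agg does it in a single step.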
