using nested groupby commands to transform column in pandas

Question

I have the following simple dataframe.

import pandas as pd

data = [['Cntrl', 0.844961, '201030'],
        ['Cntrl', 0.815331, '201030'],
        ['Exp', 1.530297, '201030'],
        ['Exp', 2.36416, '201030'],
        ['Cntrl', 1.867359, '201024'],
        ['Cntrl', 1.274295, '201024'],
        ['Exp', 0.7058, '201024'],
        ['Exp', 2.657393, '201024'],]

df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])
df

    Group   Value       Date
0   Cntrl   0.844961    201030
1   Cntrl   0.815331    201030
2   Exp     1.530297    201030
3   Exp     2.364160    201030
4   Cntrl   1.867359    201024
5   Cntrl   1.274295    201024
6   Exp     0.705800    201024
7   Exp     2.657393    201024

I would like to divide the values for each date by the minimum Cntrl value for that date, producing:


    Group   Value   Date
0   Cntrl   1.0363  201030
1   Cntrl   1.0000  201030
2   Exp     1.8769  201030
3   Exp     2.8996  201030
4   Cntrl   1.4654  201024
5   Cntrl   1.0000  201024
6   Exp     0.5538  201024
7   Exp     2.0853  201024

I gather that I can divide all values for each date by the minimum of all values for that date using:

df['Value'] = df.groupby('Date')['Value'].transform(lambda x: x / x.min())

However, this lumps both groups together instead of using only the Cntrl group to calculate the minimum. Is there a way to nest an additional groupby call inside the lambda function, or to set up the transform differently, to accomplish this?
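
To make the problem concrete, here is a small sketch comparing the per-date minimum over all rows with the per-date minimum over the Cntrl rows only. The names overall_min and cntrl_min are illustrative and not part of the original attempt.

```python
import pandas as pd

data = [['Cntrl', 0.844961, '201030'],
        ['Cntrl', 0.815331, '201030'],
        ['Exp', 1.530297, '201030'],
        ['Exp', 2.36416, '201030'],
        ['Cntrl', 1.867359, '201024'],
        ['Cntrl', 1.274295, '201024'],
        ['Exp', 0.7058, '201024'],
        ['Exp', 2.657393, '201024']]
df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])

# Minimum per date over every row, regardless of group
# (this is what x.min() sees inside the transform).
overall_min = df.groupby('Date')['Value'].min()

# Minimum per date over the Cntrl rows only (the divisor that is wanted).
cntrl_min = df[df['Group'] == 'Cntrl'].groupby('Date')['Value'].min()

# Dates on which the two minimums disagree.
differs = overall_min.ne(cntrl_min)
print(differs)
```

For this data the two minimums agree on 201030 but not on 201024, where the smallest value belongs to an Exp row.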

BENY · Accepted Answer · 2022-06-01 15:18:08Z

2

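You can do this with groupby followed by reindex: compute the minimum Cntrl value per date, then reindex that result by each row's Date so there is one divisor per row.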

# .values drops the Date index so the division is done by position
df['Value'] /= df[df['Group'].eq('Cntrl')].groupby('Date')['Value'].min().reindex(df['Date']).values
df
Out[172]: 
   Group     Value    Date
0  Cntrl  1.036341  201030
1  Cntrl  1.000000  201030
2    Exp  1.876903  201030
3    Exp  2.899632  201030
4  Cntrl  1.465406  201024
5  Cntrl  1.000000  201024
6    Exp  0.553875  201024
7    Exp  2.085383  201024
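
The one-liner above can be unpacked into separate steps. The following is only a sketch of the same idea. The intermediate names cntrl_min, per_row_min and result are made up for illustration, and it uses assign instead of modifying df in place.

```python
import pandas as pd

data = [['Cntrl', 0.844961, '201030'],
        ['Cntrl', 0.815331, '201030'],
        ['Exp', 1.530297, '201030'],
        ['Exp', 2.36416, '201030'],
        ['Cntrl', 1.867359, '201024'],
        ['Cntrl', 1.274295, '201024'],
        ['Exp', 0.7058, '201024'],
        ['Exp', 2.657393, '201024']]
df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])

# Step 1: keep only the Cntrl rows and take the minimum per date.
cntrl_min = df[df['Group'].eq('Cntrl')].groupby('Date')['Value'].min()

# Step 2: look up that minimum for every row's date. The result has one
# entry per row of df, but it is indexed by Date, not by df's own index.
per_row_min = cntrl_min.reindex(df['Date'])

# Step 3: convert to a plain array so the division is positional and
# pandas does not try to align the Date index with df's integer index.
result = df.assign(Value=df['Value'] / per_row_min.to_numpy())
print(result)
```

Step 3 matters: dividing by per_row_min directly would make pandas align on index labels, and the Date labels do not match df's integer index.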

sammywemmy · Accepted Answer · 2022-06-06 04:30:20Z

1

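One option is to group by both Group and Date, take the minimum, select the Cntrl level, and then map those per-date minimums onto the Date column to compute the new Value column. Note that assign returns a new DataFrame rather than modifying df in place: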

out = df.groupby(['Group', 'Date']).Value.min().loc['Cntrl']
df.assign(Value = df.Value/df.Date.map(out))

   Group     Value    Date
0  Cntrl  1.036341  201030
1  Cntrl  1.000000  201030
2    Exp  1.876903  201030
3    Exp  2.899632  201030
4  Cntrl  1.465406  201024
5  Cntrl  1.000000  201024
6    Exp  0.553875  201024
7    Exp  2.085383  201024
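
Since the question asks whether the transform itself can be set up differently, here is one possible sketch, not taken from either answer: mask the non-Cntrl values with where, then use a groupby transform so the per-date Cntrl minimum is broadcast back to every row. The names cntrl_only, cntrl_min and normalized are illustrative.

```python
import pandas as pd

data = [['Cntrl', 0.844961, '201030'],
        ['Cntrl', 0.815331, '201030'],
        ['Exp', 1.530297, '201030'],
        ['Exp', 2.36416, '201030'],
        ['Cntrl', 1.867359, '201024'],
        ['Cntrl', 1.274295, '201024'],
        ['Exp', 0.7058, '201024'],
        ['Exp', 2.657393, '201024']]
df = pd.DataFrame(data, columns=['Group', 'Value', 'Date'])

# Replace every non-Cntrl value with NaN; Cntrl values are kept as they are.
cntrl_only = df['Value'].where(df['Group'].eq('Cntrl'))

# Per-date minimum of the masked values, broadcast back to every row.
# min skips NaN, so only the Cntrl rows contribute to each date's minimum.
cntrl_min = cntrl_only.groupby(df['Date']).transform('min')

# cntrl_min shares df's index, so the division aligns row by row.
normalized = df.assign(Value=df['Value'] / cntrl_min)
print(normalized)
```

One caveat with this sketch: a date that has no Cntrl rows gets NaN as its minimum, so all of its normalized values come out as NaN.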
