Pandas - divide DataFrame values by a Series in a MultiIndex DataFrame

Question

Suppose I have a multi-index DataFrame such as this:

import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product([['bucket 1', 'bucket 2'], ['q1', 'q2', 'q3']])
col = ['col1', 'col2', 'col3']
df = pd.DataFrame(np.random.randn(6, 3), ix, col)

Output:

                 col1      col2      col3
bucket 1 q1  0.061384  0.364194 -1.502486
         q2  0.562352 -0.044836  0.242474
         q3  0.373411 -0.678429 -1.261984
bucket 2 q1  0.884109 -0.070899  0.085305
         q2 -0.010463  1.463259 -0.572882
         q3 -0.419821 -0.916151  0.032110

Now I create a Series with the index matching the columns of my DataFrame:

s = pd.Series([1,2,3], index=["col1", "col2", "col3"])

I can divide the values in bucket 1 of the DataFrame by the Series like so:

df.loc["bucket 1"].div(s)

Output:

        col1      col2      col3
q1  0.061384  0.182097 -0.500829
q2  0.562352 -0.022418  0.080825
q3  0.373411 -0.339214 -0.420661

However, if I try to use this calculation to set the values in the DataFrame using .loc, I just get NaNs:

df.loc["bucket 1"] = df.loc["bucket 1"].div(s)

Output:

                 col1      col2      col3
bucket 1 q1       NaN       NaN       NaN
         q2       NaN       NaN       NaN
         q3       NaN       NaN       NaN
bucket 2 q1  0.884109 -0.070899  0.085305
         q2 -0.010463  1.463259 -0.572882
         q3 -0.419821 -0.916151  0.032110

What am I doing wrong? How do I write the result of the calculation back into the DataFrame?

Do you need to divide only one group by s or ultimately apply this to the entire DataFrame?
– ALollz, Commented Mar 11, 2019 at 16:55
In my real use-case I will have a different Series to divide each bucket by.
– Toby Petty, Commented Mar 11, 2019 at 17:58

anky · Accepted Answer · 2019-03-11 16:49:18Z

3

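Just use .values to set the values. The result of df.loc["bucket 1"] is indexed only by q1..q3, which does not line up with the two-level row labels being assigned to; a plain array has no labels, so it is written by position:

df.loc["bucket 1"] = df.loc["bucket 1"].div(s).values
print(df)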

                 col1      col2      col3
bucket 1 q1 -0.149856  0.220604 -0.048464
         q2 -0.791260 -0.199646  0.148115
         q3 -0.712257 -0.264074 -0.266497
bucket 2 q1 -1.120164 -0.290546  0.577589
         q2 -0.149522 -0.221203 -0.566872
         q3 -1.002036 -2.233220 -1.206849
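
To make the index mismatch visible, here is a minimal sketch of the same fix. It assumes deterministic values built with np.arange in place of the random data (an assumption made only so the result is easy to check), and it uses .to_numpy(), which the pandas documentation recommends over .values for new code.

```python
import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product([["bucket 1", "bucket 2"], ["q1", "q2", "q3"]])
cols = ["col1", "col2", "col3"]
# Deterministic values (an assumption for illustration) instead of random ones.
df = pd.DataFrame(np.arange(1.0, 19.0).reshape(6, 3), index=ix, columns=cols)
s = pd.Series([1, 2, 3], index=cols)

result = df.loc["bucket 1"].div(s)

# Selecting a single key drops the first index level from the result...
print(result.index.nlevels)  # 1
# ...while the rows being assigned to are labelled by two levels.
print(df.index.nlevels)      # 2

# A plain array carries no labels, so it is written by position.
df.loc["bucket 1"] = result.to_numpy()
print(df)
```

Because the assignment is positional, the array must have the same shape and row order as the selected block.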


1 Comment

danuker Over a year ago

Thanks! I find it very strange that, to this day, you can't just replace a section of a DataFrame in place and need .values instead.

BENY · Accepted Answer · 2019-03-11 17:01:10Z

2

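You are just missing a pair of []. Selecting with a list keeps both index levels, so the result lines up with the rows being assigned to:

df.loc[["bucket 1"]] = df.loc[["bucket 1"]].div(s)
print(df)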
                 col1      col2      col3
bucket 1 q1 -1.016733 -1.334495  0.417621
         q2  0.892984 -0.329325  0.224591
         q3  1.438399 -0.094883  0.053133
bucket 2 q1 -0.062476  0.962616  0.457755
         q2  0.389670  1.238829  0.390253
         q3  0.713873 -0.034645 -0.148381
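
The asker's comment mentions dividing each bucket by a different Series. A minimal sketch of how the list-selection approach might extend to that case follows. The divisors dict and its values are hypothetical, and the data is deterministic rather than random so the result is easy to check.

```python
import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product([["bucket 1", "bucket 2"], ["q1", "q2", "q3"]])
cols = ["col1", "col2", "col3"]
# Deterministic values (an assumption for illustration) instead of random ones.
df = pd.DataFrame(np.arange(1.0, 19.0).reshape(6, 3), index=ix, columns=cols)
original = df.copy()

# Hypothetical per-bucket divisors, one Series per bucket.
divisors = {
    "bucket 1": pd.Series([1, 2, 3], index=cols),
    "bucket 2": pd.Series([10, 20, 30], index=cols),
}

for bucket, s in divisors.items():
    # [[bucket]] keeps both index levels, so the right-hand side
    # has the same row labels as the rows being assigned to.
    df.loc[[bucket]] = df.loc[[bucket]].div(s)

print(df)
```

A loop like this is easy to read for a handful of buckets; with many buckets a vectorised approach may be preferable.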

