Assign values to multicolumn dataframe using another dataframe

Question

I am trying to assign values stored in a regular (single-level column) dataframe to a dataframe with MultiIndex columns. The two dataframes share the same index. However, when I assign all columns of the regular dataframe to a slice of the multicolumn dataframe, the target cells are filled with NaN.

MWE

import pandas as pd

df = pd.DataFrame.from_dict(
    {
        ("old", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("old", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
    }
)

temp = pd.DataFrame.from_dict(
    {
        "old": {
            "high": 2.6798302797288174,
            "med": 10.546654056177656,
            "low": 16.46382603916123,
        },
        "new": {
            "high": 15.91881231611413,
            "med": 16.671967271277495,
            "low": 26.17872356316402,
        },
    }
)

df.loc[:, (slice(None), "mean")] = temp
print(df)

Output:

      old       new     
     mean  std mean  std
high  NaN  0.0  NaN  0.0
med   NaN  0.0  NaN  0.0
low   NaN  0.0  NaN  0.0

Is this expected behaviour, or am I doing something I am not supposed to?

Pandas aligns on both axes during assignment. The slice df.loc[:, (slice(None), "mean")] still has a MultiIndex on the columns, but temp does not, so no labels match and every value becomes NaN. Give temp a MultiIndex as well: temp.columns = pd.MultiIndex.from_product([temp.columns, ['mean']])
– ALollz, Commented Jan 19, 2021 at 14:08
@ALollz Thanks, I see. Is there an easy way to resolve this other than looping through each column of temp and assigning it to the corresponding column of df?
– gnikit, Commented Jan 19, 2021 at 14:13
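
To make the alignment point in the comment above concrete, here is a minimal sketch that compares the column labels on both sides of the assignment. It reuses df and temp from the question; the names target_labels, flat_labels, shared and relabelled are only illustrative.

```python
import pandas as pd

df = pd.DataFrame.from_dict(
    {
        ("old", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("old", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
    }
)
temp = pd.DataFrame.from_dict(
    {
        "old": {"high": 2.6798302797288174, "med": 10.546654056177656, "low": 16.46382603916123},
        "new": {"high": 15.91881231611413, "med": 16.671967271277495, "low": 26.17872356316402},
    }
)

# Labels of the slice being assigned to: tuples such as ("old", "mean").
target_labels = list(df.loc[:, (slice(None), "mean")].columns)
# Labels of the source frame: plain strings such as "old".
flat_labels = list(temp.columns)

# A tuple never equals a string, so the two sides share no column label.
shared = set(target_labels) & set(flat_labels)
print(target_labels, flat_labels, shared)

# Pairing every flat label with "mean" yields labels that do match the slice.
relabelled = pd.MultiIndex.from_product([temp.columns, ["mean"]])
print(list(relabelled) == target_labels)
```

Because shared is empty, pandas finds nothing in temp to place into the selected columns, which is consistent with the NaN output in the question.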

jezrael · Accepted Answer · 2021-01-19 14:13:33Z

Create a MultiIndex on the columns of temp so that its labels align with those of df, then set the new values with DataFrame.update:

temp.columns = pd.MultiIndex.from_product([temp.columns, ['mean']])
print(temp)
            old        new
           mean       mean
high   2.679830  15.918812
med   10.546654  16.671967
low   16.463826  26.178724

df.update(temp)
print(df)
            old             new     
           mean  std       mean  std
high   2.679830  0.0  15.918812  0.0
med   10.546654  0.0  16.671967  0.0
low   16.463826  0.0  26.178724  0.0
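
If temp should keep its flat column labels for later use, one possible variant (a sketch, not part of the original answer) builds the relabelled frame on the fly with DataFrame.set_axis and passes it straight to update. The name mean_cols is only illustrative.

```python
import pandas as pd

df = pd.DataFrame.from_dict(
    {
        ("old", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("old", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "mean"): {"high": 0.0, "med": 0.0, "low": 0.0},
        ("new", "std"): {"high": 0.0, "med": 0.0, "low": 0.0},
    }
)
temp = pd.DataFrame.from_dict(
    {
        "old": {"high": 2.6798302797288174, "med": 10.546654056177656, "low": 16.46382603916123},
        "new": {"high": 15.91881231611413, "med": 16.671967271277495, "low": 26.17872356316402},
    }
)

# Two-level labels ("old", "mean") and ("new", "mean") matching the target columns.
mean_cols = pd.MultiIndex.from_product([temp.columns, ["mean"]])

# set_axis returns a relabelled copy, so temp itself keeps its flat columns.
# update modifies df in place and returns None.
df.update(temp.set_axis(mean_cols, axis=1))
print(df)
```

Note that update only copies non-NA values from the other frame, so any NaN in temp would leave the corresponding cell of df unchanged.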
