0

I want to fill up a DataFrame with a MultiIndex using values from a DataFrame with a simple Index, where the simple Index matches the MultiIndex'es first column. It should not touch anything where there isn't a match. I tried multiindexed.update(singleindexed, overwrite=False), but that didn't work.

import numpy as np
import pandas as pd

products = ['shoes']*5 + ['socks']*5 + ['boots']*5
years = [2005, 2006, 2007, 2008, 2009]*3
values = [11, np.NaN, 12, np.NaN, 7, 23, 8, 40, 5, np.NaN, np.NaN, 53, np.NaN, 88, 34]

df = pd.DataFrame({'product': products, 'year': years, 'value': values})
df = df.set_index(['product', 'year'])

df1 = pd.DataFrame({'products': ['shoes', 'socks'], 'value': [555555, 666666]})
df1 = df1.set_index('products')

The MultiIndex-ed data looks like:

               value
product  year    
shoes    2005    11
         2006    NaN
         2007    12
         2008    NaN
         2009    7
socks    2005    23
         2006    8
         2007    40
         2008    5
         2009    NaN
boots    2005    NaN
         2006    53
         2007    NaN
         2008    88
         2009    34

The single-Index-ed data looks like

product
shoes      555555
socks      666666

The result should look like

               value
product  year    
shoes    2005    11
         2006    555555
         2007    12
         2008    555555
         2009    7
boots    2005    NaN
         2006    53
         2007    NaN
         2008    88
         2009    34

What am I missing here? Is update even the right approach, or should I be looking at apply?

Edit Added the data to generate this example, and added an important omission: I forgot to mention that non-matching values should not be touched, hence I cannot use fillna(). df.update(df1, overwrite=False) would have seemed like a perfect and simple fit.

2
  • you are looking for fillna use it like df1.fillna(df2.set_index('product')) Commented Jan 18, 2020 at 19:21
  • This would indeed have worked on my original example, but I forgot to mention that it should not touch NaNs that didn't match. Apologies! Commented Jan 18, 2020 at 20:12

1 Answer 1

1

We can do reindex and slice assign

df.loc[df.value.isnull(),'value']=df1.reindex(df.index.get_level_values(0))[df.value.isnull().values].values
df
                 value
product year          
shoes   2005      11.0
        2006  555555.0
        2007      12.0
        2008  555555.0
        2009       7.0
socks   2005      23.0
        2006       8.0
        2007      40.0
        2008       5.0
        2009  666666.0
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.