Dataframe is not updated when columns are passed to function using apply

Question

I have two dataframes, df1 and df2, like this:

   A   B
a  1  10
b  2  11
c  3  12
d  4  13 

   A   B
a  11 NaN
b NaN NaN
c NaN  20
d  16  30

They have identical column names and indices. My goal is to replace the NaNs in df2 with the corresponding values from df1. Currently, I do it like this:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'A': range(1, 5), 'B': range(10, 14)}, index=list('abcd'))
df2 = pd.DataFrame({'A': [11, np.nan, np.nan, 16], 'B': [np.nan, np.nan, 20, 30]}, index=list('abcd'))    

def repl_na(s, d):
    # Fill the NaNs of column s with the values at the same rows of d
    s[s.isnull().values] = d[s.isnull().values][s.name]
    return s

df2.apply(repl_na, args=(df1, ))

which gives me the desired output.

My question now is how to accomplish this when the indices of the dataframes differ (the column names are still the same, and the columns have the same length). For example, df2 might look like this (df1 is unchanged):

    A   B
0  11 NaN
1 NaN NaN
2 NaN  20
3  16  30

Then the above code does not work anymore since the indices of the dataframes are different. Could someone tell me how the line

s[s.isnull().values] = d[s.isnull().values][s.name]

has to be modified in order to get the same result as above?

Joachim Isaksson · Accepted Answer · 2016-03-09 17:38:51Z


You could temporarily change the index of df1 to match that of df2 and then call combine_first on df2:

df2.combine_first(df1.set_index(df2.index))

    A   B
0  11  10
1   2  11
2   3  20
3  16  30
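
For reference, here is a self-contained sketch of the approach above, using the frames from the question. It assumes df2 carries the default integer index 0..3 and that both frames have the same number of rows, which set_index requires here. The name `result` is only for illustration. Because df2 contains NaN, the filled columns come back as floats.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': range(1, 5), 'B': range(10, 14)}, index=list('abcd'))
# df2 uses the default integer index 0..3, as in the question's second scenario
df2 = pd.DataFrame({'A': [11, np.nan, np.nan, 16], 'B': [np.nan, np.nan, 20, 30]})

# set_index returns a new frame relabelled with df2's index; df1 itself is unchanged.
# combine_first then fills df2's NaNs with the values found under the same labels.
result = df2.combine_first(df1.set_index(df2.index))
print(result)
```

Note that this pairs rows purely by position, so it only makes sense when row i of df1 really corresponds to row i of df2.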


4 Comments

Cleb Over a year ago

That's great! It even avoids the apply. I'll upvote it for now and might accept it later, depending on the quality of the other answers.

Cleb Over a year ago

Could you also explain why my attempt fails and how it has to be modified to make it work?

Joachim Isaksson Over a year ago

@Cleb Sadly, I'm more of a beginner in pandas, so I'm not sure I can give a correct explanation of why your existing code breaks. If that's what you need, someone else may have to go into that analysis.

Cleb Over a year ago

Ok. Your code gets the job done for this particular problem. Maybe I will ask a follow-up question asking specifically about this problem.
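
On the open point in the comments: a likely explanation, offered as a hedged reading rather than a confirmed diagnosis, is that d[s.isnull().values][s.name] is a Series that still carries df1's labels ('a'..'d'). Assigning a Series through a boolean mask aligns on index labels, so when df2 is indexed 0..3 nothing matches and the NaNs stay. A minimal sketch of one way to modify the line is to strip the labels with to_numpy() so the values are assigned by position. The name `repl_na_positional` is hypothetical, and the copy is an added precaution, not part of the original code.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': range(1, 5), 'B': range(10, 14)}, index=list('abcd'))
df2 = pd.DataFrame({'A': [11, np.nan, np.nan, 16], 'B': [np.nan, np.nan, 20, 30]})

def repl_na_positional(s, d):
    """Fill NaNs in column s from the same-named column of d, pairing rows by position."""
    s = s.copy()                  # work on a copy so the caller's frame is not mutated
    mask = s.isnull().to_numpy()  # boolean array with one flag per row
    # to_numpy() drops d's index labels, so no label alignment happens on assignment
    s[mask] = d[s.name].to_numpy()[mask]
    return s

filled = df2.apply(repl_na_positional, args=(df1,))
print(filled)
```

This keeps the apply-based structure of the original attempt; the combine_first answer remains the shorter route when relabelling df1 is acceptable.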
