2

I'm currently having a problem with filling the missing values of my dataframe using a different dataframe.

Data samples:

df1

A   B       C
b   1.0     1.0
d   NaN     NaN
c   2.0     2.0
a   NaN     NaN
f   NaN     NaN

df2

A   B   C
c   1   5
b   2   6
a   3   7
d   4   8

I've tried to follow the solution in this question but it would appear that it is only possible if the values you're looking up is present in both dataframes you're joining.

My attempt

mask = df1["B"].isnull()
df1.loc[mask, "B"] = df2[df1.loc[mask, "A"]].values

Error:

"None of [Index(['d', 'a', 'f'], dtype='object')] are in the [columns]"

Expected result:

A   B       C
b   1.0     1.0
d   4.0     8.0
c   2.0     2.0
a   3.0     7.0
f   NaN     NaN

Also, can it be used it fill two columns?

1 Answer 1

4

You can use combine_first here, which is exactly aimed at filling NaNs by matching with another dataframe's columns:

df1.set_index('A').combine_first(df2.set_index('A')).reset_index()

   A    B    C
0  a  3.0  7.0
1  b  1.0  1.0
2  c  2.0  2.0
3  d  4.0  8.0
4  f  NaN  NaN
Sign up to request clarification or add additional context in comments.

1 Comment

I didn't know this method existed and that's even less complicated that it looks. Thanks man! Stay at home!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.