
Hi, I have two data sets:

Data A:

Column A  Column B  Column C    
Hello      NaN      John    
Bye        NaN      Mike

Data B:

Column A  Column B    
Hello      123

Raw data:

import numpy as np
import pandas as pd
a = pd.DataFrame([['Hello', np.nan, 'John'], ['Bye', np.nan, 'Mike']], columns=['Column A', 'Column B', 'Column C'])
b = pd.DataFrame([['Hello', 123]], columns=['Column A', 'Column B'])

I want to merge Data A and Data B with a left join (Data A is the main data set, and values should only be brought in from Data B where Column A matches), so that Data A's Column B is filled with the numeric values from Data B's Column B.

The column names match, but my script below produces two separate Column B columns:

df = a.merge(b, on='Column A', how='left')

df:

Column A  Column B_x  Column C  Column B_y
Hello      NaN        John       123
Bye        NaN        Mike       NaN

I want the following result:

Column A  Column B  Column C
Hello       123      John
Bye         NaN      Mike

Please note that I need to insert Data B's Column B values according to the matching Column A value, not just push Data B into Data A in row order. The code needs to find the match on Column A regardless of which row it is in and insert the value there.

1 Answer


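You don't need a merge for this, as a merge brings the columns of the two dataframes together side by side. Since your dataframes share the same structure, you can use fillna or update instead. Note that both align on the row index and column labels, not on the values in Column A: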

# Option 1: fill NaN in a from b (aligned on index and column labels); returns a new frame unless inplace=True
a.fillna(b, inplace=True)
# Option 2: always in place; overwrites values in a with the non-NA values from b
a.update(b)

print(a)

  Column A  Column B Column C
0    Hello     123.0     John
1      Bye       NaN     Mike
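
If the rows of the two frames are not guaranteed to be in the same order, a fill keyed on Column A may be closer to what the question asks for. The following is a minimal sketch, assuming Column A values are unique in b; `lookup` is a name introduced here for illustration only.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    [['Hello', np.nan, 'John'], ['Bye', np.nan, 'Mike']],
    columns=['Column A', 'Column B', 'Column C'],
)
b = pd.DataFrame([['Hello', 123]], columns=['Column A', 'Column B'])

# Lookup Series keyed by Column A (assumption: keys in b are unique,
# otherwise Series.map raises an error on the duplicated index)
lookup = b.set_index('Column A')['Column B']

# Fill only the missing values in a, matching on the Column A value
# rather than on row position
a['Column B'] = a['Column B'].fillna(a['Column A'].map(lookup))

print(a)
```

Because the match is made on the key, the row order of b should not matter here, and existing non-NaN values in a's Column B are left untouched.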

5 Comments

Thanks. This seems to be working. Let me test a few more times and I'll select you as the answer. It does look like a.update(b) works and not a.fillna(b), however.
Welcome. Maybe a.fillna(b) is working, it's just not modifying your dataframe. So can you try changing it to new_a = a.fillna(b), or a.fillna(b, inplace = True) ?
Have you tried it? @LEOMODE
Sorry, but this still didn't fix the problem: when I try it with different data, it just fills the values in without actually matching the rows. If I have multiple rows of numbers that do not line up row for row, all your code does is insert Data B in its original order. Say I have 10 rows and 123 should go into row 5; because 123 was in row 0 of Data B, it ends up in row 0. Both of the approaches you suggested behave the same way as the other answer below. @sophocles
Hi @LEOMODE. My answer works with the sample data you provided above; I just re-tested it and can confirm that. Can you please update the question with your issue so I can try to solve it? Extend your sample data to the point where I can recreate the issue you are running into, and state clearly what exactly you are trying to get to. Please do that and I will do my best to help.
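
The row-order concern raised in the comments can be reproduced with a small variation of the sample data. The sketch below uses hypothetical data (the rows of a are swapped) and compares the index-aligned fill with a left merge followed by a fill; the `_b` suffix and the names `by_position` and `by_key` are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'Hello' sits in row 1 of a but in row 0 of b
a = pd.DataFrame(
    [['Bye', np.nan, 'Mike'], ['Hello', np.nan, 'John']],
    columns=['Column A', 'Column B', 'Column C'],
)
b = pd.DataFrame([['Hello', 123]], columns=['Column A', 'Column B'])

# Index-aligned fill: 123 lands in row 0 ('Bye'), which is not the matching key
by_position = a.fillna(b)

# Key-aligned fill: left merge on Column A, then take b's value where a's is missing
merged = a.merge(b, on='Column A', how='left', suffixes=('', '_b'))
merged['Column B'] = merged['Column B'].fillna(merged['Column B_b'])
by_key = merged.drop(columns='Column B_b')

print(by_position)
print(by_key)
```

One caveat with the merge variant: if Column A contains duplicate keys in b, the left join repeats the matching rows of a, so deduplicating b first may be needed.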
