Hi, I have two data sets:
Data A:
Column A Column B Column C
Hello NaN John
Bye NaN Mike
Data B:
Column A Column B
Hello 123
Raw data:
import numpy as np
import pandas as pd
a = pd.DataFrame([['Hello', np.nan, 'John'], ['Bye', np.nan, 'Mike']], columns=['Column A', 'Column B', 'Column C'])
b = pd.DataFrame([['Hello', 123]], columns=['Column A', 'Column B'])
I want to merge Data A and Data B with a left join, since Data A is the main data set and rows from Data B should only be brought in where Column A matches. The numeric values in Data B's Column B should fill in Data A's Column B.
The column names match, but my script below produces two separate Column B columns (Column B_x and Column B_y) instead of one:
df = a.merge(b, on='Column A', how='left')
df:
Column A Column B_x Column C Column B_y
Hello NaN John 123
Bye NaN Mike NaN
I want the following result:
Column A Column B Column C
Hello 123 John
Bye NaN Mike
Please note that I need to insert Data B's Column B values by matching on Column A, not by row position. The code should find the matching Column A value regardless of which row it is in and fill Column B accordingly.
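To make the requirement concrete, here is a minimal sketch of one approach that appears to produce the desired result. It is not the only way to do this. It assumes each Column A value occurs at most once in Data B, and the names `lookup` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame([['Hello', np.nan, 'John'], ['Bye', np.nan, 'Mike']],
                 columns=['Column A', 'Column B', 'Column C'])
b = pd.DataFrame([['Hello', 123]], columns=['Column A', 'Column B'])

# Lookup Series: Column A value -> Column B value taken from Data B.
# Assumption: Column A values are unique in Data B; duplicates would
# make the lookup ambiguous and need to be handled first.
lookup = b.set_index('Column A')['Column B']

result = a.copy()
# Fill only the missing Column B cells in Data A, matched on Column A
# rather than on row position. Rows with no match in Data B stay NaN.
result['Column B'] = result['Column B'].fillna(result['Column A'].map(lookup))

print(result)
```

Because the match goes through the Column A values, the row order of Data B does not matter. If the merge is preferred, a similar effect should be possible by filling Column B_x from Column B_y after the left join and then dropping the extra column.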