I'd like to merge two tables while replacing the null values in one table with the non-null values from the other.
The code below is an example of the tables to be merged:
import numpy as np
import pandas as pd
# Table 1 (has rows with missing values)
a=['x', 'x', 'x', 'y', 'y', 'y']
b=['z', 'z', 'z', 'w', 'w', 'w']
c=[1, 1, 1, np.nan, np.nan, np.nan]
table_1=pd.DataFrame({'a':a, 'b':b, 'c':c})
table_1
a b c
0 x z 1.0
1 x z 1.0
2 x z 1.0
3 y w NaN
4 y w NaN
5 y w NaN
# Table 2 (new table to be combined with table_1; its values in column 'c' should replace the missing values in column 'c' of table_1)
a=['y', 'y', 'y']
b=['w', 'w', 'w']
c=[2,2,2]
table_2=pd.DataFrame({'a':a, 'b':b, 'c':c})
table_2
a b c
0 y w 2
1 y w 2
2 y w 2
This is the code I use for merging the two tables, and the output I get:
# Merging the two tables
merged_table=pd.merge(table_1, table_2, on=['a', 'b'], how='left')
merged_table
Current output (I don't understand why the number of rows has increased):
a b c_x c_y
0 x z 1.0 NaN
1 x z 1.0 NaN
2 x z 1.0 NaN
3 y w NaN 2.0
4 y w NaN 2.0
5 y w NaN 2.0
6 y w NaN 2.0
7 y w NaN 2.0
8 y w NaN 2.0
9 y w NaN 2.0
10 y w NaN 2.0
11 y w NaN 2.0
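The row count above looks consistent with a many-to-many join: the key ('y', 'w') appears 3 times in each table, so a left merge pairs every matching row with every other, giving 3 × 3 = 9 rows for that key plus the 3 unmatched 'x' rows, 12 in total. A minimal sketch to check this, rebuilding the same two tables (the names left_counts and right_counts are only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the two example tables
table_1 = pd.DataFrame({
    'a': ['x', 'x', 'x', 'y', 'y', 'y'],
    'b': ['z', 'z', 'z', 'w', 'w', 'w'],
    'c': [1, 1, 1, np.nan, np.nan, np.nan],
})
table_2 = pd.DataFrame({
    'a': ['y', 'y', 'y'],
    'b': ['w', 'w', 'w'],
    'c': [2, 2, 2],
})

# Count how often each (a, b) key occurs on each side of the merge
left_counts = table_1.groupby(['a', 'b']).size()
right_counts = table_2.groupby(['a', 'b']).size()
print(left_counts)
print(right_counts)

# Each left row is paired with every right row that has the same key
merged = pd.merge(table_1, table_2, on=['a', 'b'], how='left')
print(len(merged))
```

If the counts on both sides are greater than 1 for the same key, the merge multiplies rows for that key rather than matching them one to one.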
Desired output (to replace the null values in the 'c' column in table_1 with the numeric values from table_2):
a b c
0 x z 1.0
1 x z 1.0
2 x z 1.0
3 y w 2.0
4 y w 2.0
5 y w 2.0
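One possible way to get this output, sketched under the assumption that every (a, b) pair in table_2 carries a single value of c: reduce table_2 to one row per key before merging, then fill the gaps in table_1's column. The names lookup, c_new and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the two example tables
table_1 = pd.DataFrame({
    'a': ['x', 'x', 'x', 'y', 'y', 'y'],
    'b': ['z', 'z', 'z', 'w', 'w', 'w'],
    'c': [1, 1, 1, np.nan, np.nan, np.nan],
})
table_2 = pd.DataFrame({
    'a': ['y', 'y', 'y'],
    'b': ['w', 'w', 'w'],
    'c': [2, 2, 2],
})

# Assumption: each (a, b) key maps to one value of c in table_2,
# so keeping the first row per key loses no information
lookup = table_2.drop_duplicates(subset=['a', 'b'])

# Left merge against the de-duplicated lookup keeps table_1's row count;
# table_2's column arrives as 'c_new'
merged = pd.merge(table_1, lookup, on=['a', 'b'], how='left',
                  suffixes=('', '_new'))

# Fill only the missing values of 'c', then drop the helper column
merged['c'] = merged['c'].fillna(merged['c_new'])
result = merged.drop(columns='c_new')
print(result)
```

If a key in table_2 could carry several different values of c, the de-duplication step would silently pick the first one, so that case would need a different rule.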