I'd like to merge two tables while replacing the null values in one table with the non-null values from the other.

The code below is an example of the tables to be merged:

import numpy as np
import pandas as pd

# Table 1 (has rows with missing values)

a=['x','x','x','y','y','y']
b=['z','z','z','w','w','w']
c=[1,1,1,np.nan,np.nan,np.nan]

table_1=pd.DataFrame({'a':a, 'b':b, 'c':c})
table_1

    a   b   c
0   x   z   1.0
1   x   z   1.0
2   x   z   1.0
3   y   w   NaN
4   y   w   NaN
5   y   w   NaN

# Table 2 (new table to be appended to table_1; its values in column 'c' should replace the nulls in the same column of table_1)

a=['y', 'y', 'y']
b=['w', 'w', 'w']
c=[2,2,2]
table_2=pd.DataFrame({'a':a, 'b':b, 'c':c})
table_2

    a   b   c
0   y   w   2
1   y   w   2
2   y   w   2

This is the code I use for merging the two tables, and the output I get:

# Merging the two tables

merged_table=pd.merge(table_1, table_2, on=['a', 'b'], how='left')
merged_table

Current output (I don't understand why the number of rows increased):

    a   b   c_x c_y
0   x   z   1.0 NaN
1   x   z   1.0 NaN
2   x   z   1.0 NaN
3   y   w   NaN 2.0
4   y   w   NaN 2.0
5   y   w   NaN 2.0
6   y   w   NaN 2.0
7   y   w   NaN 2.0
8   y   w   NaN 2.0
9   y   w   NaN 2.0
10  y   w   NaN 2.0
11  y   w   NaN 2.0

Desired output (to replace the null values in the 'c' column in table_1 with the numeric values from table_2):

    a   b   c
0   x   z   1.0
1   x   z   1.0
2   x   z   1.0
3   y   w   2.0
4   y   w   2.0
5   y   w   2.0

1 Answer

Try:

# pandas < 2.0 only (DataFrame.append was removed in pandas 2.0):
out=table_1.append(table_2).dropna(subset=['c']).reset_index(drop=True)
# OR, on any pandas version:
out=pd.concat([table_1,table_2]).dropna(subset=['c']).reset_index(drop=True)

output of out:

    a   b   c
0   x   z   1.0
1   x   z   1.0
2   x   z   1.0
3   y   w   2.0
4   y   w   2.0
5   y   w   2.0
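
On the row count in the question: the key pair ('y', 'w') appears 3 times in table_1 and 3 times in table_2, so a merge on ['a', 'b'] pairs every left row with every right row for that key (3 x 3 = 9), plus the 3 unmatched ('x', 'z') rows, for 12 rows in total. The sketch below reproduces that arithmetic on the tables from the question; the names left_counts, right_counts, and expected are only illustrative.

```python
import numpy as np
import pandas as pd

table_1 = pd.DataFrame({
    'a': ['x', 'x', 'x', 'y', 'y', 'y'],
    'b': ['z', 'z', 'z', 'w', 'w', 'w'],
    'c': [1, 1, 1, np.nan, np.nan, np.nan],
})
table_2 = pd.DataFrame({'a': ['y'] * 3, 'b': ['w'] * 3, 'c': [2, 2, 2]})

# Number of rows per (a, b) key in each table.
left_counts = table_1.groupby(['a', 'b']).size()
right_counts = table_2.groupby(['a', 'b']).size()

# In a left merge, a matched key yields left * right rows;
# an unmatched key keeps its left rows once (hence the fill with 1).
multiplier = right_counts.reindex(left_counts.index).fillna(1)
expected = left_counts.mul(multiplier).astype(int)

merged = pd.merge(table_1, table_2, on=['a', 'b'], how='left')

print(expected)     # rows contributed by each key
print(len(merged))  # total rows after the merge
```

If duplicate keys on the right side are not intended, passing validate='many_to_one' to pd.merge should raise a MergeError instead of silently multiplying rows.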

2 Comments

Thanks @Anurag, it works as I need it to. Do you know what might be happening with the extra rows when I initially merged the two tables?
@Pablo when you initially merge both DataFrames, you can fill the nulls in c_x with the values from c_y, then drop c_y and rename c_x to c.
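
The fill-then-rename recipe from the comment above could look roughly like the sketch below. One assumption is added that the comment does not state: table_2 is first reduced to one row per (a, b) key with drop_duplicates, because filling c_x on the merged table from the question would still leave 12 rows. The names lookup and result are illustrative.

```python
import numpy as np
import pandas as pd

table_1 = pd.DataFrame({
    'a': ['x', 'x', 'x', 'y', 'y', 'y'],
    'b': ['z', 'z', 'z', 'w', 'w', 'w'],
    'c': [1, 1, 1, np.nan, np.nan, np.nan],
})
table_2 = pd.DataFrame({'a': ['y'] * 3, 'b': ['w'] * 3, 'c': [2, 2, 2]})

# Assumption: each (a, b) key maps to a single value of c in table_2,
# so one lookup row per key is enough and the merge keeps 6 rows.
lookup = table_2.drop_duplicates(subset=['a', 'b'])

merged = pd.merge(table_1, lookup, on=['a', 'b'], how='left')

# Fill the nulls in c_x with c_y, drop c_y, and rename c_x back to c.
merged['c_x'] = merged['c_x'].fillna(merged['c_y'])
result = merged.drop(columns='c_y').rename(columns={'c_x': 'c'})

print(result)
```

Unlike concat followed by dropna, this keeps table_1's original rows and order, which may matter when table_2 does not have exactly one row for every null row in table_1.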