I have two (or more) DataFrames that I want to append one under the other (something like an outer merge). How can I append them so that, when an index value appears in both, the value from the second DataFrame (dfB) replaces the one from the first? As an illustration:

dfA = 
Index Var1
A     5 
B     6
C     7

dfB = 
Index Var1
A     6
D     8
E     10

The desired output should look like this:

output = 
Index Var1
A     6
B     6
C     7
D     8
E     10

Any help would be greatly appreciated!

Thanks

2 Answers

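For this particular case, where rows from dfB should override rows from dfA, you can use pd.concat() with ignore_index=True followed by drop_duplicates(['Index'], keep='last'). This assumes Index is a regular column rather than the DataFrame's index.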

output = pd.concat([dfA, dfB], ignore_index=True).drop_duplicates(['Index'], keep='last')

Example:

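import pandas as pd

A = {'Index': ['A', 'B', 'C'], 'Var1': [5, 6, 7]}
B = {'Index': ['A', 'D', 'E'], 'Var1': [6, 7, 8]}
dfA = pd.DataFrame(A)
dfB = pd.DataFrame(B)
# Stack both frames, then keep only the last row for each 'Index' value (the one from dfB)
output = pd.concat([dfA, dfB], ignore_index=True).drop_duplicates(['Index'], keep='last')
print(output)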

  Index  Var1
1     B     6
2     C     7
3     A     6
4     D     7
5     E     8

After this, you can use sort_values('Index') to order the rows alphabetically by the Index column, or set_index('Index') to make that column the DataFrame's index.
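The follow-up step could look like the minimal sketch below. It rebuilds the frames with the values from the question (not the values used in the example above), and the names combined, sorted_rows, and indexed are only illustrative.

```python
import pandas as pd

# Data from the question, with 'Index' as a regular column (an assumption)
dfA = pd.DataFrame({'Index': ['A', 'B', 'C'], 'Var1': [5, 6, 7]})
dfB = pd.DataFrame({'Index': ['A', 'D', 'E'], 'Var1': [6, 8, 10]})

# Stack the frames; for duplicated keys keep the last row, i.e. the one from dfB
combined = pd.concat([dfA, dfB], ignore_index=True).drop_duplicates(['Index'], keep='last')

# Option 1: keep 'Index' as a column, sort by it, and renumber the rows
sorted_rows = combined.sort_values('Index').reset_index(drop=True)

# Option 2: make 'Index' the actual index and sort on it
indexed = combined.set_index('Index').sort_index()

print(sorted_rows)
print(indexed)
```

Which option fits better depends on whether later code looks rows up by label (option 2) or treats Index as ordinary data (option 1).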


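You can also do an outer merge and then use fillna to fall back to the first DataFrame's value wherever the second has none (df1 and df2 here correspond to the question's dfA and dfB):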

final = (df1.merge(df2,on='Index',how='outer',suffixes=('_x',''))
       .assign(Var1 = lambda x: x['Var1'].fillna(x['Var1_x']))[df1.columns])

  Index  Var1
0     A   6.0
1     B   6.0
2     C   7.0
3     D   8.0
4     E  10.0
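Var1 comes back as float because the outer merge introduces NaN for keys missing from one side. A minimal, self-contained sketch of the same approach with the question's data follows; the intermediate name merged and the final astype cast back to integers are additions for illustration, not part of the answer above.

```python
import pandas as pd

# Data from the question, with 'Index' as a regular column (an assumption)
dfA = pd.DataFrame({'Index': ['A', 'B', 'C'], 'Var1': [5, 6, 7]})
dfB = pd.DataFrame({'Index': ['A', 'D', 'E'], 'Var1': [6, 8, 10]})

# Outer merge: dfA's value lands in 'Var1_x', dfB's value keeps the name 'Var1'
merged = dfA.merge(dfB, on='Index', how='outer', suffixes=('_x', ''))

# Prefer dfB's value; where it is missing (NaN), fall back to dfA's value
final = merged.assign(Var1=lambda x: x['Var1'].fillna(x['Var1_x']))[dfA.columns]

# Optional: every NaN has been filled, so the column can be cast back to integers
final = final.astype({'Var1': 'int64'})
print(final)
```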
