2

I have two dataframes. df1 and df2. This is the content of df1

  col1  col2  col3
0    1    12   100
1    2    34   200
2    3    56   300
3    4    78   400

This is the content of df2

  col1  col2  col3
0    2  1984   500
1    3  4891   600

I want to have this final data frame:

  col1  col2  col3
0    1    12   100
1    2  1984   200
2    3  4891   300
3    4    78   400

Note that col1 is the primary key in df1 and df2. I tried to do it via mapping values, but I could not make it work.

Here is an MCVE for checking those data frames easily:

import pandas as pd
d = {'col1': ['1', '2','3','4'], 'col2': [12, 34,56,78],'col3':[100,200,300,400]}
df1 = pd.DataFrame(data=d)
d = {'col1': ['2','3'], 'col2': [1984,4891],'col3':[500,600]}
df2 = pd.DataFrame(data=d)
print(df1)
print(df2)
d = {'col1': ['1', '2','3','4'], 'col2': [12, 1984,4891,78],'col3':[100,200,300,400]}
df_final = pd.DataFrame(data=d)
print(df_final)

2 Answers 2

1

You can map and fillna:

df1['col2'] = (df1['col1']
               .map(df2.set_index('col1')['col2'])
               .fillna(df1['col2'], downcast='infer')
              )

output:

  col1  col2  col3
0    1    12   100
1    2  1984   200
2    3  4891   300
3    4    78   400
Sign up to request clarification or add additional context in comments.

Comments

1

If col1 is unique, combine_first is an option, too:

>>> (df2.drop("col3", axis=1)
        .set_index("col1")
        .combine_first(df1.set_index("col1"))
        .reset_index()
    )
  col1  col2  col3
0    1    12   100
1    2  1984   200
2    3  4891   300
3    4    78   400

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.