I have two dataframes:
pd.DataFrame(data={'col1': ['a', 'b', 'a', 'a', 'b', 'h'], 'col2': ['c', 'c', 'd', 'd', 'c', 'i'], 'col3': [1, 2, 3, 4, 5, 1]})
col1 col2 col3
0 a c 1
1 b c 2
2 a d 3
3 a d 4
4 b c 5
5 h i 1
pd.DataFrame(data={'col1': ['a', 'b', 'a', 'f'], 'col2': ['c', 'c', 'd', 'k'], 'col3': [12, 23, 45, 78]})
col1 col2 col3
0 a c 12
1 b c 23
2 a d 45
3 f k 78
and I'd like to add a new column to the first one, filled with the col3 value from the second one wherever the (col1, col2) pair matches. The expected result is:
pd.DataFrame(data={'col1': ['a', 'b', 'a', 'a', 'b', 'h'], 'col2': ['c', 'c', 'd', 'd', 'c', 'i'], 'col3': [1, 2, 3, 4, 5, 1], 'col4': [12, 23, 45, 45, 23, float('nan')]})
col1 col2 col3 col4
0 a c 1 12
1 b c 2 23
2 a d 3 45
3 a d 4 45
4 b c 5 23
5 h i 1 NaN
How can I do that?
Thanks for your attention :)
Edit: it has been advised to look for the answer in this subject Adding A Specific Column from a Pandas Dataframe to Another Pandas Dataframe but it is not the same question.
Here, not only is there no single ID column, since the key is split across col1 and col2, but above all, although that key is unique in the second dataframe, it is not unique in the first one. This is why I think that neither a merge nor a join can be the answer.
Edit 2: In addition, some (col1, col2) pairs of df1 may not be present in df2, in which case NaN is expected in col4, and some (col1, col2) pairs of df2 may not be needed in df1. To illustrate these cases, I added some rows to both df1 and df2 to show the worst-case scenario.
df1.merge(df2.rename(columns={'col3': 'col4'}), on=['col1', 'col2'], how='left')
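For anyone who wants to run it, here is a minimal, self-contained sketch using the example data from the question. A merge works even though the (col1, col2) key repeats in df1: each df1 row simply picks up the matching df2 row. The names `lookup` and `result` are mine, and `validate='many_to_one'` is an optional safety check I added, assuming the key really is unique in df2 as stated in the question.

```python
import pandas as pd

# Example data from the question (df1 has a row whose key is missing from df2).
df1 = pd.DataFrame({
    'col1': ['a', 'b', 'a', 'a', 'b', 'h'],
    'col2': ['c', 'c', 'd', 'd', 'c', 'i'],
    'col3': [1, 2, 3, 4, 5, 1],
})
df2 = pd.DataFrame({
    'col1': ['a', 'b', 'a', 'f'],
    'col2': ['c', 'c', 'd', 'k'],
    'col3': [12, 23, 45, 78],
})

# Rename df2's value column so it does not clash with df1's col3.
lookup = df2.rename(columns={'col3': 'col4'})

# how='left' keeps every df1 row in its original order; keys with no match
# in df2 get NaN in col4. validate='many_to_one' raises if the key is
# duplicated in df2 (an assumption taken from the question).
result = df1.merge(lookup, on=['col1', 'col2'], how='left',
                   validate='many_to_one')
print(result)
```

Note that col4 comes out as a float column, because NaN cannot be stored in a plain integer column. Rows of df2 whose key never appears in df1 (here f, k) are simply ignored by the left merge.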