0

I am looking to compare values of columns in two different datasets and create a column with the results that have matched.

DF1:

 Dates
0   2021-10-01
1   2021-10-02
2   2021-10-03
3   2021-10-04
4   2021-10-05
5   2021-10-06
6   2021-10-07

DF2 =

StartDate    User
2021-10-04   Doe, John
2021-10-07   Ann, Mary
2021-10-07   Doe, John

Expected Result

DF1:

 Dates            User
0   2021-10-01
1   2021-10-02
2   2021-10-03
3   2021-10-04    Doe, John
4   2021-10-05
5   2021-10-06
6   2021-10-07    Ann, Mary; Doe, John

I can see the matches using the following code:

df1= df2.loc[df2['StartDate'].isin(df1['Dates']), 'User']

And i can return a 1/0 with the following:

df1= df1.assign(result=df1['Dates'].isin(df2['StartDate']).astype(int))

However, i cannot seem to merge them both

1 Answer 1

1
df.join(df2.groupby('StartDate')['User'].apply('; '.join), how='left', on='Dates').fillna('')

Output:

>>> df
       Dates          User
0 2021-10-01              
1 2021-10-02              
2 2021-10-03              
3 2021-10-04         Test1
4 2021-10-05              
5 2021-10-06              
6 2021-10-07  Test2, Test1
Sign up to request clarification or add additional context in comments.

7 Comments

Thank you i can get this to work. However, i have made a slight change to my dataset which now shows the Users as 'Lastname, Firstname' and when i use this code it only pulls the one user not the two?
Are you saying that each value of the User is a string in the format 'firstname, lastname'?
correct. I have updated the question to show this
Check my latest update, @TiiTcHY.
it appears there was a whitespace in one of the columns that was affecting the search. I have now managed to fix this. Thank you for your help
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.