Suppose I have two pandas DataFrames named A and B, as shown below. How can I replace the Value column in A with the values from B wherever the ID and Month columns match? Any ideas?

Thanks

Dataframe A:

ID  Month   City    Brand   Value
1   1   London  Unilever    100
1   2   London  Unilever    120
1   3   London  Unilever    150
1   4   London  Unilever    140
2   1   NY  JP Morgan   90
2   2   NY  JP Morgan   105
2   3   NY  JP Morgan   100
2   4   NY  JP Morgan   140
3   1   Paris   Loreal  60
3   2   Paris   Loreal  75
3   3   Paris   Loreal  65
3   4   Paris   Loreal  80
4   1   Tokyo   Sony    100
4   2   Tokyo   Sony    90
4   3   Tokyo   Sony    85
4   4   Tokyo   Sony    80

Dataframe B:

ID  Month   Value
2   1   100
3   3   80

2 Answers

Use merge with a left join, then use fillna to fill the missing values (rows that have no match in the second frame) with the original values:

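# df1 is dataframe A, df2 is dataframe B
# Left join: A's Value becomes 'Value_', B's Value stays 'Value' (NaN where no match)
df = df1.merge(df2, on=['ID', 'Month'], how='left', suffixes=('_', ''))
# Fall back to the original value where B had no matching row
df['Value'] = df['Value'].fillna(df['Value_']).astype(int)
df = df.drop('Value_', axis=1)
print(df)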
    ID  Month    City      Brand  Value
0    1      1  London   Unilever    100
1    1      2  London   Unilever    120
2    1      3  London   Unilever    150
3    1      4  London   Unilever    140
4    2      1      NY  JP Morgan    100
5    2      2      NY  JP Morgan    105
6    2      3      NY  JP Morgan    100
7    2      4      NY  JP Morgan    140
8    3      1   Paris     Loreal     60
9    3      2   Paris     Loreal     75
10   3      3   Paris     Loreal     80
11   3      4   Paris     Loreal     80
12   4      1   Tokyo       Sony    100
13   4      2   Tokyo       Sony     90
14   4      3   Tokyo       Sony     85
15   4      4   Tokyo       Sony     80
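
For anyone who wants to reproduce the result, here is a minimal self-contained sketch of the same merge and fillna approach. It rebuilds the sample frames from the question under the names A and B; the intermediate names `merged` and `result` are only illustrative.

```python
import pandas as pd

# Sample data from the question
A = pd.DataFrame({
    "ID": [1] * 4 + [2] * 4 + [3] * 4 + [4] * 4,
    "Month": [1, 2, 3, 4] * 4,
    "City": ["London"] * 4 + ["NY"] * 4 + ["Paris"] * 4 + ["Tokyo"] * 4,
    "Brand": ["Unilever"] * 4 + ["JP Morgan"] * 4 + ["Loreal"] * 4 + ["Sony"] * 4,
    "Value": [100, 120, 150, 140, 90, 105, 100, 140,
              60, 75, 65, 80, 100, 90, 85, 80],
})
B = pd.DataFrame({"ID": [2, 3], "Month": [1, 3], "Value": [100, 80]})

# Left join keeps every row of A; A's Value is renamed to 'Value_',
# B's Value keeps its name and is NaN where B has no matching row.
merged = A.merge(B, on=["ID", "Month"], how="left", suffixes=("_", ""))

# Take B's value where present, otherwise fall back to A's original value.
merged["Value"] = merged["Value"].fillna(merged["Value_"]).astype(int)
result = merged.drop(columns="Value_")

print(result)
```

Only the rows (ID=2, Month=1) and (ID=3, Month=3) change. Note that this assumes each (ID, Month) pair appears at most once in B; duplicate keys in B would duplicate rows in the result.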

4 Comments

Thank you. I get an error, probably related to the "Value_" part: KeyError: 'Value_'. Any idea why? Thank you
What do print(df1.columns) and print(df2.columns) show? In the sample data, Value is the last column, and Value_ is derived from it.
Yes, I am aware of the purpose of the print statement; I'm just wondering why I couldn't get the same result you did.
The sample data has a column called Value, so the KeyError means the real data has no column with exactly that name.
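
To illustrate the KeyError discussed in these comments: merge only applies suffixes to non-key columns whose names appear in both frames. The sketch below uses a hypothetical helper, `overlapping_columns`, and a hypothetical mismatch (a lower-case `value` column) to show one way `Value_` can fail to appear. The actual cause in the asker's data is not known.

```python
import pandas as pd

def overlapping_columns(left, right, keys):
    """Return the non-key column names present in both frames.

    These are the only columns that merge renames with suffixes.
    """
    return [c for c in left.columns if c in right.columns and c not in keys]

keys = ["ID", "Month"]
left = pd.DataFrame({"ID": [1], "Month": [1], "Value": [10]})
right_ok = pd.DataFrame({"ID": [1], "Month": [1], "Value": [99]})
# Hypothetical mismatch: the column name differs in case
right_bad = pd.DataFrame({"ID": [1], "Month": [1], "value": [99]})

print(overlapping_columns(left, right_ok, keys))   # ['Value']
print(overlapping_columns(left, right_bad, keys))  # []

# With no overlapping column, no suffix is applied and 'Value_' never exists,
# so df['Value_'] would raise KeyError.
merged_bad = left.merge(right_bad, on=keys, how="left", suffixes=("_", ""))
print(list(merged_bad.columns))  # ['ID', 'Month', 'Value', 'value']
```

Checking the overlap before merging (or stripping whitespace and normalising case in the column names) is one way to catch this early.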

Merge them and then drop the unused columns:

C = pd.merge(A[['ID', 'Month', 'City', 'Brand']],B, on=['ID', 'Month'])
C = C[['ID', 'Month', 'City', 'Brand', 'Value']]

This should work

3 Comments

Did you test it?
Hi, this is cool, but it only merges the dataframes with filtering. My question is more along the lines of replacing the data in the Value column where a match exists on both columns. In SQL terms, it would be the equivalent of an UPDATE statement with a join on two columns. Any guesses? Cheers
@jezrael, yes but I see the problem.
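
As a possible alternative that reads closer to the SQL "UPDATE with a join on two columns" described in the comment above, here is a sketch using set_index and DataFrame.update. It is not what either answer proposes, it uses a shortened version of the sample data, and it assumes each (ID, Month) pair is unique in both frames.

```python
import pandas as pd

# Shortened sample data (subset of the question's frames)
A = pd.DataFrame({
    "ID": [2, 2, 3, 3],
    "Month": [1, 2, 3, 4],
    "City": ["NY", "NY", "Paris", "Paris"],
    "Value": [90, 105, 65, 80],
})
B = pd.DataFrame({"ID": [2, 3], "Month": [1, 3], "Value": [100, 80]})

keys = ["ID", "Month"]

# Index both frames by the join keys so update() can align rows on them.
updated = A.set_index(keys)

# update() modifies 'updated' in place: wherever B has a non-NaN value
# for a matching index label and column name, it overwrites the old value.
updated.update(B.set_index(keys))

updated = updated.reset_index()
# Depending on the pandas version, update() may leave the column as float,
# so cast back to int explicitly.
updated["Value"] = updated["Value"].astype(int)

print(updated)
```

Unlike the inner merge in the answer above, this keeps every row of A and only changes Value where B has a matching (ID, Month) pair.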
