
I have a large data set in the following format:

import pandas as pd
df1 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
           'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
           'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
           'Days': [8,8,8,6,6,6,4,4,4]})

df2 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'Contract': ['OrangeTier', 'OrangeTier', 'OrangeTier','AppleTier', 'AppleTier', 'AppleTier','GrapeTier', 'GrapeTier', 'GrapeTier'],
           'Price':[ 10000, 15000, 20000,30000, 35000, 1000,45000, 20000, 21000]})

I would like to add a column to df1 that looks up the Contract 'OrangeTier', matches the dates in df1 with those in df2, and returns the price. The resulting dataframe should look something like this:

df1 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
           'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
           'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
           'OrangeTier':[10000, 15000, 20000,10000, 15000, 20000,10000, 15000, 20000],
           'Days': [8,8,8,6,6,6,4,4,4]})

I then want to multiply OrangeTier by Days and overwrite the OrangeTier column with the result.

  • Are you going to attempt to solve it yourself? Commented Jan 16, 2020 at 19:03
  • I searched on here for two days and tried different approaches. I thought it was better to see what someone suggested without posting my attempt. I'm not lazy; I'm still new to coding and needed help. Commented Jan 16, 2020 at 22:46

1 Answer

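Let's try building a Date-to-Price lookup from the 'OrangeTier' rows of df2, mapping it onto df1['Date'], and then multiplying by Days: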

mapper = df2.query('Contract == "OrangeTier"').set_index(['Date'])['Price']

df1['OrangeTier'] = df1['Date'].map(mapper)

# assign returns a new DataFrame, so bind the result back to df1 to overwrite the column
df1 = df1.assign(OrangeTier=df1['OrangeTier'] * df1['Days'])

Output:

         Date From  To  Distance  Days  OrangeTier
0  01/02/2020   RU  JK     40000     8       80000
1  01/03/2020   RU  JK     40000     8      120000
2  01/04/2020   RU  JK     40000     8      160000
3  01/02/2020  USA  JK     30000     6       60000
4  01/03/2020  USA  JK     30000     6       90000
5  01/04/2020  USA  JK     30000     6      120000
6  01/02/2020   ME  JK     20000     4       40000
7  01/03/2020   ME  JK     20000     4       60000
8  01/04/2020   ME  JK     20000     4       80000
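
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same map-based lookup. The helper name `add_contract_cost` and its parameter names are my own and purely illustrative, not part of pandas. It assumes each contract has at most one price per date, and it does the lookup and the multiplication in one step.

```python
import pandas as pd

# Same data as in the question, written compactly
df1 = pd.DataFrame({
    'Date': ['01/02/2020', '01/03/2020', '01/04/2020'] * 3,
    'From': ['RU'] * 3 + ['USA'] * 3 + ['ME'] * 3,
    'To': ['JK'] * 9,
    'Distance': [40000] * 3 + [30000] * 3 + [20000] * 3,
    'Days': [8] * 3 + [6] * 3 + [4] * 3,
})
df2 = pd.DataFrame({
    'Date': ['01/02/2020', '01/03/2020', '01/04/2020'] * 3,
    'Contract': ['OrangeTier'] * 3 + ['AppleTier'] * 3 + ['GrapeTier'] * 3,
    'Price': [10000, 15000, 20000, 30000, 35000, 1000, 45000, 20000, 21000],
})


def add_contract_cost(trips, prices, contract):
    """Hypothetical helper: return a copy of `trips` with a new column,
    named after `contract`, holding that contract's Price times Days."""
    # Keep only the rows for the requested contract, indexed by Date
    rows = prices.loc[prices['Contract'] == contract]
    lookup = rows.set_index('Date')['Price']

    # Assumption: one price per date; check it before using the lookup
    if not lookup.index.is_unique:
        raise ValueError(f"{contract!r} has more than one price for some dates")

    out = trips.copy()  # leave the caller's DataFrame untouched
    # Dates with no matching price come back as NaN
    out[contract] = out['Date'].map(lookup) * out['Days']
    return out


result = add_contract_cost(df1, df2, 'OrangeTier')
print(result)
```

If a date in df1 has no price in df2, the new column will contain NaN for that row (and become float), so it may be worth checking `result['OrangeTier'].isna().any()` on real data.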

1 Comment

Thanks very much. This was very helpful.
