Can anyone help? I'm new to Python, so bear with me.

My data looks like the sample below, but the real data has a price column for every region. I'm trying to create a new column, 'actual price', that picks the price matching each row's region, since every entry carries a price for every region. Is this possible?

import pandas as pd

data = [[1, 'EDF', 'Eastern', 400, 500, 300],
        [2, 'EDF', 'Southern', 200, 100, 300],
        [3, 'NPower', 'Eastern', 600, 500, 700]]

df = pd.DataFrame(data, columns=['ID', 'Supplier', 'Region', 'Av Price',
                                 'Eastern Price', 'Southern Price'])

df
  • What is your expected output for the sample data? Commented Jun 1, 2020 at 14:30
  • Sorry, that would have been helpful. The expected output is:

    ID  Supplier  Region    Av Price  Eastern Price  Southern Price  Price
    1   EDF       Eastern   400       500            300             500
    2   EDF       Southern  200       100            300             300

    The idea is to get rid of all the regional prices and just keep the actual price, if that makes sense. Commented Jun 1, 2020 at 14:33

3 Answers

IIUC, you can use df.lookup here after appending " Price" to the values of the Region column so that they match the names of the regional price columns:

m = df.loc[:,df.columns.str.endswith("Price")]
df['actual_Price'] = m.lookup(df.index,df['Region'].add(" Price"))

print(df)
   ID Supplier    Region  Av Price  Eastern Price  Southern Price  \
0   1      EDF   Eastern       400            500             300   
1   2      EDF  Southern       200            100             300   
2   3   NPower   Eastern       600            500             700   

   actual_Price  
0           500  
1           300  
2           500  
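
Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippet above fails on current versions. A minimal sketch of the same row-wise lookup using pd.factorize and NumPy indexing follows; it assumes every value in Region has a matching "<Region> Price" column, as in the sample data.

```python
import numpy as np
import pandas as pd

data = [
    [1, "EDF", "Eastern", 400, 500, 300],
    [2, "EDF", "Southern", 200, 100, 300],
    [3, "NPower", "Eastern", 600, 500, 700],
]
df = pd.DataFrame(
    data,
    columns=["ID", "Supplier", "Region", "Av Price", "Eastern Price", "Southern Price"],
)

# Map each row's region to its price column name, e.g. "Eastern" -> "Eastern Price",
# and encode those names as integer codes plus the unique labels.
idx, cols = pd.factorize(df["Region"] + " Price")

# Order the price columns to match the codes, then pick one value per row.
prices = df.reindex(cols, axis=1).to_numpy()
df["actual_Price"] = prices[np.arange(len(df)), idx]

print(df[["ID", "Region", "actual_Price"]])
```

If a region had no matching column, reindex would fill that column with NaN instead of raising, so it may be worth checking for missing values afterwards.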

5 Comments

It appears this has worked, but I get a warning: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
@LMR is your dataframe derived from another dataframe, or is it the first/source dataframe?
It was created by taking columns from 2 other dataframes, if that makes sense?
@LMR guess there is some issue there. Use .copy() when referencing the other dataframes, check this
Thank you @anky, that has solved the problem. You have probably just saved me a pile of time trying to work it out.
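
The .copy() advice in the comments can be sketched as below. The asker's real frames are not shown, so the names suppliers, prices and merged are hypothetical and only illustrate the pattern: take an explicit copy when building a working frame from column selections, then assign the new column to that copy.

```python
import pandas as pd

# Hypothetical source frames standing in for the asker's two dataframes.
suppliers = pd.DataFrame(
    {"ID": [1, 2, 3],
     "Supplier": ["EDF", "EDF", "NPower"],
     "Region": ["Eastern", "Southern", "Eastern"]}
)
prices = pd.DataFrame(
    {"ID": [1, 2, 3],
     "Eastern Price": [500, 100, 500],
     "Southern Price": [300, 300, 700]}
)
merged = suppliers.merge(prices, on="ID")

# .copy() makes df an independent frame, so the assignment below
# is not treated as writing to a slice of merged.
df = merged[["ID", "Region", "Eastern Price", "Southern Price"]].copy()

# Keep the Eastern price where the region is Eastern, otherwise use the Southern price.
df["actual_Price"] = df["Eastern Price"].where(
    df["Region"] == "Eastern", df["Southern Price"]
)
```

The source frame merged is left unchanged; only df gains the new column.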
I believe this is what you're looking for:

df["actual_price"] = np.where(df.Region == "Eastern", df["Eastern Price"], df["Southern Price"])

Use np.select:

conditions = [df['Region'].eq(reg) for reg in df['Region'].unique()]
choices = [df[f'{reg} Price'] for reg in df['Region'].unique()]
df['actual_price'] = np.select(conditions, choices)

Result:

# print(df)
   ID Supplier    Region  Av Price  Eastern Price  Southern Price  actual_price
0   1      EDF   Eastern       400            500             300           500
1   2      EDF  Southern       200            100             300           300
2   3   NPower   Eastern       600            500             700           500
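
One detail worth knowing: np.select fills rows that match no condition with 0 unless a default is given. The variant below is a sketch of the same approach with the imports included and default=np.nan, so an unmatched row would show up as missing rather than as a price of zero. It still assumes every region present in the data has a "<Region> Price" column; a missing column would raise a KeyError.

```python
import numpy as np
import pandas as pd

data = [
    [1, "EDF", "Eastern", 400, 500, 300],
    [2, "EDF", "Southern", 200, 100, 300],
    [3, "NPower", "Eastern", 600, 500, 700],
]
df = pd.DataFrame(
    data,
    columns=["ID", "Supplier", "Region", "Av Price", "Eastern Price", "Southern Price"],
)

regions = df["Region"].unique()
# One boolean mask and one price column per region, in the same order.
conditions = [df["Region"].eq(reg).to_numpy() for reg in regions]
choices = [df[f"{reg} Price"].to_numpy() for reg in regions]

# default=np.nan makes the result a float column.
df["actual_price"] = np.select(conditions, choices, default=np.nan)

print(df[["ID", "Region", "actual_price"]])
```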
