I'm trying to learn Python and I've never used pandas before. I want to do a simple calculation using Excel data. Here's an example of the data:

Example Data

There are 3 columns, Unique ID, Vehicle and the Hours.

I know how to make a simple calculator in Python where I have to enter these values manually, but is it possible to read the data from the columns directly and do the calculation?

Ideally, it would pick up the ID, then look up the pay rate for the vehicle type (with the rate defined in the code, e.g. Bike = 15.00) and multiply it by the number of hours to give the total pay:

ID: 28392

Vehicle: Bike

Hours: 40

Total: $600

Hopefully this makes sense, thanks in advance!

  • Use the data.loc[] or data.iloc[] methods. (Commented Oct 29, 2021 at 14:12)

1 Answer

First, load your dataset into a pandas DataFrame, which you can do with the following commands.

import pandas as pd 
df = pd.read_excel('name_of_file_here.xlsx',sheet_name='Sheet_name_here')

Your Excel data is now a pandas DataFrame called df.

If the pay rate is the same for all vehicles, you can do the following.

rate = 15
df['Pay'] = df['Hours']*rate

This creates a new column in your DataFrame called 'Pay' by multiplying each row of the Hours column by the rate, which is 15.

If, however, the rate differs between vehicle types, you can do the following.

bike_rate = 15
df.loc[df['Vehicle']=='Bike', 'Pay'] = df['Hours']*bike_rate
cargo_bike_rate = 20
df.loc[df['Vehicle']=='Cargo-Bike', 'Pay'] = df['Hours']*cargo_bike_rate

This selects the rows where Vehicle equals the type you want and writes Hours multiplied by that type's rate into the 'Pay' column for those rows only. Multiplying the whole selected rows instead would also multiply the ID and Vehicle columns, which is not what you want.
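As a self-contained sketch of the row-selection approach, the following builds a small in-memory DataFrame in place of the Excel file and writes the pay into a 'Pay' column through df.loc with a boolean mask. Only the first row (28392, Bike, 40) comes from the question; the other IDs and hours, and the column name 'Id', are assumptions made up for illustration.

```python
import pandas as pd

# Stand-in for the Excel sheet; only the first row is from the question,
# the rest is hypothetical sample data.
df = pd.DataFrame({
    "Id": [28392, 28393, 28394],
    "Vehicle": ["Bike", "Cargo-Bike", "Bike"],
    "Hours": [40, 10, 8],
})

bike_rate = 15
cargo_bike_rate = 20

# Boolean masks: True for the rows of each vehicle type
is_bike = df["Vehicle"] == "Bike"
is_cargo = df["Vehicle"] == "Cargo-Bike"

# Write Hours * rate into 'Pay' for the matching rows only
df.loc[is_bike, "Pay"] = df.loc[is_bike, "Hours"] * bike_rate
df.loc[is_cargo, "Pay"] = df.loc[is_cargo, "Hours"] * cargo_bike_rate

print(df)
```

Rows whose vehicle type matches none of the masks would be left as NaN in 'Pay', so every type needs its own line with this approach.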

Another way, and I think the best one, is to use a function.

def calculate_pay(vehicle, hours):
    if vehicle == 'Bike':
        rate = 15
    elif vehicle == 'Cargo-Bike':
        rate = 20
    # And so on ....
    else:
        rate = 10
    return hours*rate

Then you can apply this function to your dataframe.

df['Pay'] = df.apply(lambda x: calculate_pay(x['Vehicle'],x['Hours']),axis=1)

This creates a new column in your DataFrame called 'Pay' by applying calculate_pay to each row: axis=1 passes rows (rather than columns) to the lambda, which hands that row's Vehicle and Hours values to the function and stores the returned pay.
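If the list of vehicle types grows, a lookup table may be easier to maintain than a long if/elif chain. The sketch below is an alternative, not the method used above: it maps each vehicle to its rate with Series.map and falls back to a default rate for unknown types. The names rates and default_rate, the 'Scooter' row, and all sample values except the first row are hypothetical.

```python
import pandas as pd

# Stand-in for the Excel sheet; only the first row is from the question.
df = pd.DataFrame({
    "Id": [28392, 28393, 28394],
    "Vehicle": ["Bike", "Cargo-Bike", "Scooter"],
    "Hours": [40, 10, 5],
})

# Pay rate per vehicle type; types not listed here get default_rate
rates = {"Bike": 15, "Cargo-Bike": 20}
default_rate = 10

# map() looks up each vehicle in the dict (NaN when missing),
# fillna() supplies the default, then multiply by the hours
df["Pay"] = df["Vehicle"].map(rates).fillna(default_rate) * df["Hours"]

print(df)
```

Because the lookup can produce missing values before fillna is applied, the resulting 'Pay' column holds floats rather than integers.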

To show your results on screen in a Jupyter notebook, type df in a cell and run it. To show only specific columns, such as the ones you mentioned in the comments, select them by name (adjust the names to match the headers in your sheet).

df[['Id','Vehicle','Hours','Pay']]
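To print each row in the layout shown in the question (ID, Vehicle, Hours, Total), one option is to loop over the rows and format them as text. This is only a sketch: the helper format_row is hypothetical, the DataFrame is built in memory instead of being read from Excel, and the column names 'Id' and 'Pay' are assumed to match the ones used above.

```python
import pandas as pd

# Stand-in for a DataFrame that already has the 'Pay' column
df = pd.DataFrame({
    "Id": [28392],
    "Vehicle": ["Bike"],
    "Hours": [40],
    "Pay": [600],
})

def format_row(record):
    """Format one row (a dict of column name -> value) as a text block."""
    return (
        f"ID: {int(record['Id'])}\n"
        f"Vehicle: {record['Vehicle']}\n"
        f"Hours: {int(record['Hours'])}\n"
        f"Total: ${float(record['Pay']):.2f}"
    )

# to_dict("records") yields one dict per row
reports = [format_row(record) for record in df.to_dict("records")]
print("\n\n".join(reports))
```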

To save back to Excel, you can do the following.

df.to_excel('output.xlsx')

3 Comments

Hi, thank you for your answer! I decided to use the function as it makes the code seem a lot cleaner and easier to read. When I run the code, it comes up with an unexpected EOF error for line 14, which is empty. How can I resolve this? Also, I'd like to print the results out so that it shows ID, Vehicle, Hours and then the new column Total Pay. Will I be able to export this back into Excel as well?
I figured out the EOF error, but now it comes up with a ValueError.
@cchev Please see the edit I made to the line that creates the new column 'Pay'; using a lambda there will stop the ValueError. To print the results, see the other edits I have made towards the end of the answer.