Calculations using pandas

Question

I'm trying to learn how to use Python and I've never used pandas before. I wanted to create a simple calculation using Excel data. Here's an example of the Excel data:

Example Data

There are 3 columns, Unique ID, Vehicle and the Hours.

I know how to make a simple calculator in Python where I have to enter these values manually, but is it possible to extract the data from the columns directly to do the calculation?

So ideally, it would pick up the ID itself, then the vehicle type with pay (the pay defined within the code e.g. Bike = 15.00), multiplied by number of hours to give a total of the pay?

ID: 28392

Vehicle: Bike

Hours: 40

Total: $600

Hopefully this makes sense, thanks in advance!

Use the data.loc[] or data.iloc[] methods

– Kevincheong, Commented Oct 29, 2021 at 14:12
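
As a rough sketch of what the comment above suggests, `.loc` can pull a single row out by its ID. The small in-memory frame, the column names (`ID`, `Vehicle`, `Hours`) and the `rates` dictionary are assumptions made for illustration; the real spreadsheet headers may differ.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Excel sheet.
df = pd.DataFrame({
    "ID": [28392, 28393],
    "Vehicle": ["Bike", "Cargo-Bike"],
    "Hours": [40, 10],
})

# Assumed hourly rates per vehicle type.
rates = {"Bike": 15.00, "Cargo-Bike": 20.00}

# .loc with a boolean mask keeps the matching rows; .iloc[0] takes the first one.
row = df.loc[df["ID"] == 28392].iloc[0]
total = row["Hours"] * rates[row["Vehicle"]]

print(f"ID: {row['ID']}")
print(f"Vehicle: {row['Vehicle']}")
print(f"Hours: {row['Hours']}")
print(f"Total: ${total:.0f}")
```

This handles one ID at a time; computing the pay for every row at once is usually done with a new column instead.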

SSS · Accepted Answer · 2021-11-03 13:32:49Z

First, load your dataset into a pandas DataFrame, which you can do with the following commands.

import pandas as pd 
df = pd.read_excel('name_of_file_here.xlsx',sheet_name='Sheet_name_here')

Your Excel data is now a pandas DataFrame called df.

If the pay rate is the same for all vehicles, you can do the following.

rate = 15
df['Pay'] = df['Hours']*rate

This creates a new column in your DataFrame called 'Pay', holding each row's value in the Hours column multiplied by the rate of 15.

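If, however, the rate differs between vehicle types, you can do the following.

bike_rate = 15
is_bike = df['Vehicle'] == 'Bike'
# Multiply only the Hours column; multiplying whole rows would also alter ID and Vehicle
df.loc[is_bike, 'Pay'] = df.loc[is_bike, 'Hours'] * bike_rate
cargo_bike_rate = 20
is_cargo_bike = df['Vehicle'] == 'Cargo-Bike'
df.loc[is_cargo_bike, 'Pay'] = df.loc[is_cargo_bike, 'Hours'] * cargo_bike_rate

This selects the rows where Vehicle equals the type you want and writes Hours multiplied by that type's rate into the 'Pay' column for those rows only. Rows matching none of the masks keep whatever 'Pay' held before (NaN if the column is new).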
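
With many vehicle types, writing one mask per type gets repetitive. A possible alternative, sketched here under assumptions, is to keep the rates in a dictionary and look them up with `Series.map`. The sample rows, the `rates` dictionary and the "Scooter" type are hypothetical; the fallback rate of 10 mirrors the default used in the function version of this answer.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Excel sheet.
df = pd.DataFrame({
    "ID": [28392, 28393, 28394],
    "Vehicle": ["Bike", "Cargo-Bike", "Scooter"],
    "Hours": [40, 10, 5],
})

# Assumed rates; any vehicle type not listed falls back to default_rate.
rates = {"Bike": 15, "Cargo-Bike": 20}
default_rate = 10

# map() looks up each vehicle's rate (NaN when missing); fillna() applies the default.
rate_per_row = df["Vehicle"].map(rates).fillna(default_rate)
df["Pay"] = df["Hours"] * rate_per_row

print(df)
```

Adding a new vehicle type then only means adding one entry to the dictionary.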

Another way, and I think the best way, is to use a function.

def calculate_pay(vehicle,hours):
    if vehicle == 'Bike':
        rate = 15
    elif vehicle == 'Cargo-Bike':
        rate = 20
    #And so on ....
    else:
        rate = 10
    return hours*rate

Then you can apply this function to your dataframe.

df['Pay'] = df.apply(lambda x: calculate_pay(x['Vehicle'],x['Hours']),axis=1)

This creates a new column in your DataFrame called 'Pay' by calling calculate_pay once per row (axis=1), passing in that row's Vehicle and Hours values and storing the returned pay.

To print your results on screen, simply type df and press Enter if you are using a Jupyter notebook. To select the specific columns you mentioned in the comments, you can do the following (adjust the names to match your column headers).

df[['Id','Vehicle','Hours','Pay']]
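
If you want the line-by-line layout from the question rather than a table, one option is to format each row yourself. This is only a sketch: the sample frame and the column names (`ID`, `Vehicle`, `Hours`, `Pay`) are assumptions, and `format_row` is a hypothetical helper. Attribute access on `itertuples` rows needs column names that are valid Python identifiers, so a header such as "Unique ID" would have to be renamed first.

```python
import pandas as pd

# Hypothetical frame that already has the computed Pay column.
df = pd.DataFrame({
    "ID": [28392],
    "Vehicle": ["Bike"],
    "Hours": [40],
    "Pay": [600],
})

def format_row(row):
    """Return one record in the ID / Vehicle / Hours / Total layout."""
    return (
        f"ID: {row.ID}\n"
        f"Vehicle: {row.Vehicle}\n"
        f"Hours: {row.Hours}\n"
        f"Total: ${float(row.Pay):,.2f}"
    )

# itertuples() yields one named tuple per row, with columns as attributes.
reports = [format_row(row) for row in df.itertuples(index=False)]
print("\n\n".join(reports))
```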

To save back to Excel, you can do the following.

df.to_excel('output.xlsx')

edited Nov 3, 2021 at 13:32

answered Oct 29, 2021 at 14:33

SSS


3 Comments

cchev Over a year ago

Hi, thank you for your answer! I decided to use the function as it makes the code seem a lot cleaner and easier to read. When I run the code, it comes up with an unexpected EOF error for line 14, which is empty. How can I resolve this? Also, I'd like to print the results so that they show ID, Vehicle, Hours and then the new column Total Pay. Will I be able to export this back into Excel as well?

cchev Over a year ago

I figured out the EOF error, but now it comes up with a ValueError.

SSS Over a year ago

@cchev Please see the edit I made when creating the new column 'Pay'; using lambda now will stop the ValueError. To print the results, see the other edits I have made towards the end of the answer.
