I'm trying to calculate the Bonus Pay given to our Courier's based on their Success Rate which is based off the number of Eligible orders they've successfully delivered.
Here's my code:
from openpyxl import load_workbook
import pandas as pd
df = pd.read_excel(r'path\filename.xlsx')
df['DeliveredAt'] = pd.to_datetime(df['DeliveredAt'].astype(str))
df['Date'] = df['DeliveredAt'].dt.strftime('%d/%m/%y')
df['Total Orders'] = df['OrderNumber']
df['Total Orders'] = 1
df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])
df['Success Rate'] = (df['Eligible']/df['Total Orders'])*100
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
per_day = df.groupby(['Date', 'Courier']).agg({'Success Rate': 'mean', 'Total Orders': 'count', 'Eligible': 'sum', 'Incentive': 'sum'})
per_courier = df.groupby('Courier').agg({'Total Orders': 'count', 'Incentive': 'sum'})
I'm having issues with
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
It seems to ignore the condition that Success Rate has to be over 95 for it to qualify for a Bonus. Right now, each row has a Bonus pay which is calculated from the number of Eligible * 1.2 regardless of the success rate value.
Sample Data:
data = {'ID': [1, 1, 1, 2, 2, 3, 4, 5, 5],
'DeliveryOnTime': ["On-time", "Late", "Early", "On-time", "On-time", "Late", "Early", "Early", "Late"],
}
df = pd.DataFrame(data)
#For the sake of example data, the count of `DeliveryOnTime` will be the total number of orders.
df['Total Orders'] = df['DeliveryOnTime'].count()
df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])
df['Success Rate'] = (df['Eligible']/df['Total Orders'])*100
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
Current Output
ID Eligible Total Orders Success Rate Bonus
1 2 3 66.66 2.4
2 2 2 100 2.4
3 0 1 0 0
4 1 1 100 1.2
5 1 2 50 1.2
This is wrong as ID 1 and 5 should not have any Bonus at all with the Success Rate being lower than 95.
Expected Output
ID Eligible Total Orders Success Rate Bonus
1 2 3 66.66 0
2 2 2 100 2.4
3 0 1 0 0
4 1 1 100 1.2
5 1 2 50 0
I'm not sure if it's something I have to change in the Per_day or Per_courier section of the .groupby() as that seems to be where most of my issues have occurred when writing this code.
Thanks