I'm trying to find the best way to generate an 'Other' row for each Name in my pandas DataFrame. 'Other' is calculated by adding up all the Source values that are not 'Total' and then subtracting that sum from the 'Total' value.
Ex: 'Other' = Total - (Source_1 + Source_2 + Source_3)
Here's an example of what I am starting with:
| Name | Source | Lead | Sale |
|---|---|---|---|
| Prop_A | Source_1 | 100 | 3 |
| Prop_A | Source_2 | 50 | 5 |
| Prop_A | Source_3 | 20 | 0 |
| Prop_A | Total | 300 | 11 |
| Prop_B | Source_1 | 200 | 10 |
| Prop_B | Source_2 | 300 | 6 |
| Prop_B | Source_3 | 20 | 0 |
| Prop_B | Total | 700 | 23 |
And this is what I am trying to create:
| Name | Source | Lead | Sale |
|---|---|---|---|
| Prop_A | Source_1 | 100 | 3 |
| Prop_A | Source_2 | 50 | 5 |
| Prop_A | Source_3 | 20 | 0 |
| Prop_A | Other | 130 | 3 |
| Prop_A | Total | 300 | 11 |
| Prop_B | Source_1 | 200 | 10 |
| Prop_B | Source_2 | 300 | 6 |
| Prop_B | Source_3 | 20 | 0 |
| Prop_B | Other | 180 | 7 |
| Prop_B | Total | 700 | 23 |
I was able to calculate the 'Other' rows using the following code, but I know this isn't the best way to do it. Does anyone know a better way?
Total_df = df[df['Source'] == 'Total']
All_Sources_df = df[df['Source'] != 'Total']
All_Sources_df = All_Sources_df.groupby(['Name'], as_index=False).sum()
result = pd.merge(Total_df, All_Sources_df, on=['Name'])
result['Lead'] = result['Lead_x'] - result['Lead_y']
result['Sale'] = result['Sale_x'] - result['Sale_y']
result = result[['Name', 'Lead', 'Sale']]
result['Source'] = 'Other'
result = result[['Name','Source','Lead','Sale']]
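
For comparison, here is a minimal, self-contained sketch of one possible shorter route: subtract the per-Name sum of the non-'Total' rows from the 'Total' row using index alignment, then slot each 'Other' row in just before its 'Total' row. It rebuilds the example data from the tables above. The names `value_cols`, `is_total`, `position`, and the temporary `_pos` column are made up for illustration, and it assumes exactly one 'Total' row per Name and at least one other Source row per Name.

```python
import numpy as np
import pandas as pd

# Example data copied from the first table in the question.
df = pd.DataFrame({
    "Name": ["Prop_A"] * 4 + ["Prop_B"] * 4,
    "Source": ["Source_1", "Source_2", "Source_3", "Total"] * 2,
    "Lead": [100, 50, 20, 300, 200, 300, 20, 700],
    "Sale": [3, 5, 0, 11, 10, 6, 0, 23],
})

value_cols = ["Lead", "Sale"]          # hypothetical helper name
is_total = df["Source"] == "Total"

# One row per Name: the stated totals, and the sum of the named sources.
totals = df[is_total].set_index("Name")[value_cols]
named = df[~is_total].groupby("Name")[value_cols].sum()

# Align the sums to the order of the Total rows, then subtract:
# Other = Total - (sum of named sources).
other = (totals - named.reindex(totals.index)).reset_index()
other["Source"] = "Other"

# Temporary sort key: original row position, with each Other row
# placed half a step before the Total row of the same Name.
position = np.arange(len(df))
other["_pos"] = position[is_total.to_numpy()] - 0.5

out = (
    pd.concat([df.assign(_pos=position), other], ignore_index=True)
      .sort_values("_pos")
      .drop(columns="_pos")
      .reset_index(drop=True)
)
print(out)
```

Unlike the merge version, this avoids the `_x`/`_y` suffixed columns and also puts the new rows back into the original frame in the order shown in the desired output.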