I have a dataframe like below:
from scipy import optimize
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D'],
    'num_of_visits': [1, 2, 1, 2, 1, 2, 1, 2],
    'gain': [2, 5, 3, 4, 6, 8, 5, 10],
})
customer num_of_visits gain
A 1 2
A 2 5
B 1 3
B 2 4
C 1 6
C 2 8
D 1 5
D 2 10
This is just an example; in reality we have many more customers. For each customer, the sales rep can make either one visit or two visits. The sales gain from one or two visits is given in the gain column.
For example, visiting customer A once gives a sales gain of 2, visiting customer A twice gives a sales gain of 5, and so on.
The goal is to find the optimal set of customers for the sales rep to visit and their corresponding number of visits, to maximize the total gain.
The constraints:
- the total number of visits is 4
- at most one row is chosen per customer (either 1 visit or 2 visits, never both)
This is a much simplified example, so we can see that the answer is: visit B once, visit C once, and visit D twice (total gain 3 + 6 + 10 = 19).
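For a table this small, the expected answer can be checked by brute force. The sketch below is only a sanity check, not a general solution: it assumes a customer may also be skipped (0 visits, gain 0), and the names gains, TOTAL_VISITS and best_plan are made up for illustration.

```python
from itertools import product

# Gains copied from the example table: gains[customer][visits].
# Assumption: skipping a customer (0 visits) is allowed and earns 0.
gains = {
    "A": {0: 0, 1: 2, 2: 5},
    "B": {0: 0, 1: 3, 2: 4},
    "C": {0: 0, 1: 6, 2: 8},
    "D": {0: 0, 1: 5, 2: 10},
}
TOTAL_VISITS = 4

customers = list(gains)
best_gain, best_plan = -1, None

# Try every combination of 0, 1 or 2 visits per customer.
for plan in product((0, 1, 2), repeat=len(customers)):
    if sum(plan) != TOTAL_VISITS:
        continue  # violates the total-visits constraint
    total = sum(gains[c][v] for c, v in zip(customers, plan))
    if total > best_gain:
        best_gain, best_plan = total, dict(zip(customers, plan))

print(best_plan, best_gain)
```

This enumerates 3**n combinations, so it is only practical for a handful of customers.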
To find a general solution, we think this should be treated as an optimization problem, and that Python's scipy.optimize should be able to solve it.
I'm new to optimization. We need to maximize the sum of the gain column over the chosen rows. Unlike optimizing a function of continuous variables, here I don't see how to write the objective function. How should I implement the constraint that ensures one instance per customer?
I have been thinking about this for hours and still do not have a clue.
I would appreciate any help with how to approach this optimization problem.
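One possible way to set this up, as a sketch rather than a definitive answer: give each row of the dataframe a binary variable (1 = pick that row), maximize the dot product of those variables with the gain column, and express both constraints as linear constraints. This assumes scipy.optimize.milp is available (SciPy 1.9 or later) and that "one instance per customer" means at most one row per customer. The variable names (visit_budget, one_per_customer, chosen) are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.optimize import Bounds, LinearConstraint, milp

df = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D'],
    'num_of_visits': [1, 2, 1, 2, 1, 2, 1, 2],
    'gain': [2, 5, 3, 4, 6, 8, 5, 10],
})
total_visits = 4
n = len(df)

# Objective: one binary variable per row. milp minimizes, so negate the gain.
c = -df['gain'].to_numpy(dtype=float)

# Constraint 1: visits of the picked rows add up to exactly total_visits.
visits_row = df['num_of_visits'].to_numpy(dtype=float).reshape(1, -1)
visit_budget = LinearConstraint(visits_row, lb=total_visits, ub=total_visits)

# Constraint 2: at most one picked row per customer.
# membership[i, j] is 1 when row j belongs to customer i.
membership = pd.get_dummies(df['customer']).to_numpy(dtype=float).T
one_per_customer = LinearConstraint(membership, lb=0, ub=1)

res = milp(
    c,
    integrality=np.ones(n),                      # every variable is an integer
    bounds=Bounds(lb=np.zeros(n), ub=np.ones(n)),  # ... restricted to 0 or 1
    constraints=[visit_budget, one_per_customer],
)

chosen = df[res.x.round().astype(bool)]
print(chosen)
print('total gain:', chosen['gain'].sum())
```

On the example data this should pick the rows for B once, C once and D twice, matching the answer found by hand. If every customer must be visited, changing lb=0 to lb=1 in one_per_customer would enforce exactly one row per customer.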