This is my current function:
def partnerTransaction(main_df, ptn_code, intent, retail_unique):
    if intent == 'Frequency':
        return main_df.query('csp_code == @retail_unique & partner_code == @ptn_code')['tx_amount'].count()
    elif intent == 'Total_value':
        return main_df.query('csp_code == @retail_unique & partner_code == @ptn_code')['tx_amount'].sum()
It accepts a pandas DataFrame (DF 1) and three search parameters. retail_unique is a string taken from another DataFrame (DF 2). Currently, I iterate over the rows of DF 2 using itertuples, call around 200 functions like this one (the above is just an example), and write the results to a third DataFrame. DF 2 has around 16,000 rows, so this is very slow. What I want to do is vectorize this function so that it returns a pandas Series holding the count of tx_amount per retail_unique. The series would look like this:
34 # retail a
54 # retail b
23 # retail c
I would then map this series to the 3rd DF.
Any ideas on how I might approach this?
EDIT: The first DF contains time-based data, with each retailer appearing multiple times in one column and the tx_amount in another column, like so:
Retail tx_amount
retail_a 50
retail_b 100
retail_a 70
retail_c 20
retail_a 10
The second DF is arranged per retailer:
Retail
retail_a
retail_b
retail_c
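To make the desired output concrete, here is a rough sketch on the sample data above. The column names Retail and tx_amount come from the example; df1, df2, df3, frequency and total_value are placeholder names I made up, and groupby is only one possible way to get there. The partner_code filter from my function is left out because the sample data has no such column; presumably DF 1 would be filtered on it before grouping.

```python
import pandas as pd

# Sample data mirroring the two frames described above.
df1 = pd.DataFrame({
    "Retail": ["retail_a", "retail_b", "retail_a", "retail_c", "retail_a"],
    "tx_amount": [50, 100, 70, 20, 10],
})
df2 = pd.DataFrame({"Retail": ["retail_a", "retail_b", "retail_c"]})

# One pass over df1: count and sum of tx_amount per retailer.
grouped = df1.groupby("Retail")["tx_amount"]
frequency = grouped.count()    # Series indexed by retailer
total_value = grouped.sum()    # Series indexed by retailer

# Placeholder third frame, filled by aligning on the retailer name.
df3 = df2.copy()
df3["Frequency"] = df3["Retail"].map(frequency)
df3["Total_value"] = df3["Retail"].map(total_value)
print(df3)
```

One difference from the loop version: a retailer in DF 2 with no rows in DF 1 would come out of map as NaN here, whereas count() and sum() on an empty selection return 0, so a fillna(0) might be needed afterwards.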