I am trying to condense some trade reports. The original report looks like this:
AssetClass Symbol UnderlyingSymbol Multiplier Strike Expiry Put/Call DateTime Quantity TradePrice Commission Buy/Sell
OPT ADBE 200221C00385000 ADBE 100 385 20200221 C 20200218,114515 1 1.4 2.5 BUY
OPT ADBE 200221C00385000 ADBE 100 385 20200221 C 20200218,114515 2 1.31 4.5 BUY
I would like to aggregate it as follows:
AssetClass Symbol UnderlyingSymbol Multiplier Strike Expiry Put/Call DateTime Quantity TradePrice Commission Buy/Sell
OPT ADBE 200221C00385000 ADBE 100 385 20200221 C 20200218,114515 3 1.34 7 BUY
So I need a groupby on the columns Symbol and Buy/Sell, with a sum applied to Quantity and Commission and a Quantity-weighted average applied to TradePrice. For the two rows above, that is (1 * 1.4 + 2 * 1.31) / 3 = 1.34.
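import numpy as np
import pandas as pd
df = pd.read_csv(filename)
# Quantity-weighted average; defined here but not yet used in the aggregation
wm = lambda x: np.average(x, weights=df.loc[x.index, "Quantity"])
f = {'Quantity': 'sum', 'Commission': 'sum'}
df.groupby(['Symbol', 'Buy/Sell']).agg(f)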
I have multiple issues:
The output "forgets" the other columns, and if I add these columns to the groupby, I get some blanks here and there.
How can I apply the function wm to the TradePrice column?
For the DateTime column (the format is "yyyymmdd,hhmmss"), I would like to keep just the date (which is the same for all rows).
Here is the output when I add the AssetClass column to the groupby, for instance:
                                          Quantity  Commission
AssetClass Symbol              Buy/Sell
OPT        ACN 200221P00212500 SELL             -3    0.003649
           ACN 200320C00215000 BUY               9   -6.694200
           ACN 200320P00215000 BUY               9   -6.694200
           XYZ 200221C00385000 BUY               2   -1.677600
                               SELL             -4   -1.794891
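For reference, here is a minimal sketch of the aggregation described above. It is one possible approach, not a definitive one. It assumes the remaining columns are constant within each (Symbol, Buy/Sell) group, so they can be carried over with "first". The sample frame is rebuilt by hand from the two ADBE rows, and the names weighted_price, agg_spec and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the report above; a real run would use pd.read_csv.
df = pd.DataFrame({
    "AssetClass": ["OPT", "OPT"],
    "Symbol": ["ADBE 200221C00385000", "ADBE 200221C00385000"],
    "UnderlyingSymbol": ["ADBE", "ADBE"],
    "Multiplier": [100, 100],
    "Strike": [385, 385],
    "Expiry": [20200221, 20200221],
    "Put/Call": ["C", "C"],
    "DateTime": ["20200218,114515", "20200218,114515"],
    "Quantity": [1, 2],
    "TradePrice": [1.4, 1.31],
    "Commission": [2.5, 4.5],
    "Buy/Sell": ["BUY", "BUY"],
})

# Keep only the date part of "yyyymmdd,hhmmss".
df["DateTime"] = df["DateTime"].str.split(",").str[0].str.strip()


def weighted_price(prices):
    """Average the prices of one group, weighted by the matching Quantity rows."""
    return np.average(prices, weights=df.loc[prices.index, "Quantity"])


# Assumption: the descriptive columns do not vary inside a group,
# so taking the first value loses nothing.
agg_spec = {
    "AssetClass": "first",
    "UnderlyingSymbol": "first",
    "Multiplier": "first",
    "Strike": "first",
    "Expiry": "first",
    "Put/Call": "first",
    "DateTime": "first",
    "Quantity": "sum",
    "TradePrice": weighted_price,
    "Commission": "sum",
}

# as_index=False keeps the group keys as ordinary columns;
# the final selection restores the original column order.
result = df.groupby(["Symbol", "Buy/Sell"], as_index=False).agg(agg_spec)
result = result[df.columns.tolist()]
print(result)
```

The blanks in the grouped output are most likely just how pandas displays repeated MultiIndex labels rather than missing data. Keeping the keys as regular columns with as_index=False (or calling reset_index() afterwards) repeats the values on every row. Note that np.average raises an error if the weights in a group sum to zero, which would only happen here if a group contained only zero quantities or mixed positive and negative ones.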