I am trying to condense some trade reports. The original report looks like this:

AssetClass  Symbol  UnderlyingSymbol    Multiplier  Strike  Expiry  Put/Call    DateTime    Quantity    TradePrice  Commission  Buy/Sell
OPT ADBE  200221C00385000   ADBE    100 385 20200221    C   20200218,114515 1   1.4 2.5 BUY
OPT ADBE  200221C00385000   ADBE    100 385 20200221    C   20200218,114515 2   1.31    4.5 BUY

I would like to aggregate it as follows:

AssetClass  Symbol  UnderlyingSymbol    Multiplier  Strike  Expiry  Put/Call    DateTime    Quantity    TradePrice  Commission  Buy/Sell
OPT ADBE  200221C00385000   ADBE    100 385 20200221    C   20200218,114515 3   1.34    7   BUY

In other words: group by the Symbol and Buy/Sell columns, sum Quantity and Commission, and take the Quantity-weighted average of TradePrice.

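import numpy as np
import pandas as pd

df = pd.read_csv(filename)
# Quantity-weighted average; defined here but not yet used in the aggregation
wm = lambda x: np.average(x, weights=df.loc[x.index, "Quantity"])
f = {'Quantity': 'sum', 'Commission': 'sum'}
df.groupby(['Symbol', 'Buy/Sell']).agg(f)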

I have multiple issues:

  1. The output "forgets" the other columns, and if I add those columns to the groupby, I get blanks here and there.

  2. How can I apply the function wm to the TradePrice column?

  3. For the DateTime column (format is "yyyymmdd,hhmmss"), I would like to keep just the date (which is the same for all rows).

For instance, here is the output when I add the AssetClass column:

                                               Quantity  Commission
AssetClass Symbol                Buy/Sell                                     
OPT        ACN   200221P00212500 SELL            -3      0.003649
           ACN   200320C00215000 BUY              9     -6.694200
           ACN   200320P00215000 BUY              9     -6.694200
           XYZ  200221C00385000 BUY              2     -1.677600
                                 SELL            -4     -1.794891

1 Answer

To remove the time from the DateTime column, use Series.str.split:

df['DateTime'] = df['DateTime'].str.split(',').str[0]
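As a small illustration of the split above, here is a sketch on made-up sample values in the question's "yyyymmdd,hhmmss" format. The column names Date and DateParsed are hypothetical, and the pd.to_datetime step is an optional extra that the question did not ask for:

```python
import pandas as pd

# Sample values in the question's "yyyymmdd,hhmmss" format (illustrative data)
df = pd.DataFrame({"DateTime": ["20200218,114515", "20200218,114515"]})

# Keep only the part before the comma; the result is still a string
df["Date"] = df["DateTime"].str.split(",").str[0]

# Optional: parse the string into a real datetime if date arithmetic is needed
df["DateParsed"] = pd.to_datetime(df["Date"], format="%Y%m%d")

print(df)
```

Keeping the date as a string is enough if it is only used as a grouping key.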

To use the new function, add it to the dictionary like the other functions:

wm = lambda x: np.average(x, weights=df.loc[x.index, "Quantity"])
f = {'Quantity': 'sum', 'Commission': 'sum', 'TradePrice':wm}
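To see what wm does inside the aggregation, here is a self-contained sketch that rebuilds the two sample rows from the question by hand (only the columns needed for the aggregation are included). Within each group, x is the TradePrice Series, and x.index is used to look up the matching Quantity values in the original df:

```python
import numpy as np
import pandas as pd

# The two sample trades from the question, reduced to the relevant columns
df = pd.DataFrame({
    "Symbol": ["ADBE  200221C00385000"] * 2,
    "Buy/Sell": ["BUY", "BUY"],
    "Quantity": [1, 2],
    "TradePrice": [1.4, 1.31],
    "Commission": [2.5, 4.5],
})

# x is one group's TradePrice Series; its index selects the matching Quantity rows
wm = lambda x: np.average(x, weights=df.loc[x.index, "Quantity"])
f = {"Quantity": "sum", "Commission": "sum", "TradePrice": wm}

out = df.groupby(["Symbol", "Buy/Sell"]).agg(f)
# Weighted price for the single group: (1*1.4 + 2*1.31) / (1 + 2) = 4.02 / 3
print(out)
```

Note that wm reads from the df variable in the enclosing scope, so it must be applied to that same frame. np.average also raises ZeroDivisionError if the quantities in a group sum to zero.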

Finally, to avoid losing columns whose values are the same within each group of Symbol and Buy/Sell, add them to the groupby:

cols = ['AssetClass', 'Symbol', 'UnderlyingSymbol', 'Multiplier', 'Strike',
         'Expiry', 'Put/Call', 'DateTime', 'Buy/Sell']
df1 = df.groupby(cols).agg(f).reset_index()
print (df1)
  AssetClass                Symbol UnderlyingSymbol  Multiplier  Strike  \
0        OPT  ADBE 200221C00385000             ADBE         100     385   

     Expiry Put/Call  DateTime Buy/Sell  Quantity  Commission  TradePrice  
0  20200221        C  20200218      BUY         3         7.0        1.34 
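The "blanks" mentioned in the question are most likely just how pandas prints a MultiIndex: repeated labels in the outer levels are hidden in the display, but the values are still there. The sketch below uses invented rows (not the question's data) to show that reset_index() turns the index levels back into ordinary columns with every label written out:

```python
import pandas as pd

# Invented rows: one symbol traded in both directions
df = pd.DataFrame({
    "AssetClass": ["OPT", "OPT", "OPT"],
    "Symbol": ["XYZ", "XYZ", "XYZ"],
    "Buy/Sell": ["BUY", "SELL", "SELL"],
    "Quantity": [2, -1, -3],
})

# The result is indexed by a 3-level MultiIndex; printing it hides repeated labels
grouped = df.groupby(["AssetClass", "Symbol", "Buy/Sell"])["Quantity"].sum()
print(grouped)

# reset_index() moves the index levels into regular columns, one label per row
flat = grouped.reset_index()
print(flat)
```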

If the other columns' values are not the same within each group of Symbol and Buy/Sell, you need to specify an aggregate function for each column and add it to the dictionary, e.g. first for AssetClass and mean for Multiplier:

df['DateTime'] = df['DateTime'].str.split(',').str[0]

wm = lambda x: np.average(x, weights=df.loc[x.index, "Quantity"])

f = {'Quantity': 'sum', 
     'Commission': 'sum', 
     'TradePrice':wm, 
     'AssetClass':'first', 
     'Multiplier':'mean', ....}

df2 = df.groupby(['Symbol', 'Buy/Sell']).agg(f).reset_index()
print (df2)
                 Symbol Buy/Sell  Quantity  Commission  TradePrice AssetClass  \
0  ADBE 200221C00385000      BUY         3         7.0        1.34        OPT   

   Multiplier  
0         100   
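As an alternative to the lambda (this is a different technique, not the approach used in the answer above), the weighted average can be computed without referring back to the outer df: sum Quantity * TradePrice per group and divide by the summed Quantity. The sketch below uses named aggregation and assumes a hypothetical helper column called Notional:

```python
import pandas as pd

# The two sample trades from the question, reduced to a few columns
df = pd.DataFrame({
    "AssetClass": ["OPT", "OPT"],
    "Symbol": ["ADBE  200221C00385000"] * 2,
    "Buy/Sell": ["BUY", "BUY"],
    "Quantity": [1, 2],
    "TradePrice": [1.4, 1.31],
    "Commission": [2.5, 4.5],
})

# Notional is a hypothetical helper column: quantity times price per trade
df = df.assign(Notional=df["Quantity"] * df["TradePrice"])

out = df.groupby(["Symbol", "Buy/Sell"], as_index=False).agg(
    AssetClass=("AssetClass", "first"),
    Quantity=("Quantity", "sum"),
    Commission=("Commission", "sum"),
    Notional=("Notional", "sum"),
)

# Weighted average price = total notional / total quantity
out["TradePrice"] = out["Notional"] / out["Quantity"]
out = out.drop(columns="Notional")
print(out)
```

This version stays fully vectorized, but it still divides by zero (yielding inf or NaN rather than an exception) if a group's quantities sum to zero.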