Multi-column calculation in pandas

Question

I've got this long algebra formula that I need to apply to a dataframe:

def experience_mod(A, B, C, D, T, W):
    E = (T-A)
    F = (C-D)

    xmod = (A + B + (E*W) + ((1-W)*F))/(D + B + (F*W) + ((1-W)*F))

    return xmod

A = loss['actual_primary_losses']
B = loss['ballast']
C = loss['ExpectedLosses']
D = loss['ExpectedPrimaryLosses']
T = loss['ActualIncurred']
W = loss['weight']

How would I write this to calculate the experience_mod() for every row?

something like this?

loss['ExperienceRating'] = loss.apply(experience_mod(A,B,C,D,T,W) axis = 0)

Pandas supports vectorized operations, so if you have a dataframe A and a dataframe B, then A - B, A + B, etc. are valid operations.
– Mohamed Ali JAMAOUI, Commented Nov 21, 2018 at 17:01

Mohamed Ali JAMAOUI · Accepted Answer · 2018-11-21 17:14:03Z

Pandas, and NumPy, the library it is built on, support vectorized operations: given two dataframes A and B, expressions like A + B and A - B are valid and are evaluated element by element.
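To make the element-wise behaviour concrete, here is a minimal sketch with two small dataframes. The names `a` and `b` and their values are made up for illustration and are not from the question.

```python
import pandas as pd

# Two illustrative dataframes with matching columns and index (made-up values).
a = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
b = pd.DataFrame({"x": [10.0, 20.0], "y": [30.0, 40.0]})

# Arithmetic operators work cell by cell, aligned on column labels and index.
total = a + b   # each cell is a[i, j] + b[i, j]
diff = a - b    # each cell is a[i, j] - b[i, j]

print(total)
print(diff)
```

The same holds for single columns (Series), which is why a formula written with ordinary arithmetic operators can be called on whole columns at once.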

Your function works as written, and you don't need apply for this. Call it on the columns directly and assign the result to the new column ExperienceRating.

Here's a working example:

In [1]: import pandas as pd 

In [2]: import numpy as np 

In [3]: df = pd.DataFrame(np.random.randn(6,6), columns=list('ABCDTW'))

In [4]: df
Out[4]: 
          A         B         C         D         T         W
0  0.049617  0.082861  2.289549 -0.783082 -0.691990 -0.071152
1  0.722605  0.209683 -0.347372  0.254951  0.468615 -0.132794
2 -0.301469 -1.849026 -0.334381 -0.365116 -0.238384 -1.999025
3 -0.554925 -0.859044 -0.637079 -1.040336  0.627027 -0.955889
4 -2.024621 -0.539384  0.006734  0.117628 -0.215070 -0.661466
5  1.942926 -0.433067 -1.034814 -0.292179  0.744039  0.233953

In [5]: def experience_mod(A, B, C, D, T, W):
   ...:     E = (T-A)
   ...:     F = (C-D)
   ...: 
   ...:     xmod = (A + B + (E*W) + ((1-W)*F))/(D + B + (F*W) + ((1-W)*F))
   ...: 
   ...:     return xmod
   ...: 

In [6]: experience_mod(df["A"], df["B"], df["C"], df["D"], df["T"], df["W"])
Out[6]: 
0    1.465387
1   -2.060483
2    1.000469
3    1.173070
4    7.406756
5   -0.449957
dtype: float64

In [7]: df['ExperienceRating'] = experience_mod(df["A"], df["B"], df["C"], df["D"], df["T"], df["W"])

In [8]: df
Out[8]: 
          A         B         C         D         T         W  ExperienceRating
0  0.049617  0.082861  2.289549 -0.783082 -0.691990 -0.071152          1.465387
1  0.722605  0.209683 -0.347372  0.254951  0.468615 -0.132794         -2.060483
2 -0.301469 -1.849026 -0.334381 -0.365116 -0.238384 -1.999025          1.000469
3 -0.554925 -0.859044 -0.637079 -1.040336  0.627027 -0.955889          1.173070
4 -2.024621 -0.539384  0.006734  0.117628 -0.215070 -0.661466          7.406756
5  1.942926 -0.433067 -1.034814 -0.292179  0.744039  0.233953         -0.449957
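Applied to the column names from the question, the same idea might look like the sketch below. The `loss` dataframe here is a hypothetical stand-in with made-up values, since the question does not show the real data. For comparison, the sketch also includes a row-wise `apply`. Note that `apply` expects a function as its first argument (not the result of calling one), and that `axis=1` passes each row to it, whereas `axis=0` passes each column.

```python
import pandas as pd

def experience_mod(A, B, C, D, T, W):
    # Same formula as in the question, unchanged.
    E = T - A
    F = C - D
    return (A + B + (E * W) + ((1 - W) * F)) / (D + B + (F * W) + ((1 - W) * F))

# Hypothetical stand-in for the question's `loss` dataframe; values are made up.
loss = pd.DataFrame({
    "actual_primary_losses": [1000.0, 2500.0],
    "ballast": [5000.0, 5000.0],
    "ExpectedLosses": [4000.0, 6000.0],
    "ExpectedPrimaryLosses": [1500.0, 2000.0],
    "ActualIncurred": [3000.0, 7000.0],
    "weight": [0.2, 0.3],
})

# Vectorized: pass whole columns; the result is one value per row.
loss["ExperienceRating"] = experience_mod(
    loss["actual_primary_losses"], loss["ballast"], loss["ExpectedLosses"],
    loss["ExpectedPrimaryLosses"], loss["ActualIncurred"], loss["weight"],
)

# Row-wise alternative: apply calls the lambda once per row (axis=1).
row_wise = loss.apply(
    lambda r: experience_mod(
        r["actual_primary_losses"], r["ballast"], r["ExpectedLosses"],
        r["ExpectedPrimaryLosses"], r["ActualIncurred"], r["weight"],
    ),
    axis=1,
)

print(loss["ExperienceRating"])
print(row_wise)
```

Both approaches should give the same numbers. The vectorized call is usually the better choice because it avoids a Python-level function call for every row.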
