As the title says, I would like to apply custom functions to a DataFrame at run time. The custom functions would calculate the difference between two dates (i.e. age), convert years to months, find the max or min of two columns, etc.

So far, I have managed to perform arithmetic operations and call a few functions such as abs() and sqrt(), but I couldn't get min() or max() working. These work:

df.eval('TT = sqrt(Q1)',inplace=True)
df.eval('TT1 = abs(Q1-Q2)',inplace=True)
df.eval('TT2 = (Q1+Q2)*Q3',inplace=True)

The following code works with Python's built-in eval. How can I do the same with DataFrame.eval?

import numpy as np

def find_max(x, y):
    return np.maximum(x, y)

eval('find_max')(4, 7)

def find_age(date_col1,date_col2):
    return 'I know how to calc age but how to call func this with df.eval and assign to new col'

Sample dataframe:

import pandas as pd

op_d = {'ID': [1, 2,3],'V':['F','G','H'],'AAA':[0,1,1],'D':['2019/12/04','2019/02/01','2019/01/01'],'DD':['2019-12-01','2016-05-31','2015-02-15'],'CurrentRate':[7.5,2,2],'NoteRate':[2,3,3],'BBB':[0,4,4],'Q1':[2,8,10],'Q2':[3,5,7],'Q3':[5,6,8]}
df = pd.DataFrame(data=op_d)

Any help or a link to the docs is appreciated.

Helpful links I found that don't address my issue:

Dynamic Expression Evaluation in pandas using pd.eval()

Using local variables with multiple assignments with pandas eval function

Passing arguments to python eval()

1 Answer

Functions can be called as usual; you just need to reference them with the @ symbol:

df                                                                  
   A  B
0  1  0
1  0  0
2  0  1

def my_func(x, y): return x + y                                     

df.eval('@my_func(A, B)')                                          
0    1
1    0
2    1
dtype: int64
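
Applied to the sample data from the question, the same pattern might look like the sketch below. The body of find_age is only an assumption about what "age" means here (whole days between the two dates), and the column names MAXQ and AGE_DAYS are made up for illustration. engine='python' is passed explicitly as a precaution, since a comment on this answer reports a TypeError with the default engine.

```python
import numpy as np
import pandas as pd

# Subset of the sample frame from the question.
df = pd.DataFrame({
    "D": ["2019/12/04", "2019/02/01", "2019/01/01"],
    "DD": ["2019-12-01", "2016-05-31", "2015-02-15"],
    "Q1": [2, 8, 10],
    "Q2": [3, 5, 7],
})

def find_max(x, y):
    # Element-wise maximum of two Series.
    return np.maximum(x, y)

def find_age(date_col1, date_col2):
    # Assumed definition of "age": whole days between the two dates.
    return (pd.to_datetime(date_col1) - pd.to_datetime(date_col2)).dt.days

# Without inplace=True, eval returns a new frame with the added column.
out = df.eval("MAXQ = @find_max(Q1, Q2)", engine="python")
out = out.eval("AGE_DAYS = @find_age(D, DD)", engine="python")
print(out[["MAXQ", "AGE_DAYS"]])
```

Because inplace is left at its default, df itself is not modified; pass inplace=True if you want the columns added to the original frame, as in the question.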

Of course, this assumes your functions accept Series as arguments. Otherwise, wrap your function in a call to np.vectorize, as appropriate.
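
As a rough illustration of the np.vectorize route, here is a minimal sketch. scalar_max is a hypothetical helper that only works on scalars (its `if` would fail on a whole Series), and QMAX is a made-up column name. engine='python' is an assumption made for safety rather than a requirement I have verified on every pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Q1": [2, 8, 10], "Q2": [3, 5, 7]})

def scalar_max(x, y):
    # Hypothetical scalar-only helper: comparing two whole Series in an
    # `if` would raise, so it cannot be passed to eval directly.
    return x if x > y else y

# np.vectorize loops over the elements and calls scalar_max on each pair.
vec_max = np.vectorize(scalar_max)

# The vectorized wrapper is referenced with @ like any other local name.
out = df.eval("QMAX = @vec_max(Q1, Q2)", engine="python")
print(out)
```

Keep in mind that np.vectorize is a convenience wrapper around a Python-level loop, so a natively vectorized function such as np.maximum is usually the better choice when one exists.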

6 Comments

If I have a function such as def max1(x,y): return np.maximum(x,y) and call df.eval('@max1(TT,TT1)'), I get TypeError: 'Series' objects are mutable, thus they cannot be hashed. Do you know why that is the case?
I guess the default engine, 'numexpr', can't handle it. I resolved it using engine='python'.
@Prish that seems right to me. Numexpr is generally faster but there are limitations to its use.
Do you know how to call the round, ceil, and floor functions from eval, e.g. df.eval('round(A-B)')? abs() and sqrt() work fine.
@Prish If you just need to round the result, doesn't df.eval('A-B').round() work?
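
To make the rounding suggestion from the comments concrete, here is a small sketch. The frame and its values are invented for illustration. It evaluates the expression first and then applies the rounding to the resulting Series, using np.ceil and np.floor as one possible way to cover the other two cases.

```python
import numpy as np
import pandas as pd

# Made-up data, chosen so the three results differ visibly.
df = pd.DataFrame({"A": [1.2, 2.7], "B": [0.5, 0.1]})

# eval returns a float Series for an expression without an assignment.
diff = df.eval("A - B")

rounded = diff.round()     # round to the nearest integer value
ceiled = np.ceil(diff)     # round up
floored = np.floor(diff)   # round down

print(rounded.tolist(), ceiled.tolist(), floored.tolist())
```

The results can then be assigned back as ordinary columns, e.g. df['R'] = rounded.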