How to use a lambda function with transform in a pandas DataFrame

Question

I use the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'class': 'a a aa aa b b'.split(),
                   'item': [5, 5, 7, 7, 7, 6],
                   'last_PO_code': ['103', '103', '103', '104', '103', '104'],
                   'qty': [5, 4, 7, 6, 7, 6]
                   })

I need to apply the following rules to this dataframe for each (class, item) group.

True if all last_PO_code values are equal to 103
True if last_PO_code contains both 103 and 104 and the sum of qty for 103 > the sum of qty for 104
True if last_PO_code contains 103, 104, 105 and 106, the sum of qty for 104 == the sum of qty for 103, and the sum of qty for 105 == the sum of qty for 106

I have written lambda functions, but I can't use them with transform:

regle1 = lambda x: True if x['last_PO_code'].all() == "103" else False
regle2 = lambda x: x.loc[x['last_PO_code'].eq('103'), 'qty'].sum() \
                   > x.loc[x['last_PO_code'].eq('104'), 'qty'].sum()
regle3 = lambda x: x.loc[x['last_PO_code'].eq('105'), 'qty'].sum() \
                   == x.loc[x['last_PO_code'].eq('106'), 'qty'].sum()

df['regle1'] = df['class'].map(df.groupby(['class','item']).apply(regle1))
df['regle2'] = df['class'].map(df.groupby(['class','item']).apply(regle2))
df['regle3'] = df['class'].map(df.groupby(['class','item']).apply(regle3))
mask1 = df['regle2'] == True 
mask2 = df['regle3'] == True 
mask = mask1 & mask2
df['regle3'] = np.where(mask,True,False)

I would like to turn these into functions like the following, so that I can use transform instead of apply.

I succeeded with rule 1, but I can't manage the other rules:

def regle1(x):
    return (x == '103').all()


df['regle1'] = df.groupby(['class', 'item']).last_PO_code.transform(regle1)

Daniel Wlazło · Accepted Answer · 2022-01-06 12:11:35Z

Do you mean something like this?

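# Rule 1: every last_PO_code in the group is '103'
regle1 = lambda x: True if x['last_PO_code'].eq('103').all() else False
# Rule 2: both '103' and '104' occur, and the qty sum for '103' exceeds the qty sum for '104'
regle2 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x.loc[x['last_PO_code'].eq('103'), 'qty'].sum() > x.loc[x['last_PO_code'].eq('104'), 'qty'].sum() \
    else False
# Rule 3: all four codes occur, and the qty sums match for 103/104 and for 105/106
regle3 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('105').any() \
    and x['last_PO_code'].eq('106').any() \
    and x.loc[x['last_PO_code'].eq('103'), 'qty'].sum() == x.loc[x['last_PO_code'].eq('104'), 'qty'].sum() \
    and x.loc[x['last_PO_code'].eq('105'), 'qty'].sum() == x.loc[x['last_PO_code'].eq('106'), 'qty'].sum() \
    else False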

Then apply them to each group:

df2 = df.groupby(['class', 'item']).apply(
    lambda x: pd.Series({'regle1': regle1(x),
                         'regle2': regle2(x),
                         'regle3': regle3(x)}))

For this input:

df = pd.DataFrame({'class': 'a a aa aa b b c c c c'.split(),
                    'item': [5,5,7,7,7,6,9,9,9,9],
                   'last_PO_code': ['103','103','103','104','103','104','103','104','105','106'],
                   'qty': [5,4,7,6,7,6,1,1,2,2]
                   })

It seems to work fine:

                regle1  regle2  regle3
class   item            
a       5       True    False   False
aa      7       False   True    False
b       6       False   False   False
        7       True    False   False
c       9       False   False   True

EDIT: You can add the calculated columns back to the original rows, for example with merge:

df.merge(df2.reset_index(), on = ['class','item'])

#   class   item    last_PO_code    qty regle1  regle2  regle3
#0  a       5       103             5   True    False   False
#1  a       5       103             4   True    False   False
#2  aa      7       103             7   False   True    False
#3  aa      7       104             6   False   True    False
#4  b       7       103             7   True    False   False
#5  b       6       104             6   False   False   False
#6  c       9       103             1   False   False   True
#7  c       9       104             1   False   False   True
#8  c       9       105             2   False   False   True
#9  c       9       106             2   False   False   True
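Since the question asks for transform rather than apply, here is a minimal sketch of one way to get the same row-aligned columns without apply or merge. It is not part of the original answer. The idea is to build per-row boolean or numeric Series first and then broadcast group results with the built-in 'any', 'all' and 'sum' transforms. The helper names has and qty_sum are hypothetical and introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'class': 'a a aa aa b b c c c c'.split(),
    'item': [5, 5, 7, 7, 7, 6, 9, 9, 9, 9],
    'last_PO_code': ['103', '103', '103', '104', '103', '104',
                     '103', '104', '105', '106'],
    'qty': [5, 4, 7, 6, 7, 6, 1, 1, 2, 2],
})

# Group keys as Series, so that derived Series can be grouped the same way.
keys = [df['class'], df['item']]

def has(code):
    # Hypothetical helper: True on every row of a group that contains `code`.
    return df['last_PO_code'].eq(code).groupby(keys).transform('any')

def qty_sum(code):
    # Hypothetical helper: per-group sum of qty over rows with `code`,
    # broadcast back to every row of the group (0 if the code is absent).
    masked = df['qty'].where(df['last_PO_code'].eq(code), 0)
    return masked.groupby(keys).transform('sum')

s103, s104, s105, s106 = (qty_sum(c) for c in ('103', '104', '105', '106'))

# Rule 1: all codes in the group are '103'.
df['regle1'] = df['last_PO_code'].eq('103').groupby(keys).transform('all')
# Rule 2: '103' and '104' both present, and qty(103) > qty(104).
df['regle2'] = has('103') & has('104') & (s103 > s104)
# Rule 3: all four codes present, qty(103) == qty(104), qty(105) == qty(106).
df['regle3'] = (has('103') & has('104') & has('105') & has('106')
                & (s103 == s104) & (s105 == s106))
```

Because every transform returns a Series aligned with the original index, the three columns can be assigned directly and the frame keeps all of its rows.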

edited Jan 6, 2022 at 12:11

answered Jan 6, 2022 at 0:58

Daniel Wlazło


2 Comments

Lyès SARDI Over a year ago

Thanks for your answer but the resulting dataframe is not good because it does not contain all the resulting rows

Daniel Wlazło Over a year ago

OK, but once you have them, it is easy to add them to the original dataset (either via merge or map). I thought the biggest problem was the lambda functions.
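As a hedged illustration of attaching per-group results to every original row, the sketch below uses DataFrame.join with on=, which is a different call from the merge and map mentioned in the comment. Only rule 1 is computed, as a stand-in for the per-group table; the names per_group and out are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'class': 'a a aa aa b b'.split(),
    'item': [5, 5, 7, 7, 7, 6],
    'last_PO_code': ['103', '103', '103', '104', '103', '104'],
    'qty': [5, 4, 7, 6, 7, 6],
})

# One row per (class, item) group, indexed by a MultiIndex.
# This stands in for the per-group result table (rule 1 only).
per_group = (df['last_PO_code'].eq('103')
             .groupby([df['class'], df['item']])
             .all()
             .to_frame('regle1'))

# Left join on the two key columns against per_group's MultiIndex:
# every original row is kept, in its original order.
out = df.join(per_group, on=['class', 'item'])
```

The join defaults to how='left', so out has the same number of rows as df, with the group-level flag repeated on each row of the group.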
