I recently asked the following question:
Pandas: how can I pass a column name to a function that can then be used in 'apply'?
I received a great answer to it. However, there is an extension to that question that I overlooked and am also curious about.
I have a function:
def generate_confusion_matrix(row):
    val = 0
    if (row['biopsy_bin'] == 0) & (row['pioped_logit_category'] == 0):
        val = 0
    if (row['biopsy_bin'] == 1) & (row['pioped_logit_category'] == 1):
        val = 1
    if (row['biopsy_bin'] == 0) & (row['pioped_logit_category'] == 1):
        val = 2
    if (row['biopsy_bin'] == 1) & (row['pioped_logit_category'] == 0):
        val = 3
    if row['pioped_logit_category'] == 2:
        val = 4
    return val
I wish to make it generic like this:
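def general_confusion_matrix(biopsy, column_name):
    # biopsy and column_name are meant to be the per-row values of the two columns
    val = 0
    if biopsy == 0:
        if column_name == 0:
            val = 0
        elif column_name == 1:
            val = 2  # same mapping as generate_confusion_matrix
    elif biopsy == 1:
        if column_name == 1:
            val = 1  # same mapping as generate_confusion_matrix
        elif column_name == 0:
            val = 3
    # category 2 maps to 4 regardless of the biopsy value
    if column_name == 2:
        val = 4
    return val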
so that I can apply it inside a function, something like this (this does not work):
def create_logit_value(df, name_of_column):
    df[name_of_column + '_concordance'] = df.apply(lambda : general_confusion_matrix('biopsy', name_of_column + '_category'), axis=1)
The issue seems to be that when you pass the columns in as df['biopsy'], you are passing a whole Series to general_confusion_matrix rather than the value at each row, so the conditional statements throw the usual error:
('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')
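To make the failure concrete, here is a minimal reproduction. The data is made up, and `is_zero` is a hypothetical stand-in for the conditionals in general_confusion_matrix; only the column names follow my real dataframe.

```python
import pandas as pd

# Made-up data; only the column names follow the question.
df = pd.DataFrame({
    "biopsy": [0, 1, 0, 1],
    "pioped_logit_category": [0, 1, 1, 2],
})

def is_zero(value):
    # Fine for a scalar; an `if` on a whole Series raises ValueError.
    if value == 0:
        return True
    return False

# Passing a single cell works.
scalar_result = is_zero(df["biopsy"].iloc[0])

# Passing the whole column triggers the ambiguous truth value error.
try:
    is_zero(df["biopsy"])
    series_error = None
except ValueError as exc:
    series_error = str(exc)
```

The scalar call goes through, while the call with the full column ends up in the `except` branch with the same message as above.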
I have tried both map and apply, but I am not sure how to pass two arguments that refer to columns in my dataframe to the function in the lambda statement. I guess I could use map, but again, how do I pass the arguments through it? I apologise for writing two closely related questions, but they are different.
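For reference, this is a sketch of the kind of call I am after, which may or may not be the idiomatic way to do it. It assumes the lambda can take the row that apply passes with axis=1 and index it by column label, so that the function sees scalars. The data is made up, the `biopsy_column` parameter is a hypothetical addition of mine, and the function body is a compact restatement of the mapping in generate_confusion_matrix.

```python
import pandas as pd

def general_confusion_matrix(biopsy, category):
    # Same mapping as generate_confusion_matrix, but on plain values.
    if category == 2:
        return 4
    if biopsy == 0 and category == 0:
        return 0
    if biopsy == 1 and category == 1:
        return 1
    if biopsy == 0 and category == 1:
        return 2
    if biopsy == 1 and category == 0:
        return 3
    return 0

def create_logit_value(df, name_of_column, biopsy_column="biopsy"):
    # With axis=1 the lambda receives one row at a time, so indexing the
    # row by column label yields scalars rather than whole Series.
    df[name_of_column + "_concordance"] = df.apply(
        lambda row: general_confusion_matrix(
            row[biopsy_column], row[name_of_column + "_category"]
        ),
        axis=1,
    )
    return df

# Made-up data for illustration.
df = pd.DataFrame({
    "biopsy": [0, 1, 0, 1, 0],
    "pioped_logit_category": [0, 1, 1, 0, 2],
})
df = create_logit_value(df, "pioped_logit")
```

Here the column labels are passed as strings and only looked up inside the lambda, which is what I could not work out how to express with map.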