I just asked the following question:

Pandas: how can I pass a column name to a function that can then be used in 'apply'?

I received a great answer, but I overlooked an extension to that question that I am also curious about.

I have a function:

def generate_confusion_matrix(row):
    val = 0
    if (row['biopsy_bin'] == 0) & (row['pioped_logit_category'] == 0):
        val = 0
    if (row['biopsy_bin'] == 1) & (row['pioped_logit_category'] == 1):
        val = 1
    if (row['biopsy_bin'] == 0) & (row['pioped_logit_category'] == 1):
        val = 2
    if (row['biopsy_bin'] == 1) & (row['pioped_logit_category'] == 0):
        val = 3
    if row['pioped_logit_category'] == 2:
        val = 4
    return val

I wish to make it generic like this:

def general_confusion_matrix(biopsy, column_name):
    val = 0
    if biopsy == 0:
        if column_name == 0:
            val = 0
        elif column_name == 1:
            val = 1
    elif biopsy == 1:
        if column_name == 1:
            val = 2
        elif column_name == 0:
            val = 3
    elif column_name == 2:
        val = 4
    return val

so that I can apply it in a function like this (which does not work):

def create_logit_value(df, name_of_column):
   df[name_of_column + '_concordance'] = df.apply(lambda : general_confusion_matrix('biopsy', name_of_column + '_category'), axis=1)

The issue seems to be that when you pass the columns in as df['biopsy'], you are passing a whole Series to the general_confusion_matrix function rather than the value at each row, so the conditional statements throw the usual error:

   ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')
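
To make the cause of that error concrete, here is a minimal sketch, assuming a small made-up dataframe with the same column names. It shows that comparing a whole column produces a boolean Series that cannot be used in an `if`, whereas `apply` with `axis=1` hands over one row at a time, so each lookup is a scalar.

```python
import pandas as pd

# Illustrative data only; the values are not taken from the real dataset.
df = pd.DataFrame({'biopsy_bin': [0, 1, 0],
                   'pioped_logit_category': [0, 1, 1]})

# Comparing a whole column yields a boolean Series, not a single True/False.
mask = df['biopsy_bin'] == 0

error_message = None
try:
    if mask:  # Python asks the Series for one truth value here
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# With axis=1, each call receives one row, so row['biopsy_bin'] is a scalar
# and the comparison is an ordinary boolean.
flags = df.apply(lambda row: row['biopsy_bin'] == 0, axis=1)
print(flags.tolist())
```
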

I have tried both map and apply, but I am not sure how to pass two arguments that refer to columns in my dataframe to the function in the lambda statement. I guess I could use map, but again, how do I pass the arguments through it? I apologise for writing two closely related questions, but they are different.
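
On the map idea: `Series.map` only sees one value at a time, so one possible workaround is to zip the two columns and call the two-argument function on each pair. This is only a sketch under assumptions: the dataframe below is made up, and the function repeats the branches of general_confusion_matrix from above.

```python
import pandas as pd

def general_confusion_matrix(biopsy, category):
    # Same branches as the function in the question, applied to scalar inputs.
    val = 0
    if biopsy == 0:
        if category == 0:
            val = 0
        elif category == 1:
            val = 1
    elif biopsy == 1:
        if category == 1:
            val = 2
        elif category == 0:
            val = 3
    elif category == 2:
        val = 4
    return val

# Illustrative data only.
df = pd.DataFrame({'biopsy_bin': [0, 1, 0, 1],
                   'a_category': [0, 0, 1, 1]})

name_of_column = 'a'
# zip walks both columns in parallel, so b and c are scalars for each row.
df[name_of_column + '_concordance'] = [
    general_confusion_matrix(b, c)
    for b, c in zip(df['biopsy_bin'], df[name_of_column + '_category'])
]
print(df)
```
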

1 Answer

I think you are close:

import pandas as pd

df = pd.DataFrame({'biopsy_bin':[0,1,0,1,0,1],
                   'pioped_logit_category':[0,0,0,1,1,1],
                   'a_category':[0,0,0,1,1,1]})
print(df)


def create_logit_value(df, name_of_column):
    # For each row, pass the two scalar values to the two-argument function.
    df[name_of_column + '_concordance'] = df.apply(
        lambda x: general_confusion_matrix(x['biopsy_bin'], x[name_of_column + '_category']),
        axis=1)
    return df

create_logit_value(df, 'a')
create_logit_value(df, 'pioped_logit')

   a_category  biopsy_bin  pioped_logit_category  a_concordance  \
0           0           0                      0              0   
1           0           1                      0              3   
2           0           0                      0              0   
3           1           1                      1              2   
4           1           0                      1              1   
5           1           1                      1              2   

   pioped_logit_concordance  
0                         0  
1                         3  
2                         0  
3                         2  
4                         1  
5                         2  
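
As an alternative to the row-wise apply, the same codes could probably be computed column-wise with `numpy.select`, which avoids calling a Python function per row. This is a sketch, not part of the original answer: `add_concordance` and its `biopsy_column` parameter are hypothetical names, and the conditions mirror the branches of general_confusion_matrix as written in the question (including that code 4 is only reached when the biopsy value is neither 0 nor 1).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'biopsy_bin': [0, 1, 0, 1, 0, 1],
                   'a_category': [0, 0, 0, 1, 1, 1]})

def add_concordance(df, name_of_column, biopsy_column='biopsy_bin'):
    # Hypothetical helper: look the columns up by name, then compare whole Series.
    biopsy = df[biopsy_column]
    category = df[name_of_column + '_category']
    conditions = [
        (biopsy == 0) & (category == 1),
        (biopsy == 1) & (category == 1),
        (biopsy == 1) & (category == 0),
        ~biopsy.isin([0, 1]) & (category == 2),
    ]
    codes = [1, 2, 3, 4]
    # Rows matching none of the conditions get the default code 0.
    df[name_of_column + '_concordance'] = np.select(conditions, codes, default=0)
    return df

add_concordance(df, 'a')
print(df)
```

Element-wise `&` on boolean Series is what makes this work where a plain `if` on a Series fails.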
1 Comment

Glad I could help! Have a nice weekend!
