I am struggling to loop a lambda function across multiple columns.

samp = pd.DataFrame({'ID':['1','2','3'], 'A':['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C':['3X56', '2C73', '1X91']})

Essentially, I am trying to add three columns to this dataframe with a 1 if there is a 'C' in the string and a 0 if not (i.e. an 'X').

This function works fine when I apply it as a lambda function to each column individually, but I'm doing so for 40 different columns and the code is (I'm assuming) unnecessarily clunky:

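import re

def is_correct(value):
    # Count the occurrences of 'C' in the string
    return len(re.findall('C', value))

# Bracket assignment is needed to create a new column
samp['A_correct'] = samp['A'].apply(lambda x: is_correct(x))
samp['B_correct'] = samp['B'].apply(lambda x: is_correct(x))
samp['C_correct'] = samp['C'].apply(lambda x: is_correct(x))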

I'm confident there is a way to loop this, but I have been unsuccessful thus far.

  • Related, you may want to look into this: pandas.pydata.org/pandas-docs/stable/reference/api/…. It should be much faster than using lambdas. Commented Oct 5, 2020 at 14:52
  • Thank you for this. My initial aversion to this approach is that the next thing I'm trying to loop is extracting the number following the 'C' or 'X', which I'm doing similarly using a lambda function. My goal is a broader conceptual understanding of how to loop the same function across multiple columns. Commented Oct 5, 2020 at 16:31

2 Answers

You can iterate over the columns:

import pandas as pd
import re

df = pd.DataFrame({'ID':['1','2','3'], 'A':['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C':['3X56', '2C73', '1X91']})

def is_correct(value):
    # Count the occurrences of 'C' in the string
    return len(re.findall('C', value))

# Loop over the data columns only; 'ID' is skipped
for col in ['A', 'B', 'C']:
    df[col + '_correct'] = df[col].apply(is_correct)
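
The same loop generalizes to any per-value function, including the follow-up goal mentioned in the question comments (pulling out the number after the 'C' or 'X'). The following is only a sketch: the helper number_after_letter and the '_number' suffix are hypothetical names, and it assumes every column except 'ID' holds a code string.

```python
import re
import pandas as pd

samp = pd.DataFrame({'ID': ['1', '2', '3'], 'A': ['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C': ['3X56', '2C73', '1X91']})

def is_correct(value):
    # 1 if the code contains a 'C', otherwise 0
    return int('C' in value)

def number_after_letter(value):
    # Hypothetical helper: the digits following 'C' or 'X', as an int
    match = re.search(r'[CX](\d+)', value)
    return int(match.group(1)) if match else None

# Assumption: every column except 'ID' holds a code string.
# The list is built once, so adding columns inside the loop is safe.
code_cols = [col for col in samp.columns if col != 'ID']

for col in code_cols:
    samp[col + '_correct'] = samp[col].apply(is_correct)
    samp[col + '_number'] = samp[col].apply(number_after_letter)
```

With 40 columns, only the construction of code_cols needs to change; the loop body stays the same.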

2 Comments

This produces a TypeError: 'Index' object is not callable.
I have added the answer above

Let's try apply and join:

samp.join(samp[['A','B','C']].add_suffix('_correct')
                .apply(lambda x: x.str.contains('C'))
                .astype(int)
        ) 

Output:

  ID     A     B     C  A_correct  B_correct  C_correct
0  1  1C22  1C35  3X56          1          1          0
1  2  3X35  2C88  2C73          0          1          1
2  3  2C77  3X99  1X91          1          0          0
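
The same apply-and-join pattern may also cover the follow-up task from the question comments, extracting the number after the 'C' or 'X'. This is a sketch rather than part of the original answer: the '_number' suffix and the regular expression are assumptions, and it relies on every value containing a 'C' or 'X' followed by digits.

```python
import pandas as pd

samp = pd.DataFrame({'ID': ['1', '2', '3'], 'A': ['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C': ['3X56', '2C73', '1X91']})

cols = ['A', 'B', 'C']

# With a single capture group and expand=False, str.extract returns
# one Series per column, so apply() reassembles them into a DataFrame.
numbers = (samp[cols]
           .apply(lambda s: s.str.extract(r'[CX](\d+)', expand=False))
           .astype(int)               # assumes every value matched the pattern
           .add_suffix('_number'))    # hypothetical suffix for the new columns

result = samp.join(numbers)
```

If some values might not match the pattern, the astype(int) step would fail on the resulting missing values, so it would need to be dropped or replaced with a nullable conversion.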
