1

I'd like to update values in a column ['D'] of my data frame based on the substrings contained in column ['A']

I have dataframe df, where column D is an exact copy of column A. Column A has list of different cereal names. But I want to check if A contains a certain word and if so, then column D updates to show that word.

For example, If A contains the word Bran then D should update to show Bran. And if column A contains none of the criteria we are looking for then it should return Other.

name Type of Cereal
Bran Cereal Bran
Fiber Cereal Fiber
Nut Cereal Nut

So far I have tried this, but I am not able to change the rest of the columns without any criteria words to 'Other'

df.loc[df['Type of Cereal'].str.contains('Fiber'), 'Type of Cereal'] = 'Fiber'

df.loc[df['Type of Cereal'].str.contains('Bran'), 'Type of Cereal'] = 'Bran'

df.loc[df['Type of Cereal'].str.contains('Nut'), 'Type of Cereal'] = 'Nut'

I have also tried creating a function, but I'm not sure how to loop through each value in column A

def conditions1(s):

    CerealTypeFiber = df['name'].str.contains('Fiber').sum()

    CerealTypeBran = df['name'].str.contains('Bran').sum()
    
    if CerealTypeFiber != None:
        return 'Fiber'
    
    elif CerealTypeBran != None:
        return 'Bran'
    
    else:
        return 'Other'
    
df['Cereal Category'] = df.apply(conditions1, axis=1) 

This just updates the whole column to Fiber.

Any thoughts/suggestions?

2 Answers 2

2

Other way using pandas.Series.str.extract:

df["Type of Cereal"] = df["name"].str.extract("(Bran|Fiber|Nut)").fillna("Other")
print(df)

Output (Note 4th row in case of no match):

           name Type of Cereal
0   Bran Cereal           Bran
1  Fiber Cereal          Fiber
2    Nut Cereal            Nut
3     Something          Other
Sign up to request clarification or add additional context in comments.

Comments

0

You might be able to get away with just using np.where with str.extract here:

df["Type of Cereal"] = np.where(df["name"].str.contains('cereal'),
                                df["name"].str.extract(r'^(.*) Cereal$'),
                                "Other")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.