I'm trying to use .loc with an AND condition. It works fine with OR conditions, but I can't get it to work with AND when more than one row in the column contains the same value.

def locreplace(df,col,needle,replace,needle2=''):
   if (needle2==''):
      df.loc[df[col].str.contains(needle, case=False)==True,col] = replace
   else:
     df.loc[[df[col].str.contains(needle, case=False)==True] and df[col].str.contains(needle2, case=False)==True,col] = replace

This table with no duplicates works as expected:

#Create a data frame
data = [['granny apple', 'juicy'], ['blood orange', 'refreshing'], ['spanish lemon', 'tangy']]
fruitdf = pd.DataFrame(data, columns = ['fruit', 'taste'])

#Single replace - works
#locreplace(fruitdf,'fruit','apple','big red nice apple')

#Will fail - works
#locreplace(fruitdf,'fruit','apple','big red apple','uncle')

#Double replace - works
locreplace(fruitdf,'fruit','apple','big huge red apple','granny')

But when the data frame has two "granny" entries, the double replace with the AND condition replaces both rows containing "granny", even though only one of them also contains "apple".

data = [['granny apple', 'juicy'], ['granny blood orange', 'refreshing'], ['spanish lemon', 'tangy']]
fruitdf = pd.DataFrame(data, columns = ['fruit', 'taste'])

#Single replace - works
#locreplace(fruitdf,'fruit','apple','big red nice apple')

#Will fail - works
#locreplace(fruitdf,'fruit','apple','big red apple','uncle')


#Double replace - fails
locreplace(fruitdf,'fruit','apple','big huge red apple','granny')


No doubt my fault, and a misplacing of brackets (or misunderstanding of code), but what is the correct way to achieve an AND condition replace with loc (or other easier method)?

Current output:

                fruit       taste
0  big huge red apple       juicy
1  big huge red apple  refreshing
2       spanish lemon       tangy

Desired output:

                 fruit       taste
0   big huge red apple       juicy
1  granny blood orange  refreshing
2        spanish lemon       tangy

1 Answer

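The issue is in the else block of locreplace, in this part of the condition:

[df[col].str.contains(needle, case=False)==True]

The brackets make this a list containing a Series rather than a Series. A non-empty list is always truthy, so Python's and simply returns its second operand, and the mask ends up being the needle2 condition alone. That is why every row containing "granny" was replaced. Remove the brackets and replace and with &, which combines the two boolean Series element by element (the ==True comparisons are redundant when the column has no missing values):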

df.loc[df[col].str.contains(needle, case=False) & df[col].str.contains(needle2, case=False), col] = replace

Output

                 fruit       taste
0   big huge red apple       juicy
1  granny blood orange  refreshing
2        spanish lemon       tangy
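
To make the difference concrete, here is a minimal sketch that builds both masks on the question's second data frame and then applies a restructured locreplace. The names has_apple, has_granny, buggy_mask and fixed_mask are illustrative, and na=False is an added assumption (it treats missing values as non-matches); neither is part of the original code.

```python
import pandas as pd

data = [['granny apple', 'juicy'],
        ['granny blood orange', 'refreshing'],
        ['spanish lemon', 'tangy']]
fruitdf = pd.DataFrame(data, columns=['fruit', 'taste'])

has_apple = fruitdf['fruit'].str.contains('apple', case=False)
has_granny = fruitdf['fruit'].str.contains('granny', case=False)

# Original expression: a non-empty list is truthy, so `and` returns its
# right-hand operand unchanged, i.e. the "granny" mask on its own.
buggy_mask = [has_apple] and has_granny
print(buggy_mask.tolist())   # [True, True, False]

# Element-wise AND: a row matches only if both conditions hold.
fixed_mask = has_apple & has_granny
print(fixed_mask.tolist())   # [True, False, False]


def locreplace(df, col, needle, replace, needle2=''):
    # Rows containing the first needle (case-insensitive).
    # na=False is an assumption: missing values count as "no match".
    mask = df[col].str.contains(needle, case=False, na=False)
    if needle2 != '':
        # Narrow the mask to rows that also contain the second needle.
        mask = mask & df[col].str.contains(needle2, case=False, na=False)
    df.loc[mask, col] = replace


locreplace(fruitdf, 'fruit', 'apple', 'big huge red apple', 'granny')
print(fruitdf['fruit'].tolist())
```

Building the mask first and narrowing it only when needle2 is given keeps a single .loc assignment for both cases. Note that wrapping each condition in parentheses matters as soon as a comparison such as ==True is kept, because & binds more tightly than ==.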
1 Comment

thanks Guy - it's very hot here I'm having a slow day, appreciate the quick response, worked first time.
