Is there a more efficient way to do the following? Ideally, by using only one if statement?

Suppose there is a dataframe with an "author" series, a "comedy" series (default = True), and a "horror" series (default = False). I want to search the author series for "stephen king" and "lovecraft" and in those cases change the value of "comedy" from True to False and change the value of "horror" from False to True.

for count,text in enumerate(df.loc[0:, "author"]):
    if "stephen king" in str(text):
        df.loc[count, "comedy"] = False
        df.loc[count, "horror"] = True
        continue
    elif "lovecraft" in str(text):
        df.loc[count, "comedy"] = False
        df.loc[count, "horror"] = True
        continue

When I try using str.contains(), I get the error "'str' object has no attribute 'str'".

3 Answers

Don't enumerate a data frame; index and slice it.

ix = df.author.str.contains('stephen king')
df.loc[ix, 'comedy'] = False
df.loc[ix, 'horror'] = True

ix = df.author.str.contains('lovecraft')
df.loc[ix, 'comedy'] = False
df.loc[ix, 'horror'] = True
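
The two blocks above can be merged, since each str.contains call returns a boolean Series and two such Series can be combined with the | operator. The sketch below assumes a small made-up data frame (the rows are hypothetical, not from the question) and uses regex=False because both search terms are plain substrings. It also hints at the likely cause of the error in the question: .str is an accessor on a Series, while text inside the loop is an ordinary Python string with no such attribute.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({
    "author": ["stephen king", "h. p. lovecraft", "oprah"],
    "comedy": [True, True, True],
    "horror": [False, False, False],
})

# Each call returns a boolean Series aligned with df's index
is_king = df["author"].str.contains("stephen king", regex=False)
is_lovecraft = df["author"].str.contains("lovecraft", regex=False)

# Element-wise OR of the two masks, so each column is assigned only once
ix = is_king | is_lovecraft
df.loc[ix, "comedy"] = False
df.loc[ix, "horror"] = True

print(df)
```

Rows where ix is False keep their original values, so no else branch is needed.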

2 Comments

In the first example, is ix the index of those that do contain stephen king?
Yes, in effect: ix is a boolean Series that is True for the rows containing "stephen king" in the first block and "lovecraft" in the second.

You can assign these values with df.loc. If the author string contains 'stephen king' or 'lovecraft', put False in column 'comedy' and True in column 'horror':

df.loc[df['author'].str.contains('stephen king|lovecraft'), 
       ['comedy', 'horror']] = [False, True]
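
One thing to watch with this one-liner is missing author values. A minimal sketch, assuming a made-up frame with one missing author (the data is hypothetical): passing na=False to str.contains treats missing entries as non-matches, so the mask stays purely boolean and can be used safely with df.loc.

```python
import pandas as pd

# Hypothetical sample data with one missing author
df = pd.DataFrame({
    "author": ["stephen king", None, "h. p. lovecraft", "oprah"],
    "comedy": [True, True, True, True],
    "horror": [False, False, False, False],
})

# 'stephen king|lovecraft' is a regex alternation: match either substring.
# na=False makes the missing author count as "no match".
mask = df["author"].str.contains("stephen king|lovecraft", na=False)

# Assign both columns at once; each matching row gets [False, True]
df.loc[mask, ["comedy", "horror"]] = [False, True]

print(df)
```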

You can check whether the column contains any of several values by using

df['Author'].str.contains('|'.join(list_of_authors))

then assign the values using loc.

Example:

>>> df
          Author  Comedy  Horror
0   stephen king    True   False
1      lovecraft    True   False
2  jonathan ames    True    True
3   stephen king   False    True
4          oprah    True   False
>>> df.loc[df['Author'].str.contains('|'.join(['stephen king','lovecraft']),case=False,na=False),('Comedy','Horror')]=False,True
>>> df
          Author  Comedy  Horror
0   stephen king   False    True
1      lovecraft   False    True
2  jonathan ames    True    True
3   stephen king   False    True
4          oprah    True   False
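
Because str.contains treats the joined string as a regular expression by default, names containing characters such as "." can match more than intended. A hedged sketch, assuming a hypothetical list_of_authors and made-up rows: escaping each name with re.escape before joining keeps the match literal.

```python
import re
import pandas as pd

# Hypothetical author list; "." is a regex metacharacter
list_of_authors = ["stephen king", "h. p. lovecraft"]

# Escape each name so it is matched literally, then join with | (regex OR)
pattern = "|".join(re.escape(name) for name in list_of_authors)

# Hypothetical sample data; the third row would match an unescaped pattern
df = pd.DataFrame({
    "Author": ["Stephen King", "h. p. lovecraft", "hx px lovecraft", "oprah"],
    "Comedy": [True, True, True, True],
    "Horror": [False, False, False, False],
})

mask = df["Author"].str.contains(pattern, case=False, na=False)
df.loc[mask, ["Comedy", "Horror"]] = [False, True]

print(df)
```

Without re.escape, the unescaped "h. p. lovecraft" would also match "hx px lovecraft", since "." stands for any character.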
