I have a pandas DataFrame that looks like this:

>>> df
      product   desc
0        ABCD  desc1
1   ABCD1,XYZ  desc2
2      ABCD1H  desc3
3       ABCD1  desc4
4  ABCD1H,LMN  desc5

I want to keep only the rows whose product is ABCD1, or ABCD1 followed by any other product ID, but not ABCD1H. How can I select such rows? In the above example, I want the output to be:

>>> df
     product   desc
1  ABCD1,XYZ  desc2
3      ABCD1  desc4

This is what I have tried so far, but it does not work:

df2 = df.loc[df['product'].str.contains('ABCD1')]

It also includes ABCD1H in its results, which I don't want.

1 Answer

Use the regex anchor "\b", which matches a word boundary, so ABCD1 followed by another letter such as H no longer matches:

df[df['product'].str.contains(r'ABCD1\b')]

Output:

     product   desc
1  ABCD1,XYZ  desc2
3      ABCD1  desc4
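
For reference, here is a minimal self-contained sketch that rebuilds the example frame and applies the word-boundary filter. Two things in it go beyond the answer above and are assumptions: a leading "\b" is added so that an ID such as XABCD1 would not match either, and a second, regex-free mask is shown that splits on commas and tests for an exact ID. The names regex_mask, token_mask and result are illustrative only.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "product": ["ABCD", "ABCD1,XYZ", "ABCD1H", "ABCD1", "ABCD1H,LMN"],
    "desc": ["desc1", "desc2", "desc3", "desc4", "desc5"],
})

# Regex approach: \b on both sides means ABCD1 must not be directly
# preceded or followed by a letter, digit, or underscore.
regex_mask = df["product"].str.contains(r"\bABCD1\b", regex=True)

# Alternative (assumption: IDs are always comma-separated):
# split each cell into IDs and test for an exact match.
token_mask = df["product"].str.split(",").apply(
    lambda ids: "ABCD1" in [p.strip() for p in ids]
)

result = df[regex_mask]
print(result)
```

The split-based mask may be the safer choice if product IDs can contain characters such as "-" or ".", since "\b" would treat those as boundaries and match ABCD1-X as well.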