
I am trying to remove a substring from the strings in a pandas DataFrame column:

import pandas as pd

x = pd.DataFrame()
x['y'] = ["Hernia|Infiltration","A|Hernia|Infiltration","Infiltration|Hernia"]
x

I am running the code below:

x['y'] = x['y'].replace({'|Hernia': ''},regex=True)
x['y'] = x['y'].str.replace('Hernia|', '',regex=True)
x

But the output is wrong:

     y
0   |Infiltration
1   A||Infiltration
2   Infiltration|

Expected output:

     y
0   Infiltration
1   A|Infiltration
2   Infiltration

Any string can appear in place of A and Infiltration, but the pattern stays the same.

  • Is there a reason you're using regex=True when you're trying to replace a literal string rather than a regular expression? Commented Jul 5, 2019 at 15:08
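
To illustrate the point raised in this comment, here is a minimal sketch of the literal (non-regex) route. It assumes the sample data from the question and uses Series.str.replace with regex=False, so | is treated as an ordinary character; the variable name cleaned is only illustrative.

```python
import pandas as pd

x = pd.DataFrame({'y': ["Hernia|Infiltration",
                        "A|Hernia|Infiltration",
                        "Infiltration|Hernia"]})

# With regex=False the pattern is matched literally, so '|' needs no escaping.
cleaned = (
    x['y']
    .str.replace('|Hernia', '', regex=False)   # drops a trailing/middle "|Hernia"
    .str.replace('Hernia|', '', regex=False)   # drops a leading "Hernia|"
)

print(cleaned.tolist())
# ['Infiltration', 'A|Infiltration', 'Infiltration']
```

Note that a literal match is still a substring match, so a value such as A|Herniated disk would also be altered.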

2 Answers


This can probably be handled more elegantly with split/join:

x['y'].apply(lambda row: '|'.join(x for x in row.split('|') if 'Hernia' != x))

Output:

0      Infiltration
1    A|Infiltration
2      Infiltration
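
The same split/join idea can be wrapped in a small helper. This is only a sketch: the function name drop_token and its parameters are hypothetical, and the third sample value is added here to show that only exact matches are removed.

```python
import pandas as pd

def drop_token(cell, token='Hernia', sep='|'):
    """Remove every part of `cell` that equals `token` exactly."""
    return sep.join(part for part in cell.split(sep) if part != token)

s = pd.Series(["Hernia|Infiltration",
               "A|Hernia|Infiltration",
               "Herniated disk|Hernia"])

# Series.apply calls drop_token once per string in the column.
result = s.apply(drop_token)

print(result.tolist())
# ['Infiltration', 'A|Infiltration', 'Herniated disk']
```

Because each part is compared with !=, a longer value that merely starts with the token is left untouched.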

2 Comments

It should be if 'Hernia' != x. Otherwise, you'll remove 'Herniated disk' as well.
You can use a comprehension instead of apply... ['|'.join(filter('Hernia'.__ne__, s.split('|'))) for s in x.y]

You need to escape | in the pattern, because with regex=True an unescaped | is the alternation ("or") operator:

x['y'] = x['y'].replace({r'\|Hernia': ''}, regex=True)
x['y'] = x['y'].replace({r'Hernia\|': ''}, regex=True)

Following the comments from @user3483203 and @piRSquared, you can combine the two patterns into one, with the unescaped | acting as an "or":

x['y'] = x['y'].replace({r'\|Hernia|Hernia\|': ''}, regex=True)
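
The combined pattern above is a substring match, so it would also cut into a longer value such as Herniated disk. One possible stricter variant, sketched here with the standard re module, only removes Hernia when it is a whole |-separated item. The pattern and the names whole_token and samples are assumptions for illustration, not part of the original answer.

```python
import re

# Alternative 1: "|Hernia" followed by another "|" or the end of the string.
# Alternative 2: "Hernia" at the start, followed by "|" or the end of the string.
whole_token = re.compile(r'\|Hernia(?=\||$)|^Hernia(?:\||$)')

samples = ["Hernia|Infiltration",
           "A|Hernia|Infiltration",
           "Infiltration|Hernia",
           "Herniated disk|Infiltration"]

cleaned = [whole_token.sub('', s) for s in samples]

print(cleaned)
# ['Infiltration', 'A|Infiltration', 'Infiltration', 'Herniated disk|Infiltration']
```

This sketch does not try to handle a value in which Hernia appears twice in a row; for that case the split/join approach is simpler.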

2 Comments

Since .replace() takes a dictionary, you can put both keys in the same dict.
But you can separate them with a '|'... x['y'].replace({r'\|Hernia|Hernia\|': ''}, regex=True)
