3

I have a string

"abc INC\","None", "0", "test"

In this string I want to replace every backslash that appears directly before a " with a pipe |. I wrote the following code, but it replaces the " instead and leaves the \ behind.

import re
str = "\"abc INC\\\",\"None\", \"0\", \"test\""
str = re.sub("(\\\")", "|", str)
print(str)

Output: |abc INC\|,|None|, |0|, |test|
Desired Output: "abc INC|","None", "0", "test"

Can someone point out what I am doing wrong?

6
  • Don't know python, but you can use this regex \\(?=") Commented Jul 29, 2016 at 14:51
  • Please use ' to delimit your Python strings when they contain "; that makes it clearer what the strings actually are. Commented Jul 29, 2016 at 14:51
  • @MosesKoledoye it is a complete string. Read the code which has escape sequence. Commented Jul 29, 2016 at 14:53
  • @jotasi here is the output |"abc INC\|",|"None|", |"0|", |"test|" Commented Jul 29, 2016 at 14:56
  • @Jacquot str = '\"abc INC\\\",\"None\", \"0\", \"test\"' Commented Jul 29, 2016 at 14:58

4 Answers

3

See Jamie Zawinski's famous quote about regular expressions. Try to only resort to the use of re's when absolutely necessary. In this case, it isn't.

The actual content of string str (bad name for a variable, by the way, since there's a built-in type of that name) is

"abc INC\","None", "0", "test"

Why not just

str.replace('\\"', '|"')

which will do exactly what you want.
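As a quick check of the suggestion above, here is a minimal sketch. The variable names `text` and `fixed` are illustrative and not from the question. Since `str.replace` returns a new string rather than modifying the original, the result has to be assigned to something.

```python
# Same content as the string in the question: "abc INC\","None", "0", "test"
# (single-quoted literal, so only the backslash itself needs escaping)
text = '"abc INC\\","None", "0", "test"'

# Replace the two-character sequence \" with |" ; no regex involved.
fixed = text.replace('\\"', '|"')

print(fixed)  # "abc INC|","None", "0", "test"
```

The original `text` is left unchanged, because Python strings are immutable.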


1 Comment

Thank you. I didn't think about it. And I will keep in mind "Try to only resort to the use of re's when absolutely necessary."
0

You can use the following positive lookahead assertion, r'\\(?=")':

import re

my_str = "\"abc INC\\\",\"None\", \"0\", \"test\""
p = re.sub(r'\\(?=")', '|', my_str)
print(p)
# "abc INC|","None", "0", "test"

Try not to use builtin names as names for variables, viz. str, to avoid shadowing the builtin.

2 Comments

That did the job. Any good tutorial where I can learn this bad boy?
@r0xette You can start with the re docs. It has a lot of useful details with a few examples :))
0

This should solve your problem for the string in the question:

import re
s = "\"abc INC\\\",\"None\", \"0\", \"test\""
s = re.sub(r"\\", "|", s)

Also, don't use str as a variable name: it is not a reserved keyword, but it is the name of a built-in type, and reusing it shadows the built-in.
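One caveat worth seeing in action: the pattern r"\\" matches every backslash, not only those directly before a double quote. The sketch below uses a made-up string (`sample` is an assumption, not data from the question) that also contains backslashes elsewhere, and compares it with a lookahead pattern.

```python
import re

# Hypothetical sample with backslashes both inside the text and before a quote.
# Actual content: "C:\temp\","x"
sample = '"C:\\temp\\","x"'

# Matches every backslash in the string.
all_replaced = re.sub(r"\\", "|", sample)

# Matches only a backslash that is immediately followed by a double quote.
only_before_quote = re.sub(r'\\(?=")', "|", sample)

print(all_replaced)       # "C:|temp|","x"
print(only_before_quote)  # "C:\temp|","x"
```

For the string in the question the two patterns give the same result, because its only backslash happens to sit before a quote.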

0

For literal backslashes in Python regexes you need to escape twice, giving you the pattern '\\\\"' or "\\\\\"". The first level of escaping is needed for Python to actually put a backslash into the string. But regex patterns themselves also use the backslash as a special character (for things like \w for word characters, etc.). The documentation states:

The special sequences consist of '\' and a character from the list below. If the ordinary character is not on the list, then the resulting RE will match the second character.

So the pattern \" will match a single " because " is not a character with a special meaning there.

You can use the raw notation to only escape once: r'\\"'.
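To make the two levels of escaping concrete, here is a small sketch. The names `plain`, `raw`, `text`, `result`, and `quote_only` are illustrative. It checks that both spellings produce the same three-character pattern, applies it, and then reproduces the behaviour from the question with the pattern \".

```python
import re

plain = '\\\\"'  # Python turns this literal into the 3 characters: \ \ "
raw = r'\\"'     # raw literal: the same 3 characters, escaped only once
print(plain == raw, len(raw))  # True 3

# Same content as the string in the question.
text = '"abc INC\\","None", "0", "test"'

# The regex \\" matches one literal backslash followed by a double quote.
result = re.sub(raw, '|"', text)
print(result)  # "abc INC|","None", "0", "test"

# The regex \" (just 2 characters) matches only the quote itself.
quote_only = re.sub('\\"', '|', text)
print(quote_only)  # |abc INC\|,|None|, |0|, |test|
```

The last line is the same output the question reports, which is consistent with the documentation quote above.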
