I'm having problems with regex conversion from Perl to Python. I have this code:

my $str_old = my $str = '$x start $x bar$xy $$x end $x';

$str =~ s/(\s|^)(\$x)(\s|$)/$1\$Q$3/g;

print $str_old."\n";
print $str."\n";

and it should output this

$x start $x bar$xy $$x end $x
$Q start $Q bar$xy $$x end $Q

But I just can't get it working in Python.

  • What line of code have you been using in Python? Commented Mar 29, 2014 at 11:35
  • As @Jerry says, please show your Python code. Commented Mar 29, 2014 at 11:36
  • I got this far, but it removes the whitespace:
        import re
        str_old = str = '$x start $x bar$xy $$x end $x';
        str = re.sub( '(\s|^)(\$x)(\s|$)' , '$Q', str)
        print (str)
        print (str_old)
    Commented Mar 29, 2014 at 11:41

2 Answers

Try this:

import re
str_old = str = '$x start $x bar$xy $$x end $x'  # The trailing semicolon from your version is not needed in Python
 # str = re.sub( '(\s|^)(\$x)(\s|$)' , '$Q', str)
 #                                      ^^ You're not placing the spaces back.
str = re.sub( '(\s|^)\$x(\s|$)' , '\\1$Q\\2', str)

print(str)
print(str_old)

That said, you should use raw strings for both the regex and the replacement:

str = re.sub(r'(\s|^)\$x(\s|$)' , r'\1$Q\2', str)
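
To illustrate why the raw prefix matters, here is a minimal sketch (the variable names are only illustrative). In a normal string literal, '\1' is the control character chr(1) rather than a backreference, which is why the non-raw version above has to double its backslashes.

```python
import re

text = '$x start $x bar$xy $$x end $x'
pattern = r'(\s|^)\$x(\s|$)'

# A raw replacement passes the two characters "\" and "1" through to re.sub,
# which then treats them as a reference to group 1.
with_raw = re.sub(pattern, r'\1$Q\2', text)

# Without the r prefix, '\1' and '\2' are octal escapes (chr(1) and chr(2)),
# so control characters are inserted where the captured whitespace should go.
without_raw = re.sub(pattern, '\1$Q\2', text)

print(with_raw)
print(repr(without_raw))
```

Printing the second result with repr() makes the stray control characters visible.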

Lastly, avoid using str as a variable name in Python, because it shadows the built-in str type:

import re
str_old = s = '$x start $x bar$xy $$x end $x'

s = re.sub(r'(\s|^)\$x(\s|$)' , r'\1$Q\2', s)

print(s)
print(str_old)

If you don't want to use backreferences, you can use lookarounds:

import re
str_old = s = '$x start $x bar$xy $$x end $x'

s = re.sub(r'(?:(?<=\s)|(?<=^))\$x(?=\s|$)' , '$Q', s)
# With no backreferences, the replacement string no longer needs to be raw.

print(s)
print(str_old)

(?:(?<=\s)|(?<=^)) makes sure that the $x is preceded by a space or the beginning of the string;

(?=\s|$) makes sure that the $x is followed by a space or the end of the string.
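
The lookaround form matters for input such as "$x $x", which comes up in the comments. The sketch below compares the two approaches on a made-up sample string. The last pattern, using (?<!\S) and (?!\S), is not from this thread; it is a commonly used shorthand that I would expect to behave the same way here.

```python
import re

sample = '$x $x $x'

# Capturing version: the space after the first $x is consumed by the first
# match, so it cannot serve as the leading \s of the second $x.
consuming = re.sub(r'(\s|^)\$x(\s|$)', r'\1$Q\2', sample)

# Lookaround version: neighbouring characters are inspected but not consumed,
# so adjacent occurrences can all match.
lookaround = re.sub(r'(?:(?<=\s)|(?<=^))\$x(?=\s|$)', '$Q', sample)

# Shorthand (an assumption, not from the thread): "not preceded by a
# non-space character" and "not followed by a non-space character".
shorthand = re.sub(r'(?<!\S)\$x(?!\S)', '$Q', sample)

print(consuming)   # $Q $x $Q
print(lookaround)  # $Q $Q $Q
print(shorthand)   # $Q $Q $Q
```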

3 Comments

@xfrog Since you want it to work for $x $x, use the last code block here.
thank you, it works. I just had to change it a little to: re.sub(r'(?:(?<=\s)|(?<=^))(\$x)(?=\s|$)' , '$Q', s) because of "look-behind requires fixed-width pattern" error, now it works correctly. Thank you very much! :)
@xfrog Oh, you're right! Forgot about that with python.

>>> import re
>>> re.sub(r'(\s|^)(\$x)(\s|$)', r'\1$Q\3', '$x start $x bar$xy $$x end $x')
'$Q start $Q bar$xy $$x end $Q'
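
As a rough guide to how the Perl substitution maps onto this call: Perl's $1 and $3 in the replacement become \1 and \3 (or the unambiguous \g<1> and \g<3>), and re.sub replaces every match by default, much like Perl's /g flag. A small sketch, with illustrative variable names:

```python
import re

text = '$x start $x bar$xy $$x end $x'
pattern = r'(\s|^)(\$x)(\s|$)'

# \g<1> is the explicit spelling of \1; it avoids ambiguity when a digit
# would directly follow the group reference.
all_matches = re.sub(pattern, r'\g<1>$Q\g<3>', text)

# count=1 limits the substitution to the first match, similar to dropping
# the /g flag in Perl.
first_only = re.sub(pattern, r'\g<1>$Q\g<3>', text, count=1)

print(all_matches)  # $Q start $Q bar$xy $$x end $Q
print(first_only)   # $Q start $x bar$xy $$x end $x
```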

2 Comments

Thank you, this is exactly what I was looking for, but I just realised that the original regex won't work with a string like "$x $x" (only the first $x is substituted). Is there a nice way to fix that?
@xfrog: describe in plain English what you would like your regex to do, provide example input/output, your current attempt (a complete minimal example): what happens? what would you like to happen instead? And ask it as a new question
