I'm trying to escape several special characters in a given string using Perl regular expressions. It works for every character except the dollar sign. I tried the following:

my %special_characters;
$special_characters{"_"} = "\\_";
$special_characters{"$"} = "\\$";
$special_characters{"{"} = "\\{";
$special_characters{"}"} = "\\}";
$special_characters{"#"} = "\\#";
$special_characters{"%"} = "\\%";
$special_characters{"&"} = "\\&";

my $string = '$foobar';
foreach my $char (keys %special_characters) {
  $string =~ s/$char/$special_characters{$char}/g;
}
print $string;

3 Answers

Try this:

my %special_characters;
$special_characters{"_"} = "\\_";
$special_characters{"\\\$"} = "\\\$";
$special_characters{"{"} = "\\{";
$special_characters{"}"} = "\\}";
$special_characters{"#"} = "\\#";
$special_characters{"%"} = "\\%";
$special_characters{"&"} = "\\&";

Looks weird, right? Your regex needs to look as follows:

s/\$/\$/g

In the pattern (the first part of the substitution), "$" needs to be escaped, because it is a regex metacharacter that anchors the match to the end of the string.

The replacement (the second part) is treated as an ordinary string when it comes from an interpolated variable. The variable's contents are not parsed again, so "$" has no special meaning there. The backslash in the replacement is therefore a real backslash, whereas in the pattern it escapes the dollar sign.

Furthermore, in the variable definition you need to escape both the backslash and the dollar sign, because both have a special meaning in double-quoted strings. That is why the key is written as "\\\$".
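To see the difference concretely, here is a minimal sketch that runs the question's loop once with a bare "$" key and once with the escaped key. The helper name apply_map is made up for this illustration and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: applies every key of %map as a regex pattern,
# the same way the loop in the question does.
sub apply_map {
    my ($string, %map) = @_;
    for my $pattern (keys %map) {
        $string =~ s/$pattern/$map{$pattern}/g;
    }
    return $string;
}

# A bare "$" used as the pattern is the end-of-string anchor, so the
# replacement is added at the end and the real dollar sign is untouched.
my $unescaped = apply_map('$foobar', '$' => '\\$');

# Backslash-dollar used as the pattern matches a literal dollar sign.
my $escaped = apply_map('$foobar', '\\$' => '\\$');

print "$unescaped\n";
print "$escaped\n";
```

With these inputs, the first call should leave the leading dollar sign alone and append the replacement at the end of the string, while the second should put the backslash in front of the dollar sign.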

1 Comment

Better approach: use quotemeta() or s/\Q$char\E/... Remember to do this for every $variable you put into a pattern, since regexes interpolate variables.
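A sketch of the question's loop with this suggestion applied. The hash keys stay as plain characters and \Q...\E quotes them when the pattern is built. The sub name escape_specials and the use of map to fill the hash are assumptions made for this example.

```perl
use strict;
use warnings;

# Map each special character to itself preceded by a backslash.
# Single-quoted list items, so '$' is not interpolated.
my %special_characters =
    map { $_ => "\\$_" } ('_', '$', '{', '}', '#', '%', '&');

# Hypothetical wrapper around the loop from the question.
sub escape_specials {
    my ($string) = @_;
    for my $char (keys %special_characters) {
        # \Q...\E disables regex metacharacters inside $char
        $string =~ s/\Q$char\E/$special_characters{$char}/g;
    }
    return $string;
}

print escape_specials('$foobar'), "\n";
```

The order in which the keys are visited does not matter here, because the backslash that each replacement adds is not itself one of the keys.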
3

You don't need a hash if you're replacing each character with itself preceded by a backslash. Just match what you need and put a backslash in front of it:

s/($re)/"\\$1"/eg;

To build up the regular expression for all of the characters, Regexp::Assemble is really nice.

use v5.10.1;
use Regexp::Assemble;

my $ra = Regexp::Assemble->new;

my @specials = qw(_ $ { } # % & );

foreach my $char ( @specials ) {
    $ra->add( "\\Q$char\\E" );
    }

my $re = $ra->re;
say "Regex is $re"; 

while( <DATA> ) {
    s/($re)/"\\$1"/eg;
    print;
    }

__DATA__
There are $100 dollars
Part #1234
Outside { inside } Outside

Notice how, in the first line of output, Regexp::Assemble has rearranged my pattern. It's not just the parts I added glued together:

Regex is (?^:(?:[#$%&_]|\{|\}))
There are \$100 dollars
Part \#1234
Outside \{ inside \} Outside

If you want to add more characters, you just put the character in @specials. Everything else happens for you.
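If installing a CPAN module is not an option, a rough core-only sketch of the same idea follows. It swaps Regexp::Assemble for a plain join over quotemeta, so the pattern is a simple alternation rather than the optimized one shown above. The sub name backslash_specials is hypothetical.

```perl
use strict;
use warnings;

my @specials = ('_', '$', '{', '}', '#', '%', '&');

# quotemeta backslashes every regex metacharacter, so each item
# matches itself literally inside the alternation.
my $alternation = join '|', map { quotemeta } @specials;
my $re = qr/$alternation/;

# Hypothetical helper: put a backslash in front of every special character.
sub backslash_specials {
    my ($text) = @_;
    $text =~ s/($re)/\\$1/g;
    return $text;
}

print backslash_specials('There are $100 dollars'), "\n";
```

As with the module version, adding a character to @specials is the only change needed to escape something new.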

1

$ has a special meaning in a regex, namely "end of string". You would be better off with something like this:

# backslash-escape each special character and join them into one string
my $chars = join '', map { "\\$_" } keys %special_characters;
$string =~ s/([$chars])/$special_characters{$1}/g;

Also, Perl treats "$" inside a double-quoted string as the start of a variable, so it is safer to write '$' (single quotes mean no interpolation).
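Putting both points together, a self-contained sketch might look like this. It uses single-quoted keys and values, and it builds the character class with quotemeta instead of a hand-written backslash, which is a substitution made for this example. The sub name escape_in_one_pass is hypothetical.

```perl
use strict;
use warnings;

# Single quotes: '$' is a literal dollar sign and '\\' is one backslash.
my %special_characters = (
    '_' => '\\_',
    '$' => '\\$',
    '{' => '\\{',
    '}' => '\\}',
    '#' => '\\#',
    '%' => '\\%',
    '&' => '\\&',
);

# Contents of a character class that matches any one of the keys.
my $chars = join '', map { quotemeta } keys %special_characters;

# Hypothetical helper: one substitution pass instead of one per character.
sub escape_in_one_pass {
    my ($string) = @_;
    $string =~ s/([$chars])/$special_characters{$1}/g;
    return $string;
}

print escape_in_one_pass('$foobar'), "\n";
```

Because every character is handled in a single pass, a replacement can never be matched again by a later substitution.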

UPDATE: Sorry, I was writing this in a hurry => too many edits :(

1 Comment

Thanks for your feedback, your solution looks really fancy! However I am bound to use the easier code (teamwork)... Thanks for the heads up on the single quotes
