
Here is a regular expression from the book "JavaScript: The Good Parts":

//Make a regular expression object that matches a javascript string.
var my_regexp = new RegExp("\"(?:\\\\.|[^\\\\\\\"])*\"", 'g');

What does the [^\\\\\\\"] expression match here?

3
  • What is the question? Do we have to solve it now? :/ Commented Aug 8, 2013 at 10:31
  • @ItiTyagi The given expression is a regex that matches a JavaScript string. I just want to know what [^\\\\\\\"] means in it. Commented Aug 8, 2013 at 10:34
  • Actually, the extra backslashes are there to escape special characters; [^\\\\\\\"] really means "not a \ or a "". Commented Aug 8, 2013 at 10:41

2 Answers

3

In JavaScript, strings are surrounded by " (or ', which this regex doesn't support) and \ is used to escape characters that would otherwise have a different meaning.

Now, [^\\\\\\\"] is a character class for characters that aren't \ or ". However, because we're using a string literal to define the regular expression, the " needs escaping, and because \ has a special meaning within both strings and regular expressions, each \ needs escaping too.

\"        starting characters
\\"       escape `\` for regex
\\\"      escape `"` for regex
\\\\\\"   escape `\` for string
\\\\\\\"  escape `"` for string
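As a quick check of the escaping steps, here is a minimal sketch (the variable names are mine, not from the book) that builds the character class from the string literal and tests which characters it accepts:

```javascript
// The string literal from the question. After string parsing it holds
// the seven characters  [ ^ \ \ \ " ]
const classSource = "[^\\\\\\\"]";
console.log(classSource);        // the pattern text the regex engine sees
console.log(classSource.length); // 7

// Anchor the class so that it has to match exactly one character.
const charClass = new RegExp("^" + classSource + "$");
console.log(charClass.test("a"));  // true: an ordinary character
console.log(charClass.test("\\")); // false: a single backslash
console.log(charClass.test('"'));  // false: a double quote
```

Printing the string before handing it to RegExp is usually the fastest way to see how many backslashes actually survive the string literal.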

It's simpler if you use ' for the string, or a regex literal. The following are all the same.

new RegExp("\"(?:\\.|[^\\\\\\\"])*\"", "g");
new RegExp('"(?:\\.|[^\\\\\\"])*"', 'g');
/"(?:\.|[^\\\"])*"/g

In fact, " doesn't have a special meaning in a regular expression, so escaping it was not necessary.

/"(?:\.|[^\\"])*"/g
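If you want to convince yourself that these spellings really are interchangeable, a small sketch like this (the sample text is made up for illustration) runs all four against the same input:

```javascript
// Four spellings of the same pattern (the two-backslash version shown here).
const forms = [
  new RegExp("\"(?:\\.|[^\\\\\\\"])*\"", "g"),
  new RegExp('"(?:\\.|[^\\\\\\"])*"', "g"),
  /"(?:\.|[^\\\"])*"/g,
  /"(?:\.|[^\\"])*"/g,
];

// Sample text:  say "hi", then "a.b" and "x\y"
const sample = 'say "hi", then "a.b" and "x\\y"';

const results = forms.map((re) => sample.match(re));
console.log(results); // each entry is [ '"hi"', '"a.b"' ]
```

The last quoted piece is skipped by every form because it contains a backslash, which the character class excludes.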

Also note that \. matches only a literal ., which is neither \ nor " and so is already covered by the character class; that makes the | alternative pointless. I would guess this is an error, and that it's intended to be \\. (that is, a \ followed by any character). That would require four \ in the string-literal version, not two. Without this correction, the expression won't match strings like "ab\\c".
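To make the difference concrete, here is a small sketch (variable names are mine) comparing the pattern with \. against the one with \\. on text that contains escape sequences:

```javascript
const asPosted  = /"(?:\.|[^\\"])*"/g;  // \. is only a literal dot
const corrected = /"(?:\\.|[^\\"])*"/g; // \\. is a backslash plus any character

// String.raw keeps the backslashes as typed, so these are the source texts
//   "ab\\c"   and   "say \"hi\""
const withBackslash = String.raw`"ab\\c"`;
const withQuotes = String.raw`"say \"hi\""`;

console.log(withBackslash.match(asPosted));  // null
console.log(withBackslash.match(corrected)); // one match: the whole text
console.log(withQuotes.match(corrected));    // one match: the whole text
```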

If we want to support ' as well then things are going to get very complicated, and we probably should just use a simple char-by-char parser, rather than a regular expression.
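For what it's worth, a char-by-char scanner could look roughly like the sketch below. findStringLiterals is a hypothetical helper of my own, not something from the book, and it makes simplifying assumptions: it ignores comments, template literals and line breaks inside strings, and it simply stops at an unterminated string.

```javascript
// Hypothetical helper: returns every quoted string literal found in `source`,
// accepting either ' or " as the delimiter.
function findStringLiterals(source) {
  const found = [];
  let i = 0;
  while (i < source.length) {
    const quote = source[i];
    if (quote !== '"' && quote !== "'") {
      i++; // not the start of a string, move on
      continue;
    }
    let j = i + 1;
    let closed = false;
    while (j < source.length) {
      if (source[j] === "\\") {
        j += 2; // skip the backslash and the character it escapes
      } else if (source[j] === quote) {
        closed = true; // found the matching, unescaped closing quote
        break;
      } else {
        j++;
      }
    }
    if (!closed) break; // unterminated string: stop scanning (an assumption)
    found.push(source.slice(i, j + 1));
    i = j + 1;
  }
  return found;
}

// Source text:  a = "it's"; b = 'say \'hi\''; c = "x\\"
const input = String.raw`a = "it's"; b = 'say \'hi\''; c = "x\\"`;
console.log(findStringLiterals(input)); // three literals, quotes included
```

Remembering which quote opened the string is what lets "it's" pass through untouched, and that is the part that gets awkward in a single regular expression.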

RegExp Reference


1

Unwrapping var my_regexp = new RegExp("\"(?:\\.|[^\\\\\\\"])*\"",'g');:

1: new RegExp("\"(?:\\.|[^\\\\\\\"])*\"",'g');
2: /"(?:\.|[^\\\"])*"/g
               ^--- this backslash is not really needed, but does not hurt

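It matches a ", followed by any number of characters that are each either a literal . or anything other than \ and ", followed by a closing ". Also, since the group is written as (?:...), it is non-capturing: it only groups the alternatives so that * can repeat them, and nothing is stored as a separate capture.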

For example, in the string I "li.ke" donuts. I "h\ate" potatoes. it will match "li.ke" but not "h\ate", because of the \.
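The same example as a runnable sketch (variable names are mine), using the two-backslash pattern discussed in this answer:

```javascript
const re = /"(?:\.|[^\\"])*"/g;

// String.raw keeps the backslash in h\ate exactly as typed.
const sentence = String.raw`I "li.ke" donuts. I "h\ate" potatoes.`;
console.log(sentence.match(re)); // [ '"li.ke"' ]

// The group is non-capturing, so exec() returns only the full match at index 0.
const single = /"(?:\.|[^\\"])*"/.exec(sentence);
console.log(single.length); // 1
console.log(single[0]);     // "li.ke"
```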

3 Comments

Note that the poster copy-pasted the expression incorrectly; it should be \\\\. in the string version.
@RobinsGupta: in my copy of the book there are four slashes for the string version. Two slashes don't make sense.
@thg435 Thanks for the edit. Actually, I had an old version of the book.
