3

I have the following JS:

"a a a a".replace(/(^|\s)a(\s|$)/g, '$1')

I expect the result to be '', but am instead getting 'a a'. Can anyone explain to me what I am doing wrong?

Clarification: What I am trying to do is remove all occurrences of 'a' that are surrounded by whitespace (i.e. a whole token).

2
  • @hwnd No, since that will also match 'a-a' Commented Dec 23, 2014 at 21:52
  • How about 'a a a a'.replace(/(?:^|\s)a(?=\s|$)/g, '');? Commented Dec 24, 2014 at 1:14

6 Answers

2

This happens because the regex /(^|\s)a(\s|$)/g matches, and therefore consumes, the character before and the character after each a.

In the string "a a a a" the regex matches:

  • "a " (the first a); the rest of the string to check becomes "a a a", but its first character is no longer at the start of the string and there is no space before it
  • " a " (the third a); the rest becomes "a", which does not match because there is no space before it
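The two matches listed above can be observed directly. This is a minimal sketch using RegExp.prototype.exec in a loop; the variable names (re, input, matches) are only illustrative and are not from the question.

```javascript
// Record every match of the original pattern, where it starts,
// and where the next search resumes (re.lastIndex).
var re = /(^|\s)a(\s|$)/g;
var input = "a a a a";
var matches = [];
var m;
while ((m = re.exec(input)) !== null) {
    matches.push({ index: m.index, text: m[0], nextSearchFrom: re.lastIndex });
}
console.log(matches);
// Two matches: "a " at index 0 and " a " at index 3.
// The a's at index 2 and index 6 are never matched.
```

Because each match swallows the space that follows it, the next a has no space (or start of string) left in front of it to match against.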

Edit: A slightly tricky but working approach (without regex):

var a = "a a a a";

// Handle beginning case 'a '
var startI = a.indexOf("a ");
if (startI === 0){
    var off = a.charAt(startI + 2) !== "a" ? 2 : 1; // if another "a" comes next, keep the space in front of it
    a = a.slice(startI + off);
}

// Handle middle case ' a '
var iOf = -1;
while ((iOf = a.indexOf(" a ")) > -1){
    var off = a.charAt(iOf + 3) !== "a" ? 3 : 2; // same here
    a = a.slice(0, iOf) + a.slice(iOf+off, a.length);
}

// Handle end case ' a'
var endI = a.lastIndexOf(" a");
if (endI > -1 && endI === a.length - 2){
    a = a.slice(0, endI);
}

a; // ""

2

First, "a " matches. Then it tries to match against the remaining "a a a": it skips the first a and matches " a ". Finally, it tries to match against the remaining "a", which does not match.

  1. The first match is replaced with $1, which is the empty start of the string => ""
  2. Then we have an "a" that didn't match => "a"
  3. The second match is replaced with $1, which is the captured space => " "
  4. Then we have an "a" that didn't match => "a"

The result will be "a a".

To get your desired result you can do this:

"a a a a".replace(/(?:\s+a(?=\s))+\s+|^a\s+(?=[^a]|$|a\S)|^a|\s*a$/g, '')

4 Comments

Why does the second "a" get skipped? Also, your solution is not correct, as it changes 'ba a' to 'b'
@baruch run console.log('"' + "ba a".replace(/(^|\s+)a(?=\s|$)/g, '') + '"'); and you'll see that it does, in fact, output "ba".
This does not remove all spaces around 'a'; try it with some other text around 'a', like this: console.log('"' + "a a a a fdsds a".replace(/(^|\s+)a(?=\s|$)/g, '') + '"');
@baruch I've updated my regex now, check it again. The second a gets skipped because it doesn't match your regex: the second time, it matches against "a a a"$ (without the beginning of the string).
1

As others have tried to point out, the issue is that the regex consumes the surrounding spaces as part of the match. Here's a [hopefully] more straightforward explanation of why that regex doesn't work as you expect:

First, let's break down the regex. It says: match a space or the start of the string, followed by an 'a', followed by a space or the end of the string.

Now let's apply it to the string. I've added character indexes beneath the string to make things easier to talk about:

a a a a
0123456

The regex looks at the character at index 0 and finds an 'a' there, followed by a space at index 1. This is a match because it is the start of the string, followed by an 'a', followed by a space. The length of our match is 2 (the 'a' and the space), so we consume two characters and start our next search at index 2.

Character 2 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.

Character 3 is a space, followed by an 'a', followed by another space, which is a match for our regex. We replace it with $1 (the captured leading space), consume the length of the match (3 characters: " a ") and move on to index 6.

Character 6 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.

Now we're at the end of the string, so we're done.
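The walkthrough above can be reproduced by passing a function to replace instead of '$1'. This is only an illustrative sketch: the log array and the parameter names are assumptions of mine, and the callback returns the first capture group so that it behaves like the '$1' replacement from the question.

```javascript
// Trace each replacement made by the original regex.
var log = [];
var result = "a a a a".replace(/(^|\s)a(\s|$)/g, function (match, before, after, offset) {
    // 'before' is capture group 1, which is what '$1' would insert.
    log.push({ offset: offset, match: match, replacement: before });
    return before;
});
console.log(log);    // one entry at offset 0 ("a ") and one at offset 3 (" a ")
console.log(result); // "a a"
```

The single space in the result is the captured space from the second match; the two remaining a's are the characters at index 2 and index 6 that were never part of a match.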

The reason the regex @caeth suggested (/(^|\s+)a(?=\s|$)/g) works is the ?= lookahead assertion. From the MDN RegExp documentation:

Matches x only if x is followed by y. For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match results.

So, in this case, the ?= lookahead checks whether the following character is a space (or whether we are at the end of the string), without actually consuming that character.
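To make the difference concrete, here is a small side-by-side sketch. The variable names are hypothetical; the lookahead pattern is the one from the comment under the question, and both calls replace with an empty string so that only the consuming behaviour differs.

```javascript
// Trailing group consumes the following space, so every other "a" is skipped.
var consuming = "a a a a".replace(/(^|\s)a(\s|$)/g, "");

// Lookahead only peeks at the following space, leaving it for the next match.
var lookahead = "a a a a".replace(/(?:^|\s)a(?=\s|$)/g, "");

// Neither "a-a" nor the "a" inside "ba" is a whole token, so they are kept.
var hyphenated = "a-a".replace(/(?:^|\s)a(?=\s|$)/g, "");
var embedded = "ba a".replace(/(?:^|\s)a(?=\s|$)/g, "");

console.log([consuming, lookahead, hyphenated, embedded]);
// ["aa", "", "a-a", "ba"]
```

Note that this pattern removes the space before each token but not the one after it, so an input such as "a x" would be left with a leading space.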

1 Comment

Accepted for best explanation. The regex I used in the end is /(?:^|\s)a(?=\s|$)/g, replace by ''
1
(^|\s)a(?=\s|$)

Try this. Replace by $1. See demo.

https://regex101.com/r/gQ3kS4/3

Comments

0

Use this instead:

"a a a a".replace(/(^|\s*)a(\s|$)/g, '$1')

With the "*", this replaces all the "a" occurrences.

Greetings

Comments

0

Or you can just split the string up, filter it and glue it back:

"a ba sl lf a df a a df r a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> "ba sl lf df df r"

"a a a a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> ""

Or in ECMAScript 6:

"a ba sl lf a df a a df r a".split(/\s+/).filter(x => x != "a").join(" ")
>>> "ba sl lf df df r"

"a a a a".split(/\s+/).filter(x => x != "a").join(" ")
>>> ""

I assume that there are no leading or trailing spaces. You can change the filter to x && x != 'a' if you want to remove that assumption.
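The same idea can be wrapped in a small helper that also copes with leading and trailing whitespace. This is a sketch only: removeToken is a hypothetical name, and it normalizes every run of whitespace to a single space, which may or may not be what you want.

```javascript
// Remove every whitespace-delimited occurrence of `token` from `text`.
function removeToken(text, token) {
    return text
        .split(/\s+/)
        .filter(function (word) {
            // Drop the empty strings produced by leading/trailing whitespace,
            // and drop words that are exactly the token.
            return word !== "" && word !== token;
        })
        .join(" ");
}

console.log(removeToken("a a a a", "a"));    // ""
console.log(removeToken("  a b a  ", "a"));  // "b"
console.log(removeToken("ba a-a a", "a"));   // "ba a-a"
```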

Comments
