JS string replace only replacing every other occurrence

Question

I have the following JS:

"a a a a".replace(/(^|\s)a(\s|$)/g, '$1')

I expect the result to be '', but am instead getting 'a a'. Can anyone explain to me what I am doing wrong?

Clarification: What I am trying to do is remove all occurrences of 'a' that are surrounded by whitespace (i.e. a whole token).

@hwnd No, since that will also match 'a-a'

– Baruch, Commented Dec 23, 2014 at 21:52
How about 'a a a a'.replace(/(?:^|\s)a(?=\s|$)/g, '');?

– rojo, Commented Dec 24, 2014 at 1:14

Hacketo · Accepted Answer · 2014-12-23 22:48:09Z

This happens because the regex /(^|\s)a(\s|$)/g consumes the character before and the character after each a as part of the match.

In the string "a a a a" the regex matches:

"a " (the first a); the rest of the string to check is then "a a a"$, but its first a is no longer at the start of the string and has no space before it, so it is skipped
" a " (the third a); the rest is then "a"$, which does not match because there is no space before it
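
The two matches described above can be listed directly. This is a minimal sketch using String.prototype.matchAll; the variable names are only illustrative:

```javascript
// List every match of the question's regex in "a a a a",
// together with the index where each match starts.
const re = /(^|\s)a(\s|$)/g;
const found = [..."a a a a".matchAll(re)].map(m => ({ index: m.index, text: m[0] }));

console.log(found);
// → [ { index: 0, text: 'a ' }, { index: 3, text: ' a ' } ]
```

Only the first and third a are found, because each match swallows the space that the following a would need in front of it.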

Edit: A slightly tricky approach that works without a regex:

var a = "a a a a";

// Handle the leading case "a "
var startI = a.indexOf("a ");
if (startI === 0){
    // If another "a" comes next, keep the space so that token still has one before it
    var off = a.charAt(startI + 2) !== "a" ? 2 : 1;
    a = a.slice(startI + off);
}

// Handle the middle case " a "
var iOf = -1;
while ((iOf = a.indexOf(" a ")) > -1){
    var off = a.charAt(iOf + 3) !== "a" ? 3 : 2; // same test as above
    a = a.slice(0, iOf) + a.slice(iOf + off);
}

// Handle the trailing case " a" (search from the end, and only if it is present)
var endI = a.lastIndexOf(" a");
if (endI > -1 && endI === a.length - 2){
    a = a.slice(0, endI);
}

a; // ""

Johan Karlsson · Accepted Answer · 2014-12-23 23:15:59Z

2

First, "a " matches. The search then continues against the remaining "a a a": the first a there is skipped, because nothing the regex accepts precedes it, and then " a " matches. Finally the search continues against "a", which does not match.

The first match is replaced with $1, the empty start-of-string match => ""
Then comes the "a" that didn't match => "a"
The second match is replaced with $1, the captured space => " "
Then comes the "a" that didn't match => "a"

The result will be "a a".
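
To see how those pieces add up, here is a small sketch that swaps the '$1' replacement string for a callback returning the same capture group. The pieces array is only there for illustration:

```javascript
// Record what each match is and what it gets replaced with.
const pieces = [];
const result = "a a a a".replace(/(^|\s)a(\s|$)/g, (match, before) => {
  pieces.push({ match: match, replacedWith: before });
  return before; // same effect as the '$1' replacement string
});

console.log(pieces);
// → [ { match: 'a ', replacedWith: '' }, { match: ' a ', replacedWith: ' ' } ]
console.log(JSON.stringify(result));
// → "a a"
```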

To get your desired result you can do this:

"a a a a".replace(/(?:\s+a(?=\s))+\s+|^a\s+(?=[^a]|$|a\S)|^a|\s*a$/g, '')

edited Dec 23, 2014 at 23:15

answered Dec 23, 2014 at 21:42


4 Comments

Baruch Over a year ago

Why does the second "a" get skipped? Also, your solution is not correct, as it changes 'ba a' to 'b'

Bret Copeland Over a year ago

@baruch run console.log('"' + "ba a".replace(/(^|\s+)a(?=\s|$)/g, '') + '"'); and you'll see that it does, in fact output "ba".

Hacketo Over a year ago

This does not remove all the spaces around 'a'. Try it with some other text around 'a', like this: console.log('"' + "a a a a fdsds a".replace(/(^|\s+)a(?=\s|$)/g, '') + '"');

Johan Karlsson Over a year ago

@baruch I've updated my regex now, check it again. The second a gets skipped because it doesn't match your regex: the second time, it matches against "a a a"$ (without the beginning of the string).

Bret Copeland · Accepted Answer · 2014-12-23 22:40:41Z

As others have tried to point out, the issue is that the regex consumes the surrounding spaces as part of the match. Here's a [hopefully] more straightforward explanation of why that regex doesn't work as you expect:

First let's break down the regex. It says: match a space or the start of the string, followed by an 'a', followed by a space or the end of the string.

Now let's apply it to the string. I've added character indexes beneath the string to make things easier to talk about:

a a a a
0123456

The regex looks at the character at index 0 and finds an 'a' there, followed by a space at index 1. This is a match because it is the start of the string, followed by an 'a', followed by a space. The length of our match is 2 (the 'a' and the space), so we consume two characters and start our next search at index 2.

Character 2 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.

Character 3 is a space, followed by an 'a', followed by another space, which is a match for our regex. We replace it with '$1' (the captured leading space), consume the length of the match (3 characters: " a ") and move on to index 6.

Character 6 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.

Now we're at the end of the string, so we're done.
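
The walk-through above can be reproduced with RegExp.prototype.exec, which exposes where the next search starts through lastIndex. This is a sketch for illustration only; the steps array is not part of the original answer:

```javascript
// Step through the global regex by hand and record where each search resumes.
const re = /(^|\s)a(\s|$)/g;
const text = "a a a a";
const steps = [];
let m;
while ((m = re.exec(text)) !== null) {
  steps.push({ start: m.index, matched: m[0], resumeAt: re.lastIndex });
}

console.log(steps);
// → [ { start: 0, matched: 'a ', resumeAt: 2 },
//     { start: 3, matched: ' a ', resumeAt: 6 } ]
```

The search resumes at index 2 and then at index 6, which are exactly the two 'a' characters that have already lost their leading space to the previous match.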

The reason why the regex @caeth suggested (/(^|\s+)a(?=\s|$)/g) works is the ?= lookahead assertion. From the MDN RegExp documentation:

Matches x only if x is followed by y. For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match results.

So, in this case, the ?= lookahead checks whether the following character is a space (or the end of the string), without actually consuming that character.
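
A short comparison makes the difference visible. This sketch lists the matched text for the consuming version and for the lookahead version of the pattern; both use a non-capturing group at the front so that only the trailing part differs:

```javascript
const text = "a a a a";

// Trailing separator is consumed: the next token loses its leading space.
const consuming = [...text.matchAll(/(?:^|\s)a(\s|$)/g)].map(m => m[0]);

// Trailing separator is only checked by the lookahead, not consumed.
const lookahead = [...text.matchAll(/(?:^|\s)a(?=\s|$)/g)].map(m => m[0]);

console.log(consuming); // → [ 'a ', ' a ' ]
console.log(lookahead); // → [ 'a', ' a', ' a', ' a' ]
```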

Accepted for best explanation. The regex I used in the end is /(?:^|\s)a(?=\s|$)/g, replace by ''
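
For reuse with tokens other than 'a', that final regex could be wrapped in a helper along these lines. Note that removeToken and escapeRegExp are hypothetical names introduced here, not something from the thread, and the escaping step is an added assumption so that tokens containing regex metacharacters are treated literally:

```javascript
// Escape regex metacharacters so the token is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove every whole-token occurrence of `token`, together with the
// whitespace character in front of it (if any).
function removeToken(text, token) {
  const re = new RegExp("(?:^|\\s)" + escapeRegExp(token) + "(?=\\s|$)", "g");
  return text.replace(re, "");
}

console.log(JSON.stringify(removeToken("a a a a", "a"))); // → ""
console.log(JSON.stringify(removeToken("ba a", "a")));    // → "ba"
console.log(JSON.stringify(removeToken("a-a a", "a")));   // → "a-a"
console.log(JSON.stringify(removeToken("a b a", "a")));   // → " b"
```

As the last call shows, a token removed from the very start of the string leaves the following space behind, so a trim() may be wanted depending on the use case.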

vks · Accepted Answer · 2014-12-24 04:53:40Z

1

(^|\s)a(?=\s|$)

Try this. Replace by $1. See demo.

https://regex101.com/r/gQ3kS4/3
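
As a quick check of what the choice of replacement does with this regex, the sketch below compares '$1' with an empty string. With '$1' the captured whitespace is put back, so the separators stay in the output:

```javascript
const re = /(^|\s)a(?=\s|$)/g;

// Replacing by $1 keeps the whitespace that preceded each removed token.
const keepSeparator = "a a a a".replace(re, "$1");
// Replacing by '' drops that whitespace as well.
const dropSeparator = "a a a a".replace(re, "");

console.log(JSON.stringify(keepSeparator)); // → "   " (three spaces)
console.log(JSON.stringify(dropSeparator)); // → ""
console.log(JSON.stringify("b a c".replace(re, "$1"))); // → "b  c" (two spaces)
console.log(JSON.stringify("b a c".replace(re, "")));   // → "b c"
```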

answered Dec 24, 2014 at 4:53


jparaya · Accepted Answer · 2014-12-23 21:35:42Z

0

Use this instead:

"a a a a".replace(/(^|\s*)a(\s|$)/g, '$1')

With the "*", this replaces all the "a" occurrences.
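
One thing worth checking before using this pattern: because \s* can match zero characters, the a no longer has to be preceded by whitespace. A small sketch of the behaviour on two inputs:

```javascript
const re = /(^|\s*)a(\s|$)/g;

// All four tokens are removed here.
console.log(JSON.stringify("a a a a".replace(re, "$1"))); // → ""

// But an 'a' at the end of a longer word is removed too.
console.log(JSON.stringify("ba a".replace(re, "$1")));    // → "b"
```

So this variant does not satisfy the whole-token requirement from the question's clarification.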

Greetings

answered Dec 23, 2014 at 21:35


nhahtdh · Accepted Answer · 2014-12-24 05:00:04Z

0

Or you can just split the string up, filter it and glue it back:

"a ba sl lf a df a a df r a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> "ba sl lf df df r"

"a a a a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> ""

Or in ECMAScript 6:

"a ba sl lf a df a a df r a".split(/\s+/).filter(x => x != "a").join(" ")
>>> "ba sl lf df df r"

"a a a a".split(/\s+/).filter(x => x != "a").join(" ")
>>> ""

I assume that there are no leading or trailing spaces. You can change the filter to x && x != 'a' if you want to remove that assumption.
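
The effect of that extra x && check can be sketched as follows. The function names are made up for this example:

```javascript
// Filter exactly as in the answer: assumes no leading/trailing whitespace.
function removeTokenBasic(text, token) {
  return text.split(/\s+/).filter(x => x != token).join(" ");
}

// With the extra truthiness check, the empty strings that split() produces
// for leading/trailing whitespace are dropped as well.
function removeTokenSafe(text, token) {
  return text.split(/\s+/).filter(x => x && x != token).join(" ");
}

console.log(JSON.stringify(removeTokenBasic(" a ba  a c ", "a"))); // → " ba c "
console.log(JSON.stringify(removeTokenSafe(" a ba  a c ", "a")));  // → "ba c"
```

Note that either way, runs of whitespace (including tabs and newlines) between the remaining tokens are collapsed into single spaces by the join.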

answered Dec 24, 2014 at 5:00
