245

Is there a way to retrieve the (starting) character positions inside a string of the results of a regex match() in JavaScript?

13 Answers

321

exec returns an object with an index property:

var match = /bar/.exec("foobar");
if (match) {
    console.log("match found at " + match.index);
}

And for multiple matches:

var re = /bar/g,
    str = "foobarfoobar",
    match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
}
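The loop above can be wrapped into a small helper. This is only a sketch and not part of the original answer: the name findAllIndexes is hypothetical, and it assumes the caller may pass a regex without the g flag, so it works on a copy with the flag added to avoid the endless loop.

```javascript
// Hypothetical helper: collects the start index of every match.
function findAllIndexes(re, str) {
    // Work on a copy so the caller's lastIndex is untouched, and force the g flag;
    // without it exec() restarts from 0 on every call and the loop never ends.
    var flags = re.flags.indexOf("g") === -1 ? re.flags + "g" : re.flags;
    var copy = new RegExp(re.source, flags);
    var indexes = [];
    var match;
    while ((match = copy.exec(str)) !== null) {
        indexes.push(match.index);
        // A zero-width match does not advance lastIndex on its own.
        if (match[0].length === 0) {
            copy.lastIndex++;
        }
    }
    return indexes;
}

console.log(findAllIndexes(/bar/, "foobarfoobar")); // [3, 9]
```

Copying the regex costs an allocation per call, but it keeps the helper free of side effects on a regex that is shared elsewhere.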


19 Comments

Thanks for your help! Can you tell me also how do I find the indexes of multiple matches?
Note: using the re as a variable, and adding the g modifier are both crucial! Otherwise you will get an endless loop.
@OnurYıldırım - here's a jsfiddle of it working...I've tested it all the way back to IE5...works great: jsfiddle.net/6uwn1vof
@JimboJonny, hm well I learned something new. My test case returns undefined. jsfiddle.net/6uwn1vof/2 which is not a search-like example like yours.
@OnurYıldırım - Remove the g flag and it'll work. Since match is a function of the string, not the regex it cannot be stateful like exec, so it only treats it like exec (i.e. has an index property) if you're not looking for a global match...because then statefulness doesn't matter.
81

Here's what I came up with:

// Finds starting and ending positions of quoted text
// in double or single quotes with escape char support like \" \'
var str = "this is a \"quoted\" string as you can 'read'";

var patt = /'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/igm;

var match;
while ((match = patt.exec(str)) !== null) {
  console.log(match.index + ' ' + patt.lastIndex);
}

5 Comments

match.index + match[0].length also works for the end position.
@BeniCherniavsky-Paskin, wouldn't the end position be match.index + match[0].length - 1?
@David, I meant exclusive end position, as taken e.g. by .slice() and .substring(). Inclusive end would be 1 less as you say. (Be careful that inclusive usually means index of last char inside match, unless it's an empty match where it's 1 before match and might be -1 outside the string entirely for empty match at start...)
With patt = /.*/ it goes into an infinite loop. How can we prevent that?
RegExp has to end with 'g' or 'y' flag for 'lastIndex' to be set. If it doesn't work, either your browser is broken or (more likely) you're doing something wrong. See MDN for detailed explanation of this property: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…
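To tie the comments above together, here is a rough sketch that records the start and the exclusive end of each match and also survives empty matches such as those produced by /.*/. The helper name findRanges is made up for illustration, and it assumes the regex passed in already has the g flag.

```javascript
// Hypothetical helper: returns [start, end) pairs, where end is exclusive
// (the convention used by slice() and substring()).
// Assumes re has the g flag; without it the loop would never terminate.
function findRanges(re, str) {
    var ranges = [];
    var match;
    re.lastIndex = 0; // start from the beginning even if the regex was used before
    while ((match = re.exec(str)) !== null) {
        ranges.push([match.index, match.index + match[0].length]);
        if (match[0].length === 0) {
            re.lastIndex++; // an empty match does not advance lastIndex on its own
        }
    }
    return ranges;
}

var text = "this is a \"quoted\" string as you can 'read'";
var quoted = findRanges(/'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/g, text);
// Each pair can be passed straight to slice() to recover the matched text.
console.log(quoted.map(function (r) { return text.slice(r[0], r[1]); }));

console.log(findRanges(/.*/g, "ab")); // [[0, 2], [2, 2]]
```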
37

In modern browsers, you can accomplish this with string.matchAll().

The benefit to this approach vs RegExp.exec() is that it does not rely on the regex being stateful, as in @Gumbo's answer.

let regexp = /bar/g;
let str = 'foobarfoobar';

let matches = [...str.matchAll(regexp)];
matches.forEach((match) => {
    console.log("match found at " + match.index);
});

3 Comments

I had luck using this single-line solution based on matchAll ``` let regexp = /bar/g; let str = 'foobarfoobar'; let matchIndices = Array.from(str.matchAll(regexp)).map(x => x.index); console.log(matchIndices)```
Not sure why you say this approach does not rely on the regex being stateful. I tried your code without the g flag and got an error.
The "g" flag means "global search", i.e. match all occurrences in the string. It doesn't make sense to use str.matchAll() if you're not doing a global search. Hopefully that helps, but I'm not sure what you're trying to do. With my "stateful" comment, I mean that you don't have to use a "while" loop and rely on the internal state of the Regexp object, as in Gumbo's answer, which I linked. Good luck!
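A small check of the two points discussed in these comments, written as a sketch with illustrative variable names: matchAll() leaves the lastIndex of the regex you pass in untouched, and it throws a TypeError when the regex lacks the g flag.

```javascript
var re = /bar/g;
var str = "foobarfoobar";

// matchAll() iterates over an internal copy of the regex,
// so re.lastIndex is still 0 afterwards.
var indexes = Array.from(str.matchAll(re), function (m) { return m.index; });
console.log(indexes, re.lastIndex);

// Without the g flag, matchAll() throws a TypeError instead of looping forever.
var threw = false;
try {
    str.matchAll(/bar/);
} catch (e) {
    threw = e instanceof TypeError;
}
console.log(threw);
```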
28

From developer.mozilla.org docs on the String .match() method:

The returned Array has an extra input property, which contains the original string that was parsed. In addition, it has an index property, which represents the zero-based index of the match in the string.

When dealing with a non-global regex (i.e., no g flag on your regex), the value returned by .match() has an index property...all you have to do is access it.

var index = str.match(/regex/).index;

Here is an example showing it working as well:

var str = 'my string here';

var index = str.match(/here/).index;

console.log(index); // <- 10

I have successfully tested this all the way back to IE5.

4 Comments

This returns an array, not an object with index on it
@BenTaliadoros I'm afraid you're wrong, it is both an array and an object with an index property (see the answer)
Seems so! Not sure what I was thinking all those years ago
Note that if you do str.match(/here/g), with the global flag, match.index will be undefined.
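The three cases mentioned in this thread can be compared side by side. This is just an illustrative sketch; the sample string and variable names are made up.

```javascript
var str = "my string here, right here";

var single = str.match(/here/);   // no g flag: exec-style result with .index
var all = str.match(/here/g);     // g flag: plain array of the matched strings
var none = str.match(/absent/);   // no match: null, so guard before reading .index

console.log(single.index);                    // 10
console.log(all.length, all.index);           // 2 undefined
console.log(none === null ? -1 : none.index); // -1
```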
12

You can use the search method of the String object. This will only work for the first match, but will otherwise do what you describe. For example:

"How are you?".search(/are/);
// 4
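Two details worth knowing, shown here as a quick sketch with made-up variable names: search() returns -1 when nothing matches, and it ignores both the g flag and lastIndex, so repeated calls always report the first match.

```javascript
var re = /are/g;
var first = "How are you?".search(re);
var second = "How are you?".search(re); // same result: search() ignores the g flag and lastIndex
var missing = "How is it?".search(re);  // -1 when nothing matches

console.log(first, second, missing); // 4 4 -1
```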


7

Here is a cool feature I discovered recently. I tried this in the console and it seems to work:

var text = "border-bottom-left-radius";

var newText = text.replace(/-/g,function(match, index){
    return " " + index + " ";
});

Which returned: "border 6 bottom 13 left 18 radius"

So this seems to be what you are looking for.

1 Comment

Just beware that the replacement function also receives the capture groups, so the position is not always "the second argument": it comes right after the last capture group. The function arguments are "full match, group1, group2, ...., index of match, full string matched against" (plus a groups object at the very end when the regex uses named groups).
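One way to pick out the position regardless of how many capture groups the regex has is to take the first numeric argument after the full match, since captures are always strings or undefined. A minimal sketch; the helper name collectOffsets is hypothetical.

```javascript
// Hypothetical helper: gathers match positions through a replace() callback.
function collectOffsets(re, str) {
    var offsets = [];
    str.replace(re, function () {
        // arguments: match, group1..groupN, offset, whole string[, groups object]
        for (var i = 1; i < arguments.length; i++) {
            if (typeof arguments[i] === "number") {
                offsets.push(arguments[i]);
                break;
            }
        }
        return arguments[0]; // return the match unchanged; the result string is discarded
    });
    return offsets;
}

var text = "border-bottom-left-radius";
console.log(collectOffsets(/-/g, text));     // [6, 13, 18]
console.log(collectOffsets(/-(\w)/g, text)); // [6, 13, 18], even with a capture group
```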
4

I'm afraid the previous answers (based on exec) don't work when your regex produces zero-width matches: the loop never terminates. For instance (note: /\b/g is the regex that should find all word boundaries):

var re = /\b/g,
    str = "hello world";
var guard = 10, match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
    if (guard-- < 0) {
      console.error("Infinite loop detected")
      break;
    }
}

One can try to fix this by having the regex match at least 1 character, but this is far from ideal (and means you have to manually add the index at the end of the string):

var re = /\b./g,
    str = "hello world";
var guard = 10, match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
    if (guard-- < 0) {
      console.error("Infinite loop detected")
      break;
    }
}

A better solution (which only works on newer browsers and needs polyfills on older browsers such as IE) is to use String.prototype.matchAll():

var re = /\b/g,
    str = "hello world";
console.log(Array.from(str.matchAll(re)).map(match => match.index))

Explanation:

String.prototype.matchAll() expects a global regex (one with the g, or global, flag set). It then returns an iterator. In order to loop over and map() the iterator, it has to be turned into an array (which is exactly what Array.from() does). Like the result of RegExp.prototype.exec(), the resulting elements have an .index field according to the specification.

See the String.prototype.matchAll() and the Array.from() MDN pages for browser support and polyfill options.


Edit: digging a little deeper in search for a solution supported on all browsers

The problem with RegExp.prototype.exec() is that it updates the lastIndex pointer on the regex, and next time starts searching from the previously found lastIndex.

var re = /l/g,
str = "hello world";
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)

This works great as long as the regex match actually has a width. With a 0-width regex, this pointer does not increase, and so you get your infinite loop. (Note: /(?=l)/g is a lookahead for l; it matches the 0-width string before an l. So it correctly goes to index 2 on the first call of exec(), and then stays there.)

var re = /(?=l)/g,
str = "hello world";
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)
re.exec(str)
console.log(re.lastIndex)

The solution (which is less nice than matchAll(), but should work on all browsers) therefore is to manually increase lastIndex if the match width is 0 (which may be checked in different ways):

var re = /\b/g,
    str = "hello world",
    match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);

    // alternative: if (match.index == re.lastIndex) {
    if (match[0].length == 0) {
      // we need to increase lastIndex -- this location was already matched,
      // we don't want to match it again (and get into an infinite loop)
      re.lastIndex++
    }
}


4

I had luck using this single-line solution based on matchAll (my use case needs an array of string positions)

let regexp = /bar/g;
let str = 'foobarfoobar';

let matchIndices = Array.from(str.matchAll(regexp)).map(x => x.index);

console.log(matchIndices)

output: [3, 9]

1 Comment

love it, just replaced a while/exec combo with your one-liner 🥳 thank you
2

This method returns an array of the 0-based positions, if any, of the input word inside the String object:

String.prototype.matching_positions = function( _word, _case_sensitive, _whole_words, _multiline )
{
   /* besides '_word', the other params are flags (0|1) */
   /* "i" means ignore case, so add it only when the search is NOT case-sensitive */
   var _match_pattern = "g"+(_case_sensitive?"":"i")+(_multiline?"m":"") ;
   var _bound = _whole_words ? "\\b" : "" ;
   var _re = new RegExp( _bound+_word+_bound, _match_pattern );
   var _pos = [], _chunk ;

   while( true )
   {
      _chunk = _re.exec( this ) ;
      if ( _chunk == null ) break ;
      _pos.push( _chunk['index'] ) ;
      /* resume one char after the match start, so overlapping matches are found too */
      _re.lastIndex = _chunk['index']+1 ;
   }

   return _pos ;
}

Now try

var _sentence = "What do doers want ? What do doers need ?" ;
var _word = "do" ;
console.log( _sentence.matching_positions( _word, 1, 0, 0 ) );
console.log( _sentence.matching_positions( _word, 1, 1, 0 ) );

You can also input regular expressions:

var _second = "z^2+2z-1" ;
console.log( _second.matching_positions( "[0-9]z+", 0, 0, 0 ) );

Here one gets the position index of the linear term.


2
var str = "The rain in SPAIN stays mainly in the plain";

function searchIndex(str, searchValue, isCaseSensitive) {
  // the 'i' flag ignores case, so add it only when the search is NOT case-sensitive
  var modifiers = isCaseSensitive ? 'g' : 'gi';
  var regExpValue = new RegExp(searchValue, modifiers);
  var matches = [];
  var startIndex = 0;
  // match() returns null when nothing matches; fall back to an empty array
  var arr = str.match(regExpValue) || [];

  [].forEach.call(arr, function(element) {
    startIndex = str.indexOf(element, startIndex);
    matches.push(startIndex++);
  });

  return matches;
}

console.log(searchIndex(str, 'ain', true));

2 Comments

This is incorrect. str.indexOf here just finds the next occurrence of the text captured by the match, which is not necessarily the match. JS regex supports conditions on text outside of the capture with lookahead. For instance searchIndex("foobarfoobaz", "foo(?=baz)", true) should give [6], not [0].
why ` [].forEach.call(arr, function(element)` why not arr.forEach or arr.map
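For the lookahead case raised in the first comment, the positions can be read from the match objects themselves instead of being re-derived with indexOf. A possible sketch, assuming matchAll() is available; the name searchIndexByMatch is hypothetical and the parameters mirror the answer's function.

```javascript
// Hypothetical variant: takes the positions reported by the regex engine itself.
function searchIndexByMatch(str, searchValue, isCaseSensitive) {
    // "i" ignores case, so it is added only for case-insensitive searches
    var re = new RegExp(searchValue, isCaseSensitive ? "g" : "gi");
    return Array.from(str.matchAll(re), function (m) { return m.index; });
}

console.log(searchIndexByMatch("foobarfoobaz", "foo(?=baz)", true)); // [6]
console.log(searchIndexByMatch("The rain in SPAIN stays mainly in the plain", "ain", false)); // [5, 14, 25, 40]
```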
0
function trimRegex(str, regex){
    // regex must match a single character that should be kept
    var first = str.match(regex);
    if (!first) return '';
    // match against the reversed string to find how many characters to drop at the end
    var last = str.split('').reverse().join('').match(regex);
    return str.slice(first.index, str.length - last.index);
}

let test = '||ab||cd||';
console.log(trimRegex(test, /[^|]/)); //output: ab||cd

or

function trimChar(str, trim){
    let regex = new RegExp('[^'+trim+']');
    let first = str.match(regex);
    if (!first) return '';
    let last = str.split('').reverse().join('').match(regex);
    return str.slice(first.index, str.length - last.index);
}

let test = '||ab||cd||';
console.log(trimChar(test, '|')); //output: ab||cd


0

Use the regex d flag and the indices property:

let str = 'ab1c de fgh23 ij klmn456';
for (let match of str.matchAll (/[a-z]+(\d+)/dg))
  console.log (JSON.stringify (match),
    JSON.stringify (match.indices));

/* Output (formatted for better readability):

'["ab1","1"]'       "[[0,3],[2,3]]"
'["fgh23","23"]'    "[[8,13],[11,13]]"
'["klmn456","456"]' "[[17,24],[21,24]]"
*/
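With named capture groups, the same d flag also fills in match.indices.groups, which may read better than numeric group positions. A small sketch, assuming a runtime that supports the d flag; the group name num is made up for illustration.

```javascript
var str = "ab1c de fgh23 ij klmn456";
var spans = [];

for (var match of str.matchAll(/[a-z]+(?<num>\d+)/dg)) {
    // indices.groups.num holds the [start, end) pair of the named group
    spans.push(match.indices.groups.num);
}

console.log(JSON.stringify(spans)); // [[2,3],[11,13],[21,24]]
```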

Note: I tried to insert the code as a "Stack Snippet". It runs in the editor but not here in the answer. Removing the snippet from the answer was also a problem: I had to discard the answer and paste it again.


-1

var str = 'my string here';

var index = str.match(/hre/).index;

alert(index); // <- 10

1 Comment

So just like in this answer from 4 years ago (which, unlike yours, works)
