
For example:

let a = 'caaab'

I need to count the occurrences of 'aa', including overlapping ones. When I write

a.match(/aa/g)

it returns ['aa'], which is not what I want, because the second, overlapping 'aa' is skipped.

Is there another expression that returns ['aa', 'aa']?

  • If you specifically want to search only for aa occurrences, you could try a.match(/(a)(?=\1)/g).map(ch => ch + ch). If you need to search for any double character, see my answer below. My answer also explains the regex in more detail. Commented Jan 28, 2021 at 8:02

2 Answers

2

I don't know if it is possible to achieve this using only regular expressions, but you can write a function such as:

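function countOccurencies(text, pattern) {
  let index = 0; // start of the part of the text that is still unsearched
  let count = 0;
  let find;
  // search() returns the position of the first match in the remaining text, or -1
  while((find = text.slice(index).search(pattern)) >= 0) {
    // move just one character past the match start, so overlapping matches are counted
    index += find + 1;
    count++;
  }
  return count;
}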

// You can use it like this:
let occ;
occ = countOccurencies('caaab', /aa/g); // 2
occ = countOccurencies('caaab', 'aa');  // 2
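
When the pattern is a plain string rather than a regex, the same "advance by one character" idea can be sketched with String.prototype.indexOf, which avoids treating special characters in the string as regex syntax. The helper name countOverlapping is hypothetical and not part of the answer above; the empty-string guard is an assumption about the desired behavior.

```javascript
// Hypothetical helper (not from the original answer): counts overlapping
// occurrences of a plain string without converting it to a regex.
function countOverlapping(text, needle) {
  // Assumption: an empty needle counts as zero occurrences.
  // Without this guard the loop below would never terminate.
  if (needle === '') return 0;

  let count = 0;
  let index = text.indexOf(needle);
  while (index !== -1) {
    count++;
    // Continue one character after the match start, so that
    // overlapping occurrences are found as well.
    index = text.indexOf(needle, index + 1);
  }
  return count;
}

console.log(countOverlapping('caaab', 'aa')); // 2
console.log(countOverlapping('caaab', 'x'));  // 0
```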

2 Comments

Thanks DDomen. I already had a function-based answer; I was looking for one that uses a regular expression.
I've found a solution to your specific problem using a regex lookahead assertion: a.match(/(?=(aa))/g). The result is an array of empty strings whose length equals the number of aa occurrences. @PavanBhat
0

You could use the following expression: /(.)(?=\1)/g

Short explanation:

  • (.) is a capture group; it captures any character except newlines
  • \1 is a backreference to the first capture group (the (.))
  • (?=\1) creates a positive lookahead for that backreference \1 (without consuming characters)
  • The g option at the end of the regex indicates that all matches should be returned instead of just the first one.

The final regex match result will contain all characters that are followed by that same character.

let regex = /(.)(?=\1)/g
let a1 = 'caaab'
let r1 = a1.match(regex) // [ 'a', 'a' ]
let a2 = 'caaabbcdd'
let r2 = a2.match(regex) // [ 'a', 'a', 'b', 'd' ]

You will not get doubled characters in your regex matches, only single characters. But they can be easily doubled by using the Array.prototype.map function, because you know for sure that they are always followed by that same character in the search string.

let regex = /(.)(?=\1)/g
let a1 = 'caaab'
let r1 = a1.match(regex).map(ch => ch + ch) // [ 'aa', 'aa' ]
let a2 = 'caaabbcdd'
let r2 = a2.match(regex).map(ch => ch + ch) // [ 'aa', 'aa', 'bb', 'dd' ]
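
As an alternative to doubling the characters with map, the doubled text can be captured inside the lookahead itself and read back with String.prototype.matchAll. This is only a sketch: the helper name findOverlapping is hypothetical, and it assumes the text you want is in capture group 1 of a regex that has the g flag (matchAll throws a TypeError without it).

```javascript
// Hypothetical helper: returns the contents of capture group 1 for every
// match. Because the whole pattern sits inside a lookahead, each match is
// an empty string and consumes nothing, so overlapping hits are reported.
function findOverlapping(text, lookaheadRegex) {
  return [...text.matchAll(lookaheadRegex)].map(m => m[1]);
}

// Fixed pattern: every position where 'aa' starts.
console.log(findOverlapping('caaab', /(?=(aa))/g));
// [ 'aa', 'aa' ]

// Any doubled character: group 2 is the character, \2 repeats it,
// and group 1 wraps both.
console.log(findOverlapping('caaabbcdd', /(?=((.)\2))/g));
// [ 'aa', 'aa', 'bb', 'dd' ]
```

Unlike a.match(regex).map(...), this also returns an empty array instead of throwing when there is no match, because matchAll yields an empty iterator rather than null.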

Newlines

The (.) capture group does not match doubled newlines. If you also want to include doubled newlines, you could add them in the capture group like this: (.|\n). The resulting code would look like this:

let regex = /(.|\n)(?=\1)/g
let a = 'caaab\nbbc\n\nddd'
let r = a.match(regex).map(ch => ch + ch) // [ 'aa', 'aa', 'bb', '\n\n', 'dd', 'dd' ]
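
Another option, assuming your environment supports the s (dotAll) flag from ES2018, is to let the dot match line terminators directly. Note that this is slightly broader than (.|\n): with the s flag the dot also matches \r and the Unicode line and paragraph separators. The function name getDoubleCharsDotAll is hypothetical.

```javascript
// Sketch using the dotAll flag: with "s", the dot also matches
// line terminators, so no alternation with \n is needed.
function getDoubleCharsDotAll(text) {
  // match() returns null when nothing matches, so fall back to [].
  const matches = text.match(/(.)(?=\1)/gs) || [];
  return matches.map(ch => ch + ch);
}

console.log(getDoubleCharsDotAll('caaab\nbbc\n\nddd'));
// [ 'aa', 'aa', 'bb', '\n\n', 'dd', 'dd' ]
```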

Note on performance and maintainability

Regular expressions can carry serious performance penalties, and they are often difficult to read and understand. Although my answer provides a valid solution to your problem using regular expressions, I advise against using it. A small function that produces the same result is likely to be faster and easier to understand, which eases future maintenance.

function getDoubleChars(searchString, includeNewlines = false) {
  let result = []
  let previousChar = ''

  for (let char of searchString) {
    if (char === previousChar) {
      if (char !== '\n' || includeNewlines) {
        result.push(char + char)
      }
    }

    previousChar = char
  }

  return result;
}

let a = 'caaab\nbbc\n\nddd'
let r1 = getDoubleChars(a) // [ 'aa', 'aa', 'bb', 'dd', 'dd' ]
let r2 = getDoubleChars(a, true) // [ 'aa', 'aa', 'bb', '\n\n', 'dd', 'dd' ]

1 Comment

Thanks for explaining in terms of Performance
