
I have a string

CO12dadaCO2dafdCO345daaf

I want to extract all occurrences of CO followed by some digits, each running up to the next CO. I tried /CO(\d*)([\s\S]*)/.

In this case I want to get the output:

['CO12dada', 'CO2dafd', 'CO345daaf']

The regex I tried doesn't work: the greedy [\s\S]* swallows the rest of the string, including the later CO's, in a single match.

With str.search I can get the index of the first match only, but I need the index of every occurrence.

  • Like this? CO[a-z0-9]+ regex101.com/r/etgIMq/1 Commented Sep 14, 2020 at 8:25
  • I need to match that, followed by anything except another match of it. Commented Sep 14, 2020 at 8:26
  • Using a non greedy quantifier with a positive lookahead CO[\s\S]*?(?=CO|$) regex101.com/r/VRskly/1 Commented Sep 14, 2020 at 8:28
  • Using Javascript, you could also shorten it to CO[^]*?(?=CO|$) Commented Sep 14, 2020 at 8:43

4 Answers

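const string = 'CO12dadaCO2dafdCO345daaf';
// Lazily match from each "CO" up to the next "CO" or the end of the string;
// the g flag makes match() return every occurrence instead of only the first.
const result = string.match(/(CO.*?)(?=CO|$)/g);
console.log(result); // [ 'CO12dada', 'CO2dafd', 'CO345daaf' ]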
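
The question also asks for the index of every occurrence. A minimal sketch of one way to get them, assuming an environment that supports String.prototype.matchAll (ES2020); the variable names `input`, `pattern`, and `found` are only illustrative:

```javascript
const input = 'CO12dadaCO2dafdCO345daaf';
// Same lazy pattern as above; matchAll requires the g flag.
const pattern = /CO.*?(?=CO|$)/g;

// Each match object carries the matched text at [0] and its start offset in .index.
const found = [...input.matchAll(pattern)].map((m) => ({
  text: m[0],
  index: m.index,
}));

console.log(found);
```

For this input the start offsets should be 0, 8, and 15.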


1 Comment

I accepted this answer, because you mention the global option that solves the problem.

Just get your matches with .split():

console.log("CO12dadaCO2dafdCO345daaf".split(/(?!^)(?=CO)/))

Result:

[
  "CO12dada",
  "CO2dafd",
  "CO345daaf"
]

(?!^)(?=CO) = matches the empty string before CO substring, but not at the string start.
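
To make the "empty string before CO" idea more concrete, here is a small sketch that marks those positions with a visible character before splitting on them. The `|` marker and the variable names are assumptions chosen for illustration only:

```javascript
const text = 'CO12dadaCO2dafdCO345daaf';

// The pattern consumes no characters: it matches the position just before
// each "CO", except the position at the very start of the string.
const marked = text.replace(/(?!^)(?=CO)/g, '|');
console.log(marked); // CO12dada|CO2dafd|CO345daaf

// split() cuts the string at exactly those positions.
const parts = text.split(/(?!^)(?=CO)/);
console.log(parts); // [ 'CO12dada', 'CO2dafd', 'CO345daaf' ]
```

Because the matches are zero-width, nothing is removed from the string; split only decides where to cut.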

7 Comments

Cool, I didn't know split took a regex like that. Is that a positive lookahead? What does (?!^) do?
@eguneys (?!^) is a negative lookahead, please see this question about it
So split is the equivalent of a match with a global regex?
This is confusing, because the regex doesn't match the input but somehow makes marks (what does that mean?), and in the case of split the string is split at those marks. Can you please clarify your answer?
@eguneys This is out of scope, but you may want to match empty locations in a string to insert something there. "a1b2".replace(/(?=\d)/g, '-') returns a-1b-2. A must-read for you is "Lookahead and Lookbehind Zero-Length Assertions". Also, see Mastering Lookahead and Lookbehind.

Or this one:

CO\w+?(?=CO|$)

see demo here: https://regex101.com/r/gFZomh/1

Basically: a "non-greedy" matching of all "word characters" after "CO" followed by a lookahead demanding another "CO" or end-of-string.

If you also want to match "non-word characters", you could modify the regexp to

CO[\w\W]+?(?=CO|$)

This will also work on something like "CO12dadaCO2da,fdCO345daaf" to produce the matches: ["CO12dada","CO2da,fd","CO345daaf"].
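
A quick side-by-side sketch of the two patterns on that input (variable names are illustrative). With \w only, the chunk containing the comma cannot reach a following "CO" and is skipped:

```javascript
const sample = 'CO12dadaCO2da,fdCO345daaf';

// \w cannot cross the comma, so "CO2da,fd" never satisfies the lookahead.
const wordOnly = sample.match(/CO\w+?(?=CO|$)/g);
console.log(wordOnly); // [ 'CO12dada', 'CO345daaf' ]

// [\w\W] matches any character, so all three chunks are found.
const anyChar = sample.match(/CO[\w\W]+?(?=CO|$)/g);
console.log(anyChar); // [ 'CO12dada', 'CO2da,fd', 'CO345daaf' ]
```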


Using Javascript, you can use

CO[^]*?(?=CO|$)
  • CO[^]*? Match CO, then any character including newlines, as few times as possible
  • (?=CO|$) Positive lookahead, assert what is on the right is either CO or the end of the string
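
The "including newlines" part matters once the input spans several lines. A small sketch, using a made-up two-line input, comparing `.` (which does not match a newline without the s flag) with `[^]`:

```javascript
// Hypothetical input with a line break inside the first chunk.
const multiline = 'CO1\nxCO2';

// "." stops at the newline, so the first chunk can never reach "CO" or the end.
const withDot = multiline.match(/CO.*?(?=CO|$)/g);
console.log(withDot); // [ 'CO2' ]

// [^] matches any character, newline included, so both chunks are returned.
const withAnyChar = multiline.match(/CO[^]*?(?=CO|$)/g);
console.log(withAnyChar); // [ 'CO1\nx', 'CO2' ]
```

Note that `[^]` is specific to JavaScript; `[\s\S]` is the more portable spelling of the same idea.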

Regex demo
