I have to solve a tricky problem: I want to extract substrings from a possibly long text. Certain keywords mark the end of a substring. A keyword should only match if it is preceded by at least one character that is neither whitespace nor part of a different keyword. In that case, every character that precedes the keyword should be included in the match. I want to use a regular expression in JavaScript for this.

My keywords are: ":yellow:", ":black:", ":green:", ":blue:", ":red:"

For example, I have a text like this: " :green: aba :red: gd efg:blue: :yellow: sdg:red: sea gea e :black: "

Now I want to call match() on this string with a regex that gives me these matches: " aba :red:", "gd efg:blue:", "sdg:red:", "sea gea e :black:"

:green: at the start should not be matched, because it is not preceded by a character. :yellow: should also not be matched, because it is preceded by a different keyword (in this case :blue:)

I have tried to use a negative lookahead, (?!...), to prevent a match when a keyword is preceded by another keyword, but it did not quite give me the results I am looking for.

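    // Sample text from above
    const text1 = " :green: aba :red: gd efg:blue: :yellow: sdg:red: sea gea e :black: ";

    // My attempt: the lookahead is only checked at the start of each \S+ run
    const re1 = /((?!(:yellow:|:black:|:green:|:blue:|:red:))\S+\s*)+(:yellow:|:black:|:green:|:blue:|:red:)/g;

    let ar1 = text1.match(re1);

    console.log(ar1);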

This is my output: [ 'green: aba :red:', 'gd efg:blue: :yellow:', 'sdg:red: sea gea e :black:' ]

But I want this:

[ ' aba :red:', 'gd efg:blue: ', 'sdg:red:', 'sea gea e :black:' ]

1 Answer

You could shorten your alternation by placing the : outside of it. Instead of matching \S+, you could use a negated character class to match any character that is neither whitespace nor a :.

To match multiple "words", you could repeatedly match a space followed by the negated character class.

(?!:(?:yellow|black|green|blue|red):)[^\s:]+(?: [^\s:]+)*\s*:(?:yellow|black|green|blue|red):

Explanation

  • (?! Negative lookahead, assert what is directly on the right is not
    • :(?:yellow|black|green|blue|red): Match any of the listed between :
  • ) Close negative lookahead
  • [^\s:]+ Match 1+ times not a whitespace char or :
  • (?: [^\s:]+)* Repeat 0+ times matching a space, then 1+ times not a whitespace char or :
  • \s* Match 0+ whitespace chars
  • :(?:yellow|black|green|blue|red): Match any of the listed between :
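One detail of the parts listed above: `(?: [^\s:]+)*` matches a single literal space between words, so a run of words separated by a tab or by two spaces would be cut at that point. If that matters for your input, a possible variant replaces the literal space with `\s+`. This variant is an assumption on my side and not part of the pattern above; the sample string `spaced` is made up for illustration.

```javascript
// Keyword alternation shared by both patterns.
const kw = "(?:yellow|black|green|blue|red)";

// Pattern as explained above: words separated by exactly one space.
const singleSpace = new RegExp(
  `(?!:${kw}:)[^\\s:]+(?: [^\\s:]+)*\\s*:${kw}:`, "g");

// Hypothetical variant: words separated by any run of whitespace.
const anySpace = new RegExp(
  `(?!:${kw}:)[^\\s:]+(?:\\s+[^\\s:]+)*\\s*:${kw}:`, "g");

// Made-up input with two spaces between "sea" and "gea".
const spaced = "sea  gea e :black:";

const singleSpaceResult = spaced.match(singleSpace);
const anySpaceResult = spaced.match(anySpace);

console.log(singleSpaceResult); // the match starts at "gea"
console.log(anySpaceResult);    // the match starts at "sea"
```

Note that `\s+` also matches line breaks, so with the variant a match may span several lines.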

Regex demo

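// The lookahead here also lists \s, so a match can never start on whitespace.
const regex = /(?!:(?:yellow|black|green|blue|red):|\s)[^\s:]+(?: [^\s:]+)*\s*:(?:yellow|black|green|blue|red):/g;
const str = ` :green: aba :red: gd efg:blue: :yellow: sdg:red: sea gea e :black: `;
console.log(str.match(regex));
// [ 'aba :red:', 'gd efg:blue:', 'sdg:red:', 'sea gea e :black:' ]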
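If the keyword list changes often, the alternation could be generated instead of typed by hand. The following is only a minimal sketch under that assumption: `keywords`, `escapeRegExp` and `buildKeywordRegex` are hypothetical names introduced for illustration, and the assembled pattern is the one described in the explanation.

```javascript
// Hypothetical keyword list, without the surrounding colons.
const keywords = ["yellow", "black", "green", "blue", "red"];

// Escape characters that have a special meaning in a regex,
// in case a keyword ever contains one.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assemble the pattern from the list: lookahead, word run,
// optional whitespace, then the keyword between colons.
function buildKeywordRegex(words) {
  const alt = `:(?:${words.map(escapeRegExp).join("|")}):`;
  return new RegExp(`(?!${alt})[^\\s:]+(?: [^\\s:]+)*\\s*${alt}`, "g");
}

const text = " :green: aba :red: gd efg:blue: :yellow: sdg:red: sea gea e :black: ";
const matches = text.match(buildKeywordRegex(keywords));
console.log(matches);
```

Because the string passed to `new RegExp` is not a regex literal, every backslash has to be doubled (`\\s` instead of `\s`).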
