1

As seen here, the keyword Social Media has already matched, so Social Media Wall is never matched. ERP behaves differently: both ERP and ERP Corecon are matched.

Input Text -

hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon

Query -

((?<=[,-./\s])(ERP Corecon|Social Media|Social Media Wall|ROI|ERP)\b)

Output -

hey, G Suite , ,**Social Media** Wall, **Social Media** Wall , **ERP** **ERP Corecon**

Ideal Output - both Social Media and Social Media Wall should be matched

regex link - https://regexr.com/6ugr5

7
  • 3
    Put the longer match before the shorter one in the query, like you did with ERP Corecon and ERP? Commented Sep 21, 2022 at 17:03
  • 1
    regexr.com/6ugr5 @anubhava Commented Sep 21, 2022 at 17:03
  • @JoachimIsaksson yes, that can be done, but my regex pattern is auto-generated and I don't have control over that. Instead, I would prefer a regex pattern that can accommodate this. Also, I am trying to understand why Social Media Wall is not being matched. Commented Sep 21, 2022 at 17:04
  • When there are multiple matching alternatives, the result depends on the regexp engine. Some always prefer the longer match, some prefer the first match in the alternatives. I don't think there's a way to force one to act like the other. Commented Sep 21, 2022 at 17:08
  • 1
    @anubhava added input text to question - hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon Commented Sep 21, 2022 at 17:17
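The behaviour described in the comments can be reproduced with a short script. This is only a sketch, assuming Python's `re` module stands in for the engine used on regexr; like most backtracking engines, it tries alternatives left to right and keeps the first one that succeeds. The variable names are illustrative.

```python
import re

# Input text and pattern copied from the question.
text = "hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon"
pattern = r"(?<=[,-./\s])(ERP Corecon|Social Media|Social Media Wall|ROI|ERP)\b"

# Alternatives are tried in the order they are written. "Social Media" is
# listed before "Social Media Wall", and \b succeeds right after "Media",
# so the longer alternative is never attempted. "ERP Corecon" is listed
# before "ERP", which is why the longer ERP keyword does match.
matches = [m.group(1) for m in re.finditer(pattern, text)]
print(matches)
```

So the two keyword families differ only in the order in which their alternatives appear in the pattern.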

3 Answers

1

You may be able to use this regex:

(?<=[,-./\s])(ERP(?: Corecon)?|Social Media(?: Wall)?|ROI)\b

RegEx Demo

RegEx Breakup:

  • (?<=[,-./\s]): Lookbehind to assert presence of these chars before current position
  • (: Start capture group
    • ERP(?: Corecon)?: Match ERP or ERP Corecon
    • |: OR
    • Social Media(?: Wall)?: Match Social Media or Social Media Wall
    • |: OR
    • ROI: Match ROI
  • ): End capture group
  • \b: Word boundary
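
As a quick check of the pattern above, here is a minimal sketch that runs it against the input text from the question. It assumes Python's `re` module; other regex flavours may behave differently, and the variable names are illustrative.

```python
import re

# Input text from the question.
text = "hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon"

# The optional non-capturing groups let each keyword extend to its longer
# form when the suffix is present, regardless of alternative order.
pattern = r"(?<=[,-./\s])(ERP(?: Corecon)?|Social Media(?: Wall)?|ROI)\b"

matches = [m.group(1) for m in re.finditer(pattern, text)]
print(matches)
```

Note that each position yields a single match, the longest available form; the shorter keyword is not reported separately when the longer one is present.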


0

You could try to use named groups:

(?<SocialMediaWall>(?<SocialMedia>Social Media)( Wall)?)|(?<ERPCore>(?<ERP>ERP)( Corecon)?)|(?<ROI>ROI)

As you can see, the SocialMedia group is nested inside the SocialMediaWall group. That's a way of getting both values.
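
Here is a rough sketch of how the nested groups could be read back. It assumes Python, where named groups are written `(?P<name>...)` rather than `(?<name>...)`; the group names are taken from the pattern above, and the rest is illustrative.

```python
import re

# Same structure as the pattern above, rewritten in Python's named-group syntax.
pattern = re.compile(
    r"(?P<SocialMediaWall>(?P<SocialMedia>Social Media)( Wall)?)"
    r"|(?P<ERPCore>(?P<ERP>ERP)( Corecon)?)"
    r"|(?P<ROI>ROI)"
)

text = "hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon"

# For every match, keep only the named groups that took part in it.
results = [
    {name: value for name, value in m.groupdict().items() if value is not None}
    for m in pattern.finditer(text)
]
for groups in results:
    print(groups)
```

With this approach the outer group holds the longest form found, while the inner group always holds the short keyword, so both values are available from one match.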


0

All your solutions were intriguing, but I ended up sorting the keywords in my regex query in descending order of length so that the longer keywords come first. That way a substring never appears before the string that contains it. This solved my problem.
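
A minimal sketch of that sorting step, assuming the pattern is generated in Python from a keyword list. The list below is reconstructed from the pattern in the question, and the variable names are hypothetical.

```python
import re

# Keywords in the order they appear in the original auto-generated pattern.
keywords = ["ERP Corecon", "Social Media", "Social Media Wall", "ROI", "ERP"]

# Longest first, so a keyword can never be shadowed by one of its prefixes.
ordered = sorted(keywords, key=len, reverse=True)

# re.escape guards against keywords that contain regex metacharacters.
alternation = "|".join(re.escape(k) for k in ordered)
pattern = r"(?<=[,-./\s])(" + alternation + r")\b"

text = "hey, G Suite , ,Social Media Wall, Social Media Wall , ERP ERP Corecon"
matches = [m.group(1) for m in re.finditer(pattern, text)]
print(matches)
```

Sorting by length is enough here because a keyword that contains another as a prefix is always strictly longer than it.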

