I am using a Python tool that checks git log commit messages to find out whether a feature with a given ID was introduced or reverted. I cannot change the tool's code; I can only provide suitable regexes as input. The input looks like this:

input_regexes = {
    "add_pattern": r".*\[\s*(ID\d{3})\s*\](.*)",
    "revert_pattern": r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)",
}

The first capture group is used to get the feature ID and the second is used as the feature description. The problem is that when a string containing [Rr]evert appears, both patterns match. What I would like to achieve is:

  • revert_pattern matches only commit messages that contain an ID in brackets preceded by [Rr]evert
  • add_pattern matches only commit messages that contain an ID in brackets not preceded by [Rr]evert

In the following example, revert_pattern should match only revert_feature_message, and add_pattern should match only the strings in add_feature_messages:

revert_feature_message='Revert "[ID123] some cool feature."'
add_feature_messages=[
  '[ID123] some cool feature.',
  'some prefix [ID123] some cool feature'
]
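
A minimal sketch that reproduces the overlap. It assumes the tool applies each pattern with something like re.search; the tool's code is not shown here, so that call is a guess:

```python
import re

# Patterns copied from input_regexes above.
add_pattern = re.compile(r".*\[\s*(ID\d{3})\s*\](.*)")
revert_pattern = re.compile(r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'

# Assumption: the tool searches each commit message with both patterns.
add_match = add_pattern.search(revert_feature_message)
revert_match = revert_pattern.search(revert_feature_message)

# Both patterns match the revert message, which is the unwanted behaviour.
print(bool(add_match), bool(revert_match))  # True True
print(add_match.group(1))  # ID123
```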

I tried using:

(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)

as add_pattern, but it did not work. Could you help me correct it?
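
For reference, a short sketch of why that attempt still matches the revert message. A lookbehind only inspects the text immediately before the current position; at position 0 there is nothing before it, so the assertion passes and the lazy .*? then consumes Revert itself. The variable names are only illustrative, and re.search is an assumption about how the tool applies the pattern:

```python
import re

# The attempted add_pattern with a negative lookbehind.
attempt = re.compile(r"(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'

m = attempt.search(revert_feature_message)

# The match starts at index 0, where no text precedes the position,
# so "(?<!Revert)" cannot fail there.
print(m.start())   # 0
print(m.group(1))  # ID123
```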

1 Answer

The revert pattern [Rr]evert.*\[\s*(ID\d{3})\s*\](.*) already matches only the revert_feature_message.

To match only the strings in add_feature_messages, you can use a negative lookahead at the start of the string to assert that the string does not contain revert or Revert anywhere:

^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)
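
A quick check of this pattern against the messages from the question. This is a sketch that assumes the tool applies the patterns with re.search; the helper name matches is made up for illustration:

```python
import re

add_pattern = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")
revert_pattern = re.compile(r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'
add_feature_messages = [
    '[ID123] some cool feature.',
    'some prefix [ID123] some cool feature',
]


def matches(pattern, message):
    """Hypothetical helper: True if the pattern is found in the message."""
    return pattern.search(message) is not None


# The add messages match add_pattern only.
for message in add_feature_messages:
    print(matches(add_pattern, message), matches(revert_pattern, message))  # True False

# The revert message matches revert_pattern only.
print(matches(add_pattern, revert_feature_message),
      matches(revert_pattern, revert_feature_message))  # False True

# The capture groups still hold the ID and the description.
m = add_pattern.search(add_feature_messages[1])
print(m.group(1), repr(m.group(2)))  # ID123 ' some cool feature'
```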

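Or, a bit more specific, reject the message only when revert or Revert is followed by a space, any non-bracket characters, and then the bracketed ID: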

^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*]).*\[\s*(ID\d{3})\s*\](.*)
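
A sketch of where the two add patterns differ. The message '[ID123] add revert button' is a made-up example, not from the question: it mentions revert after the ID, so the simpler pattern rejects it while the more specific one accepts it.

```python
import re

simple = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")
specific = re.compile(
    r"^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*]).*\[\s*(ID\d{3})\s*\](.*)"
)

# Hypothetical commit message: "revert" appears, but only after the ID.
later_mention = '[ID123] add revert button'
revert_feature_message = 'Revert "[ID123] some cool feature."'

print(simple.search(later_mention) is not None)    # False
print(specific.search(later_mention) is not None)  # True

# Both variants still reject the real revert message.
print(simple.search(revert_feature_message) is not None)    # False
print(specific.search(revert_feature_message) is not None)  # False
```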

If Revert is always at the start of the string, you can omit the leading .* inside the lookahead.
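
Reading that as dropping the .* inside the lookahead, the shortened pattern would look like the sketch below. The message starting with "Fix:" is a made-up example showing the trade-off: a Revert that is not at the very start is no longer excluded.

```python
import re

# Lookahead without the leading ".*": only rejects messages that
# begin with "Revert" or "revert".
add_pattern = re.compile(r"^(?![Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'
prefixed_add = 'some prefix [ID123] some cool feature'
prefixed_revert = 'Fix: Revert "[ID123] some cool feature."'  # hypothetical

print(add_pattern.search(revert_feature_message) is not None)  # False
print(add_pattern.search(prefixed_add) is not None)            # True
print(add_pattern.search(prefixed_revert) is not None)         # True
```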

2 Comments

Wow, man, you actually did it! Could you please answer two additional questions? 1. How did you manage to make a "lookahead" work as a "lookbehind"? :D I thought the "lookahead" construct is for checking what follows the pattern. 2. How is it possible that a "lookbehind" must have a fixed width, while a greedy dot is fine inside a "lookahead"?
@marek_maras Glad it worked for you :-) The negative lookahead sits right after the ^ anchor, so from the start of the string it asserts that the string does not contain revert followed later by the ID. This assertion runs once, before the actual matching starts, to verify that "rule". About the second question: Python's re module does not support an unbounded quantifier in a lookbehind. For that to work, you can use the regex module from PyPI.
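
A small sketch of the fixed-width restriction mentioned in the comment, using only the standard re module. The two lookbehind patterns are illustrative and are not proposed as solutions:

```python
import re

# A lookbehind containing ".*" has no fixed width, so re refuses to compile it.
try:
    re.compile(r"(?<!Revert.*)\[\s*(ID\d{3})\s*\](.*)")
    compiled = True
except re.error as exc:
    compiled = False
    print("re rejected the pattern:", exc)

# A lookbehind with a fixed width (here exactly "Revert ") compiles,
# but it only checks the seven characters directly before the bracket.
fixed = re.compile(r"(?<!Revert )\[\s*(ID\d{3})\s*\](.*)")

print(fixed.search('Revert [ID123] x') is not None)  # False
print(fixed.search('[ID123] x') is not None)         # True
```

The lookahead has no such restriction, which is why ^(?!.*[Rr]evert) can scan the whole string from the start.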
