I'm working on a regex that will match a pattern, or any leading substring of that pattern, occurring at the end of a string. What I have below works and is fairly easy to understand, but I'm sure there has to be a more elegant way to do this. I've looked into boundary matchers and quantifiers but don't see a good way to combine them to get something like this. Any ideas?

regex = (_part_\d+$|_part_$|_part$|_par$|_pa$|_p$|_$)

aString_part_1  - match
aString_part_   - match
aString_part    - match
aString_par     - match
aString_pa      - match
aString_p       - match
aString_        - match
aString         - no match
  • What you have is fine, but you might factor out $, and I don't understand the purpose of the capture group. Another way, which I personally don't like as well, is _(?:p(?:a(?:r(?:t(?:_(?:\d)?)?)?)?))?$. Demo. Commented Apr 20, 2020 at 5:08
  • If you factor out (if I may use that expression), you are left with _(?:part_\d+|part_|part|par|pa|p)?$. Demo. Commented Apr 20, 2020 at 5:22
  • @caryswoveland Is right. However this will explude _ from the front of the group. It will still be required, but not part of group 1. However since it is both fix and contained in the general match, you can leave it out. Modified my answer accordingly. Commented Apr 20, 2020 at 5:25
  • @TreffnonX, you can exclude, maybe even explode, _, but expluding it will never work! Commented Apr 20, 2020 at 5:30
  • @CarySwoveland Nice, but I have 2 comments: 1) In the first example, the innermost (?:\d)? was wrong, since you missed the +, so it should be (?:\d+)?, which can be shortened to \d*. --- 2) In the second example, the first two options can be merged, i.e. part_\d+|part_ → part_\d*. --- In both cases, \d+ → \d* simplifies the regex. Commented Apr 20, 2020 at 6:32

1 Answer

Your matching criteria are so broad that any of the following will match:

  • foobar_
  • _
  • _part_00000000
  • ______

If that is intended, you can also write: regex = _(?:p(?:a(?:r(?:t(?:_\d*)?)?)?)?)?$ which is arguably not more elegant. If you drop the non-capturing markers (?:), you can write: regex = _(p(a(r(t(_\d*)?)?)?)?)?$ which is a bit more readable but marginally less performant, since every group then captures.
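As a quick check, here is a minimal Java sketch that runs the nested form against the sample strings from the question. The class name SuffixPrefixMatch and the helper matchesTail are made up for illustration; only java.util.regex.Pattern and Matcher.find() are standard API.

```java
import java.util.regex.Pattern;

public class SuffixPrefixMatch {
    // Nested optional groups: each further character of "part_" is only
    // allowed when all the characters before it were present.
    static final Pattern NESTED =
            Pattern.compile("_(?:p(?:a(?:r(?:t(?:_\\d*)?)?)?)?)?$");

    // Hypothetical helper: true if the input ends with "_" followed by a
    // prefix of "part_" and, after the full "part_", optional digits.
    static boolean matchesTail(String input) {
        // find() searches anywhere in the input; the trailing $ pins
        // the match to the end of the string.
        return NESTED.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "aString_part_1", "aString_part_", "aString_part", "aString_par",
            "aString_pa", "aString_p", "aString_", "aString"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + matchesTail(s));
        }
    }
}
```

Note that find() is used rather than matches(), because matches() would require the pattern to cover the whole input, including the leading aString.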

If you are willing to give up the requirement that the characters form a contiguous prefix of the pattern, you can also write regex = _p?a?r?t?_?\d*$, but then you will suffer the consequence that the following would also match:

  • _pt5
  • _pat_69
  • __666
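The difference between the two forms can be seen in a short Java sketch. The class LooseVsStrict and its two helper methods are hypothetical names used only for this comparison; the patterns themselves are the nested and the flat regex from this answer.

```java
import java.util.regex.Pattern;

public class LooseVsStrict {
    // Nested form: only "_" plus a contiguous prefix of "part_" (plus digits).
    static final Pattern STRICT =
            Pattern.compile("_(?:p(?:a(?:r(?:t(?:_\\d*)?)?)?)?)?$");
    // Flat form: every character is independently optional.
    static final Pattern LOOSE =
            Pattern.compile("_p?a?r?t?_?\\d*$");

    static boolean strict(String s) {
        return STRICT.matcher(s).find();
    }

    static boolean loose(String s) {
        return LOOSE.matcher(s).find();
    }

    public static void main(String[] args) {
        // The first three are the unwanted matches listed above;
        // the last one is a sample from the question.
        String[] samples = { "_pt5", "_pat_69", "__666", "aString_part_1" };
        for (String s : samples) {
            System.out.println(s + ": strict=" + strict(s) + ", loose=" + loose(s));
        }
    }
}
```

The flat pattern accepts all four inputs, while the nested one accepts only aString_part_1.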

Edited the answer according to a correct hint by Cary Swoveland.

Hint: regex101.com is extremely helpful for learning about regular expressions and debugging them. The Java flavor is not offered there, but PCRE is close enough in most cases.
