
I need to match (using regex) strings that look like this:

required: custodian_{number 1-9}_{fieldType, either txt or ssn}; optional: _{fieldLength 1-999}

For example, custodian_1_ssn_1 and custodian_1_ssn_1_255 are valid.

custodian, custodian_, custodian_1, custodian_1_, custodian_1_ssn, custodian_1_ssn_ and custodian_1_ssn_1_ are not valid.

Currently I am working with this:

(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]?[0-9]?[0-9]?)?

With this regex my API picks up custodian_1_txt_1 and custodian_1_ssn_1, but for custodian_1_txt_1_255 it does not match the last "5".

Any thoughts?

  • IMHO you shouldn't be using regex for this. It would probably be easier and faster to use something like string.Split('_') and then iterate over the resulting array, checking for validity and required attributes. Commented Aug 8, 2018 at 17:25
  • Your best bet is to play with it in a regex tool: regex101.com/r/s9ilVe/1 Commented Aug 8, 2018 at 17:28
  • You say you want to match numbers 1-9 for the first field, but [1-9]?[0-9] would match 0-99. Why is the first digit optional, and why is there a second digit at all? Commented Aug 8, 2018 at 17:34
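
The split-based approach suggested in the first comment could look roughly like the sketch below. It is written in Python rather than with string.Split('_'), and the names is_valid_field_name and _is_number_in_range are hypothetical. The ranges are assumptions: 1-9 and 1-999 come from the question's description, while 1-99 for the number after the field type is read off the [1-9][0-9]? part of the question's regex.

```python
def _is_number_in_range(text: str, low: int, high: int) -> bool:
    """Return True if text is a plain decimal number without a leading zero in [low, high]."""
    if not (text.isascii() and text.isdigit()):
        return False
    if text[0] == "0":
        return False
    return low <= int(text) <= high


def is_valid_field_name(name: str) -> bool:
    """Validate names such as custodian_1_ssn_1 or custodian_1_ssn_1_255 (ranges are assumed)."""
    parts = name.split("_")
    # Four required parts, plus an optional fifth part for the field length.
    if len(parts) not in (4, 5):
        return False
    prefix, number, field_type, index = parts[:4]
    if prefix not in ("custodian", "signer"):
        return False
    if not _is_number_in_range(number, 1, 9):
        return False
    if field_type not in ("txt", "ssn"):
        return False
    if not _is_number_in_range(index, 1, 99):  # assumed range, taken from the regex
        return False
    if len(parts) == 5 and not _is_number_in_range(parts[4], 1, 999):
        return False
    return True


if __name__ == "__main__":
    for candidate in ("custodian_1_ssn_1", "custodian_1_ssn_1_255", "custodian_1_ssn_1_"):
        print(candidate, is_valid_field_name(candidate))
```

Compared with a regex, each rule sits on its own line, so a range can be adjusted without touching the rest of the check.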

4 Answers

You may use this pattern:

^custodian(?:_[a-z0-9]+)+$
  • ^ Assert position at the beginning of the line.
  • custodian Match the literal substring custodian.
  • (?:_[a-z0-9]+)+ Non-capturing group, repeated one or more times: an underscore followed by one or more lowercase letters or digits.
  • $ Assert position at the end of the line.

You can check the correct matches here.

You can also modify the pattern to accept the substring signer by adding an alternation in a non-capturing group:

^(?:custodian|signer)(?:_[a-z0-9]+)+$
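
To see what this pattern accepts and rejects, it can be tried against the question's examples. The following is a minimal sketch using Python's re module; the helper name matches_permissive is hypothetical. Because the pattern only requires one or more underscore-separated alphanumeric segments, it rejects names that end in an underscore, but it also accepts shorter names such as custodian_1 and custodian_1_ssn, which the question lists as not valid.

```python
import re

# Pattern from the answer, unchanged. fullmatch() requires the whole string
# to match, so the ^ and $ anchors are redundant here but harmless.
PERMISSIVE = re.compile(r"^(?:custodian|signer)(?:_[a-z0-9]+)+$")


def matches_permissive(name: str) -> bool:
    """Return True if the whole name matches the permissive pattern."""
    return PERMISSIVE.fullmatch(name) is not None


if __name__ == "__main__":
    samples = [
        "custodian_1_ssn_1",      # valid according to the question
        "custodian_1_ssn_1_255",  # valid according to the question
        "custodian_1_ssn_1_",     # trailing underscore
        "custodian_1_ssn",        # not valid according to the question
        "custodian_1_xxx_1_15",   # unknown field type
    ]
    for sample in samples:
        print(f"{sample!r}: {matches_permissive(sample)}")
```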


1 Comment

I chose this as the answer because it works in all cases I tried. It seems it would also match something out of scope, though, e.g. custodian_1_xxx_1_15, where xxx is not an acceptable substring. That's OK; I will document around this.

I suggest using \d for the digits instead of your character classes. Try this:

(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]?\d*)?

I just added \d* to the end of your pattern so that it consumes all of the trailing digits.

1 Comment

The end bit matches a string ending in _ (with no number). You can fix that by getting rid of the ? after the last [1-9].
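
The effect described in this comment can be checked with a short sketch in Python's re module. The constant names are hypothetical; the first pattern is the answer's, and the second is the same pattern with the ? after the last [1-9] removed.

```python
import re

# Pattern from the answer: the final group can match a bare underscore,
# because both [1-9]? and \d* may match nothing.
WITH_OPTIONAL_DIGIT = re.compile(
    r"(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]?\d*)?"
)

# Same pattern with the ? after the last [1-9] removed, as the comment suggests,
# so an underscore must be followed by at least one digit from 1 to 9.
WITH_REQUIRED_DIGIT = re.compile(
    r"(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]\d*)?"
)

if __name__ == "__main__":
    for name in ("custodian_1_txt_1_255", "custodian_1_ssn_1_"):
        print(
            name,
            bool(WITH_OPTIONAL_DIGIT.fullmatch(name)),
            bool(WITH_REQUIRED_DIGIT.fullmatch(name)),
        )
```

Note that \d* places no upper limit on the number of digits, so both variants accept field lengths above 999.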

You could use anchors to assert the start (^) and the end ($) of the string. For the last part, make at least the first [1-9] mandatory; otherwise the pattern would match an underscore at the end:

^(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9][0-9]?[0-9]?)?$
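
As a quick check, this anchored pattern can be run against the valid and invalid examples from the question. The sketch below uses Python's re module; the names ANCHORED, VALID and INVALID are hypothetical.

```python
import re

# Anchored pattern from the answer, unchanged.
ANCHORED = re.compile(
    r"^(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9][0-9]?[0-9]?)?$"
)

# Examples taken from the question.
VALID = ["custodian_1_ssn_1", "custodian_1_ssn_1_255"]
INVALID = [
    "custodian",
    "custodian_",
    "custodian_1",
    "custodian_1_",
    "custodian_1_ssn",
    "custodian_1_ssn_",
    "custodian_1_ssn_1_",
]

if __name__ == "__main__":
    # search() is enough here because ^ and $ pin the match to the whole string.
    for name in VALID + INVALID:
        print(f"{name!r}: {bool(ANCHORED.search(name))}")
```

The first numeric field is still [1-9]?[0-9], so names such as custodian_0_ssn_1 and custodian_10_ssn_1 are accepted even though the question describes that number as 1-9.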


If you're only interested in the last digits, this super generic regex will do:

(?:.+)_(\d+)

If you do need to match the whole string, this worked:

^(?:custodian|signer)_\d+_(?:txt|ssn)(?:_\d+)?_(\d+)$
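
Both patterns capture the final run of digits in group 1. A minimal Python sketch of how that group could be read follows; the helper name last_number is hypothetical.

```python
import re
from typing import Optional

# Both patterns are copied from the answer, unchanged.
LAST_DIGITS = re.compile(r"(?:.+)_(\d+)")
WHOLE_STRING = re.compile(r"^(?:custodian|signer)_\d+_(?:txt|ssn)(?:_\d+)?_(\d+)$")


def last_number(pattern: re.Pattern, name: str) -> Optional[str]:
    """Return the digits captured by group 1, or None if the pattern does not match."""
    match = pattern.search(name)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name in ("custodian_1_txt_1_255", "custodian_1_ssn_1", "custodian_1_ssn"):
        print(name, last_number(LAST_DIGITS, name), last_number(WHOLE_STRING, name))
```

When the optional field length is absent, group 1 of the whole-string pattern holds the number that follows the field type instead, so the caller cannot tell the two cases apart from the capture alone.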

4 Comments

@UnbearableLightness Well it doesn't match the whole string, but it does match some of it. Added anchors.
Yeah that's better, the last capturing group should be made non capturing though.
@UnbearableLightness OP's regex captures the last group.
Uh, apologies, my bad.
