I've been writing regexes for a long time, and I have been covering the scenarios below with two separate regexes because I don't know whether a single regex can handle them. Is there a way to write a single regex that captures both in one shot?

Suppose we have a standard in which a message starts with A and ends with Z, the field delimiter is a pipe |, and each field consists of components delimited by a caret ^.

  • Input1: A|1|1^^3^4^5|loongText|Z
  • Input2: A|13|^2^|loongText|Z

The regex should produce the following output:

  • Output1 : captured groups 1,,3,4,5
  • Output2 : captured groups ,2,,,

My attempt: A\|.\d*\|(.*)\^(.*)\^(.*)\^(.*)\^(.*?)\|.+?\|Z works for the first input but not the second, because it requires exactly four carets in the field.

What regex matches both inputs and puts the components in the correct groups?

[UPDATE] Group order is important. For input 1, group 1 should be 1 and group 2 should be empty; for input 2, group 1 should be empty and group 2 should be 2. The position of each component determines its meaning in the standard.

  • Input3: A|13|1^2^3|loongText|Z
  • Expected output: {"group1" :1, "group2": 2, "group3": 3}, so having captures in the right group is also important.
  • If it were me, I'd extract the third field using a regex, then split it using split instead of trying to do something clever with captured groups. Commented Sep 18, 2022 at 15:49
  • Do you need the empty matches? Perhaps like this (?<=(?:^|\|)(?=\d*\^)[\d^]{0,100})\d+ regex101.com/r/mceSCT/1 Or using \G with a capture group like (?:(?:^|\|)(?=\d*\^)|\G(?!\A))\^*(\d+)\d* regex101.com/r/AEbupx/1 Commented Sep 18, 2022 at 16:07
  • @RealSkeptic this is what I'm already doing: a split if the field is not complex, otherwise another regex. Commented Sep 18, 2022 at 16:14
  • And how can it become complex? Why another regex? Why aren't you showing us your actual solution? Commented Sep 18, 2022 at 16:40
  • @MikeM I hope you don't mind if I answer on your behalf : ) But even the split function takes a regex as input. It's inevitable. Commented Sep 18, 2022 at 19:02
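
The extract-then-split approach suggested in the first comment could look roughly like the sketch below. The thread does not name a language, so Python is an assumption here, and `FIELD_RE`, `components`, and the fixed count of five components are hypothetical names and choices made for illustration. Note that Python's `str.split` takes a literal separator rather than a regex.

```python
import re

# Hypothetical pattern: capture the whole third field (anything up to the next pipe).
FIELD_RE = re.compile(r"A\|\d*\|([^|]*)\|.+?\|Z")


def components(message, count=5):
    """Return the components of the third field, padded to `count` entries.

    Returns None when the message does not match the assumed layout.
    """
    match = FIELD_RE.fullmatch(message)
    if match is None:
        return None
    # Split on the literal caret; empty components stay as empty strings.
    parts = match.group(1).split("^")
    # Pad so that every position is always present (assumes at most `count`).
    parts += [""] * (count - len(parts))
    return parts[:count]


for line in (
    "A|1|1^^3^4^5|loongText|Z",
    "A|13|^2^|loongText|Z",
    "A|13|1^2^3|loongText|Z",
):
    print(components(line))
```

Because the split keeps positions, each component stays at its own index even when some are empty.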

1 Answer

I'm sharing this on behalf of @MikeM, who originally answered this question.

A\|\d*\|(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*))\|.+?\|Z

This regex matches all three inputs and puts each component in the correct group. Thanks.
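
As a quick check, the regex above can be run against the three inputs from the question. This is only a sketch: the thread does not name a language, so Python is an assumption, and `PATTERN` and `capture` are hypothetical names. Regex engines differ on whether an optional group that matched nothing is reported as an empty string or as "no match", so the helper normalizes both cases to an empty string.

```python
import re

# The single regex from the answer, unchanged.
PATTERN = re.compile(
    r"A\|\d*\|(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*))\|.+?\|Z"
)


def capture(message):
    """Return the five captured components, or None if the message does not match."""
    match = PATTERN.fullmatch(message)
    if match is None:
        return None
    # Treat a group that did not participate the same as an empty capture.
    return [group or "" for group in match.groups()]


for line in (
    "A|1|1^^3^4^5|loongText|Z",
    "A|13|^2^|loongText|Z",
    "A|13|1^2^3|loongText|Z",
):
    print(capture(line))
```

Each optional `(?:(\d*)\^?)?` consumes at most one component and its trailing caret, which is what keeps the groups aligned by position.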
