I'm trying to figure out how to repeat a capture group over the comma-separated values in the following URL string:

id=1,2;name=user1,user2,user3;city=Oakland,San Francisco,Seattle;zip=94553,94523;

I'm using this RegExp, which returns the results I want except for the values. They're dynamic (there could be 2, 3, 4, etc. users in the URL parameter), so I was wondering whether I could create a capture group for each value instead of getting user1,user2,user3 as one capture group.

RegExp: (^|;|:)(\w+)=([^;]+)*

Here is a live demo of it online using RegExp

Example Output:

  • Group1 - (semi-colon,colon)
  • Group2 - (key ie. id,name,city,zip)
  • Group3 - (value1)
  • Group4 - (value2) *if exists
  • Group5 - (value3) *if exists
  • Group6 - (value4) *if exists

etc... based on the dynamic values like I explained before.

Question: What's wrong with my expression? I'm using the * to loop over repeated patterns.

  • What is your expected output? I think this could be done without the use of a regexp. Commented Apr 17, 2017 at 23:55
  • Do you expect a result like: { "id": ["1", "2"], "name": ["user1", "user2", "user3"], "city": ["Oakland", "San Francisco", "Seattle"], "zip": ["94553", "94523"] }? Commented Apr 17, 2017 at 23:58
  • @ibrahimmahrir I gave example output above. The values are dynamic, like user1,user2,etc., so basically I want each value in its own capture group. Commented Apr 17, 2017 at 23:58
  • No! I'm talking about the final output, not the output of the regex. How do you want the data to look at the end? Commented Apr 17, 2017 at 23:59
  • Is this what you are trying to do? regex101.com/r/2HQ8dv/2 Commented Apr 18, 2017 at 0:23

2 Answers

Regex doesn't support what you're trying to do. When the engine enters the capturing group a second time, it overwrites what it had captured the first time. Consider a simple example (thanks regular-expressions.info): /(abc|123)+/ used on 'abc123'. It will match "abc" then see the plus and try again, matching the "123". The final capturing group in the output will be "123".

This happens no matter what pattern you try, and any quantifier limit you set simply changes when the regex will accept the string. Consider /(abc|123){2}/. Matched against the whole string (that is, anchored as /^(abc|123){2}$/), it accepts 'abc123' with the capturing group as "123" but not 'abc123abc'. Putting a capturing group inside another doesn't work either. When you create a capturing group, it's like creating a variable. It can only have one value, and subsequent values overwrite the previous one. You'll never be able to have more capturing groups than you have pairs of parentheses (you can definitely have fewer, though).
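
To see the overwriting for yourself, here is a minimal sketch in JavaScript. The variable names are mine, and the {2} pattern is anchored with ^ and $ on the assumption that "accepts" means the whole string has to match.

```javascript
// A repeated group keeps only what it captured on its last iteration.
const match = /(abc|123)+/.exec('abc123');
const wholeMatch = match[0];  // the full match, both iterations
const lastCapture = match[1]; // only the final iteration of the group

// Anchored so the entire string must consist of exactly two repetitions.
const exactlyTwo = /^(abc|123){2}$/;
const acceptsTwo = exactlyTwo.test('abc123');
const acceptsThree = exactlyTwo.test('abc123abc');
const captureOfTwo = exactlyTwo.exec('abc123')[1];

console.log(wholeMatch, lastCapture);   // abc123 123
console.log(acceptsTwo, acceptsThree);  // true false
console.log(captureOfTwo);              // 123
```

However many times the group repeats, match.length stays at 2: one slot for the whole match and one for the single pair of parentheses.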

A possible fix then would be to split the string on ';', then each of those on '=', then the right-hand side of those on ','. That would get you [['id', '1', '2'], ['name', 'user1', ...], ['city', ...], ['zip', ...]].

That comes out to be:

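function parseParams(str) {
  // Split on ';' or ':'. A string argument such as ';|:' would be matched literally.
  var afterSplit = str.split(/[;:]/);
  // A trailing separator leaves an empty string at the end, so drop it
  if (afterSplit[afterSplit.length - 1] === '') {
    afterSplit.pop();
  }
  for (var i = 0; i < afterSplit.length; i++) {
    afterSplit[i] = afterSplit[i].split('=');
    // Optionally, you can flatten the array from here to get something nicer
    afterSplit[i][1] = afterSplit[i][1].split(',');
  }
  return afterSplit;
}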
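
If you would still like a regex to locate the key=value pairs, one option is to let the regex find each pair and leave the comma-separated part to split. This is only a sketch: parseToObject is a hypothetical name, it assumes a runtime with String.prototype.matchAll (ES2020), and it assumes values never contain ';' or ':'.

```javascript
// Sketch: the regex finds each key=value pair, split(',') handles the values.
function parseToObject(str) {
  const result = {};
  // (?:^|[;:]) start of string or a separator, (\w+) the key, ([^;:]*) the raw value list
  const pairPattern = /(?:^|[;:])(\w+)=([^;:]*)/g;
  for (const [, key, values] of str.matchAll(pairPattern)) {
    result[key] = values.split(',');
  }
  return result;
}

const parsed = parseToObject(
  'id=1,2;name=user1,user2,user3;city=Oakland,San Francisco,Seattle;zip=94553,94523;'
);
console.log(parsed.name); // [ 'user1', 'user2', 'user3' ]
console.log(parsed.city); // [ 'Oakland', 'San Francisco', 'Seattle' ]
```

This returns an object keyed by parameter name rather than nested arrays, which is the shape suggested in the comments on the question.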

2 Comments

Although capturing groups don't repeat, in some cases you can simply duplicate the capturing group. For instance, say I'm parsing source code and I want to match a class declaration to get the implemented interfaces: Class X implements A, B, C, D. You can create the optional group (?:,\s+([^\s,]+))? (matches zero or one time) and repeat it: (?:,\s+([^\s,]+))?(?:,\s+([^\s,]+))?(?:,\s+([^\s,]+))? will now match up to 3 additional interfaces. The comma is excluded from the character class so that a capture doesn't swallow the separator that follows it. In Python it's even easier, because you can write pattern = r'(?:,\s+([^\s,]+))?' * 3, etc.
"When you create a capturing group, it's like creating a variable. It can only have one value and subsequent values overwrite the previous one."
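
The duplicated-group idea from the first comment can be sketched in JavaScript with String.prototype.repeat. The function name implementedInterfaces and the maxExtra parameter are hypothetical, and the character class excludes commas so that a capture never includes a trailing separator.

```javascript
// Sketch: one mandatory capture plus maxExtra copies of an optional capture.
function implementedInterfaces(declaration, maxExtra) {
  const optionalItem = '(?:,\\s*([^\\s,]+))?';
  const pattern = new RegExp(
    'implements\\s+([^\\s,]+)' + optionalItem.repeat(maxExtra)
  );
  const match = pattern.exec(declaration);
  if (match === null) {
    return [];
  }
  // Optional groups that did not participate in the match are undefined.
  return match.slice(1).filter((name) => name !== undefined);
}

const allFour = implementedInterfaces('Class X implements A, B, C, D', 3);
const capped = implementedInterfaces('Class X implements A, B, C, D', 2);
console.log(allFour); // [ 'A', 'B', 'C', 'D' ]
console.log(capped);  // [ 'A', 'B', 'C' ]
```

The second call shows the limitation: anything beyond the number of duplicated groups is silently dropped, so this only fits when you can pick a sensible upper bound.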

Capturing Group Repeated

String: !abc123def! regex: /!((abc|123|def)+)!/

Matches:

Group 1: abc123def

Group 2: def

source: https://www.regular-expressions.info/captureall.html
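
The same result can be reproduced in JavaScript. The last step is my own addition, not something from the linked page: once the outer group has captured the whole repeated run, a second global match over that run recovers each individual piece.

```javascript
// Outer group captures the whole run; inner group keeps only the last piece.
const result = /!((abc|123|def)+)!/.exec('!abc123def!');
const wholeRun = result[1];
const lastPiece = result[2];

// A second pass with the g flag lists every repetition inside the run.
const pieces = wholeRun.match(/abc|123|def/g);

console.log(wholeRun);  // abc123def
console.log(lastPiece); // def
console.log(pieces);    // [ 'abc', '123', 'def' ]
```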
