I'd like to extract the designator and the ops from a string of the form designator: op1 op2, where there can be zero or more ops and multiple spaces are allowed between them. I used the following regular expression in Python:

import re
match = re.match(r"^(\w+):(\s+(\w+))*", "des1: op1   op2")

The problem is that only des1 and op2 are found in the matching groups; op1 is not. Does anyone know why?

The groups from the code above are:
Group 0: des1: op1 op2
Group 1: des1
Group 2:  op2
Group 3: op2

3 Answers

Both are 'found', but a group can only 'capture' one of them: when a capturing group is repeated, it keeps only its last match, which is why op1 is overwritten by op2. If you need to capture more than one item, you need to use the regular expression functionality multiple times. You could do something like this, first by rewriting the main expression:

match = re.match(r"^(\w+):(.*)", "des1: op1   op2")

Then you need to extract the individual ops:

ops = re.split(r"\s+", match.groups()[1])[1:]
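
To make the capture behaviour concrete, here is a minimal sketch. The function name parse_two_step is made up for illustration, and I've swapped re.split for re.findall in the second step so that a missing space after the colon doesn't drop the first op; treat it as one possible variant, not the only way to do it.

```python
import re

LINE = "des1: op1   op2"

# The OP's pattern: the repeated group (\s+(\w+))* matches twice,
# but each group slot stores only its most recent match.
m = re.match(r"^(\w+):(\s+(\w+))*", LINE)
repeated_groups = m.groups()
print(repeated_groups)  # ('des1', '   op2', 'op2')


def parse_two_step(line):
    """Hypothetical helper: capture the tail once, then tokenize it."""
    m = re.match(r"^(\w+):(.*)", line)
    if m is None:
        return None  # the line has no "designator:" prefix
    # Second pass: pull every word out of the captured tail.
    return m.group(1), re.findall(r"\w+", m.group(2))


print(parse_two_step(LINE))     # ('des1', ['op1', 'op2'])
print(parse_two_step("des1:"))  # ('des1', [])
```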

3 Comments

What's the difference from the OP's regex?
Sorry, I accidentally submitted before finishing the post.
Ah, no bother. But if you go with two regexes, wouldn't it be more efficient just to use string methods?

I don't really see why you'd need regex, it's quite simple to parse with string methods:

>>> des, _, ops = 'des1: op1   op2'.partition(':')
>>> ops
' op1   op2'
>>> ops.split()
['op1', 'op2']
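
If you want this as a reusable function, something along these lines should do. The name parse_line and the choice to raise ValueError when the colon is missing are my own assumptions; adapt them to however you want to handle malformed input.

```python
def parse_line(line):
    """Hypothetical helper: split 'designator: op1 op2' with string methods only."""
    des, sep, rest = line.partition(":")
    if not sep:
        # partition() returns an empty separator when ':' is absent
        raise ValueError("expected 'designator: ops', got %r" % line)
    # split() with no arguments collapses runs of whitespace
    # and returns [] for an empty or all-whitespace string.
    return des.strip(), rest.split()


print(parse_line("des1: op1   op2"))  # ('des1', ['op1', 'op2'])
print(parse_line("des1:"))            # ('des1', [])
```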

1 Comment

I didn't consider that split() could be used to split components separated with multiple spaces. I believe this also works. Thanks!

I'd do something like this:

>>> import re
>>> tokenize = re.compile(flags=re.VERBOSE, pattern=r"""
...     (?P<de> \w+ (?=:) ) |   # a word directly followed by ':' is the designator
...     (?P<op> \w+)            # any other word is an op
... """).finditer
>>> for each in tokenize("des1: op1   op2"):
...     print(each.lastgroup, ':', each.group())
...
de : des1
op : op1
op : op2
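
If you'd rather get the result back as values than print it, the same tokenizer can feed a small collector. This is only a sketch; TOKEN and parse_tokens are names I made up, and it assumes the designator is the only word followed directly by a colon.

```python
import re

TOKEN = re.compile(r"""
    (?P<de> \w+ (?=:) )   # a word directly followed by ':' is the designator
  | (?P<op> \w+ )         # any other word is an op
""", re.VERBOSE)


def parse_tokens(line):
    """Hypothetical helper: sort tokens into a designator and a list of ops."""
    designator, ops = None, []
    for m in TOKEN.finditer(line):
        # lastgroup is the name of the alternative that matched
        if m.lastgroup == "de":
            designator = m.group()
        else:
            ops.append(m.group())
    return designator, ops


print(parse_tokens("des1: op1   op2"))  # ('des1', ['op1', 'op2'])
```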
