0

Is it possible to match more than one optional regex to a string, in any order (but retrievable in a specific order)?

For example,

s = '(pattern1)(pattern2)(pattern3)'

such that

match = re.search(s, 'pattern2 pattern1')
match = re.search(s, 'pattern1 pattern3 pattern2')
match = re.search(s, 'pattern3 pattern1')

and every other permutation matches, and furthermore

match.groups()

always returns pattern1, pattern2, pattern3 in the same order, even if one or more are None.

I know this sounds unlikely -- just wondering if and how it can be done.

1
  • Are you looking for s = 'pattern1|pattern2|pattern3', where the pipe separates your options? Commented Feb 18, 2013 at 21:47

2 Answers

1

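Do you mean something like this?

import re

s = '(pattern1|pattern2|pattern3)'
# findall returns the matches in the order they occur; sorted() puts them in alphabetical order
match = sorted(re.findall(s, 'pattern1 pattern3 pattern2'))
print(match)
# ['pattern1', 'pattern2', 'pattern3']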
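Note that sorted() orders the matches alphabetically and leaves out any pattern that is absent, so it does not give a None placeholder. If a fixed order with None for missing patterns is what is wanted, one possible sketch is to search for each pattern separately. The function name find_in_fixed_order and the patterns list are hypothetical names used only for illustration.

```python
import re

# Hypothetical patterns, listed in the order the results should come back in.
patterns = ['pattern1', 'pattern2', 'pattern3']

def find_in_fixed_order(patterns, text):
    """Search for each pattern separately; None marks a pattern that is absent."""
    results = []
    for p in patterns:
        m = re.search(p, text)
        results.append(m.group(0) if m else None)
    return tuple(results)

print(find_in_fixed_order(patterns, 'pattern2 pattern1'))
# ('pattern1', 'pattern2', None)
```

This runs one search per pattern instead of one combined regex, which keeps the result order independent of where each pattern appears in the string.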

0

Step one: read the documentation for itertools and see which of its functions will generate the sets of matches you want. For example:

>>> import itertools
>>> a=['aaa','bbb','ccc']
>>> for q in itertools.permutations(a):
...   print(q)
... 
('aaa', 'bbb', 'ccc')
('aaa', 'ccc', 'bbb')
('bbb', 'aaa', 'ccc')
('bbb', 'ccc', 'aaa')
('ccc', 'aaa', 'bbb')
('ccc', 'bbb', 'aaa')

To ensure that the matches are returned in a consistent manner, tag each part of the regexp with a named group, (?P<name>...). For example:

>>> rl=[]
>>> for q in itertools.permutations(a):
...   r=""
...   for ms in q:
...     r = r + "(?P<" + ms + ">" + ms + ")"
...   rl.append(r)
... 
>>> rl
['(?P<aaa>aaa)(?P<bbb>bbb)(?P<ccc>ccc)', '(?P<aaa>aaa)(?P<ccc>ccc)(?P<bbb>bbb)', '(?P<bbb>bbb)(?P<aaa>aaa)(?P<ccc>ccc)', '(?P<bbb>bbb)(?P<ccc>ccc)(?P<aaa>aaa)', '(?P<ccc>ccc)(?P<aaa>aaa)(?P<bbb>bbb)', '(?P<ccc>ccc)(?P<bbb>bbb)(?P<aaa>aaa)']

In the example above I've used the match strings as the group names in the (?P<...>) part of the expression. You could generate "name1", "name2", or similar instead.
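As a rough sketch of that variation, the group names can be kept separate from the pattern text by storing both in a dict. The dict named and its three sub-patterns are made up for illustration and are not part of the original example.

```python
import itertools

# Hypothetical sub-patterns: keys are the group names, values are the regex text.
named = {"name1": r"\d+", "name2": r"[a-z]+", "name3": r"[A-Z]+"}

rl = []
for order in itertools.permutations(named):
    # Build one regexp per ordering, wrapping each sub-pattern in its named group.
    rl.append("".join("(?P<%s>%s)" % (n, named[n]) for n in order))

print(rl[0])
# (?P<name1>\d+)(?P<name2>[a-z]+)(?P<name3>[A-Z]+)
```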

Finally, join up all the little regexps into one giant regexp:

onegiantregexp = "|".join(rl)

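Then use the match object's groupdict() method from the re module to get the results. Note that Python's built-in re module raises an error when the same group name appears more than once in a single pattern, so the joined pattern above will not compile as written; you can instead try each regexp in rl in turn.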
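A minimal sketch of the lookup step, under two assumptions that are not in the original answer: the permutation regexps are tried one at a time rather than compiled as a single joined pattern (so that group names can repeat across permutations), and \s* is placed between groups so that space-separated input matches. The function name match_any_order is hypothetical.

```python
import itertools
import re

a = ['aaa', 'bbb', 'ccc']

# One regexp per ordering; \s* between groups is an assumption to allow spaces.
rl = [
    r"\s*".join("(?P<%s>%s)" % (ms, ms) for ms in q)
    for q in itertools.permutations(a)
]

def match_any_order(text):
    """Return groupdict() of the first permutation regexp that matches, else None."""
    for r in rl:
        m = re.search(r, text)
        if m:
            return m.groupdict()
    return None

result = match_any_order('ccc aaa bbb')
# Read the groups back by name, in a fixed order, whatever order they appeared in.
print([result[name] for name in a])
# ['aaa', 'bbb', 'ccc']
```

As written, every sub-pattern must be present for a match. Supporting missing sub-patterns would need extra regexps for the shorter orderings, which this sketch does not attempt.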

Hope this helps
