38

I have a Python regular expression that contains a group which can occur zero or many times - but when I retrieve the list of groups afterwards, only the last one is present. Example:

re.search(r"(\w)*", "abcdefg").groups()

This returns the tuple ('g',)

I need it to return ('a','b','c','d','e','f','g',)

Is that possible? How can I do it?

2 Answers

40
re.findall(r"\w","abcdefg")
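As a quick check, here is a minimal sketch of what that call returns, plus a `re.finditer` variant for when the position of each match is also needed. The variable names are illustrative only.

```python
import re

text = "abcdefg"

# findall returns every non-overlapping match of the pattern as a list of strings
chars = re.findall(r"\w", text)
print(chars)

# Convert to a tuple if the exact shape from the question is needed
as_tuple = tuple(chars)
print(as_tuple)

# finditer yields match objects, which also carry the start offset of each match
positions = [(m.group(), m.start()) for m in re.finditer(r"\w", text)]
print(positions)
```

Note that this works because the repeated part is the whole pattern here; if the repeated group is embedded in a larger expression, the larger match has to be isolated first.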

1 Comment

+1: You can't do it with a single regex capture; you have to do it another way.
33

In addition to Douglas Leeder's solution, here is the explanation:

In regular expressions, the group count is fixed by the pattern. Placing a quantifier after a group does not increase the group count (imagine every other group index shifting because an earlier group matched more than once).

A quantified group is the way to treat a complex sub-expression as a single unit when it needs to match more than once. The regex engine has no choice but to save only the last match in the group. In short: there is no way to achieve what you want with a single "unarmed" regular expression, so you have to find another way.
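To illustrate the point with Python's standard `re` module, here is a small sketch showing that the group count belongs to the pattern, that the group keeps only its last iteration, and one possible two-step workaround (capture the whole run, then split it). The names `run` and `pieces` are made up for this example.

```python
import re

pattern = re.compile(r"(\w)*")

# The number of groups is a property of the compiled pattern, not of the match
group_count = pattern.groups
print(group_count)

match = pattern.search("abcdefg")
whole = match.group(0)   # the overall match spans every repetition
last = match.groups()    # the group itself holds only its final iteration
print(whole, last)

# Workaround sketch: capture the repeated run once, then split it in a second pass
run = re.search(r"(\w*)", "abcdefg").group(1)
pieces = tuple(re.findall(r"\w", run))
print(pieces)
```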

2 Comments

As an addition: modern regex implementations such as the one in .NET let you access earlier occurrences of a group, not just the last one. Therefore the above statement is not universally true, but it still holds for most implementations.
For the record, there's a regex implementation for Python which also permits access to all of the matches of a capture group: pypi.python.org/pypi/regex
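A minimal sketch of that approach, assuming the third-party `regex` package is installed (`pip install regex`). Its match objects offer a `captures()` method that returns every capture of a group rather than only the last one. The import guard and the variable names are just for this example.

```python
# Assumption: the third-party "regex" package may or may not be installed
try:
    import regex
except ImportError:
    regex = None

all_captures = None
if regex is not None:
    m = regex.search(r"(\w)*", "abcdefg")
    last_only = m.group(1)        # same behaviour as the standard re module
    all_captures = m.captures(1)  # every iteration of group 1, in order
    print(last_only, all_captures)
```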
