I have Python code like this:

import re

a = 'xyxy123'
b = re.findall(r'x*', a)
print(b)

This is the result:

['x', '', 'x', '', '', '', '', '']  

How come b has eight elements when a only has seven characters?

2 Answers

There are eight "spots" in the string:

|x|y|x|y|1|2|3|

Each of them is a location where a regex match could start. Since your regex matches the empty string (because x* allows zero copies of x), each spot generates exactly one match, and that match is appended to the list in b. At most spots the match is empty. The exceptions are the two spots where a longer match, x, begins; as msalperen's answer quotes from the documentation,

Empty matches are included in the result unless they touch the beginning of another match,

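so the empty matches at the first and third locations are not included; those two spots contribute 'x' instead. That gives two 'x' entries plus six empty strings, eight elements in total.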
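To see the "spots" directly, here is a small sketch that uses re.finditer instead of re.findall, so each match reports where it starts and ends. The variable name spans is just mine for illustration.

```python
import re

a = 'xyxy123'

# Collect (start, end, matched_text) for every match of x* in a.
spans = [(m.start(), m.end(), m.group()) for m in re.finditer(r'x*', a)]

for start, end, text in spans:
    # An empty match has start == end; the 'x' matches span one character.
    print('%d %d %r' % (start, end, text))
```

Each of the eight positions 0 through 7 shows up once as a start index, and only positions 0 and 2 produce a non-empty match.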

According to the Python documentation (https://docs.python.org/2/library/re.html):

re.findall returns all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

So it returns all the results that match x*, including the empty ones.
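As a quick comparison (my own sketch, not from the documentation): if the empty matches are not wanted, x+ requires at least one x, so it cannot match the empty string.

```python
import re

a = 'xyxy123'

# x* may match zero characters, so empty strings appear in the result.
star = re.findall(r'x*', a)

# x+ needs at least one x, so only the non-empty matches remain.
plus = re.findall(r'x+', a)

print(star)
print(plus)
```

Here plus contains only the two 'x' entries, while star keeps the empty matches as well.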
