In Chapter 2, Section 2.1 of Python Cookbook 3rd Edition, you have the following:

>>> line = 'asdf fjdk; afed, fjek,asdf,      foo'
>>> import re
>>> re.split(r'[;,\s]\s*', line)
['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']

It is a nice example, but when I drop the trailing \s and apply the * directly to the character class, I still get the same result:

>>> re.split(r'[;,\s]*', line)
['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']

So what did the author have in mind by including the seemingly redundant \s*, when the shorter and simpler pattern gives the same output?

Any input is appreciated.

2 Answers

I don't have the book, so I don't know the authors' intent. But David Beazley is as sharp as they come, so my guess is that the point was to distinguish between the output for these two lines:

>>> line = 'asdf fjdk; afed, fjek,asdf,      foo'
>>> line = 'asdf fjdk; ; afed, fjek,asdf,      foo'

Using the regex from the book, the second line splits into:

['asdf', 'fjdk', '', 'afed', 'fjek', 'asdf', 'foo']

And using your modified regex, it splits into:

['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']

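Your regex treats any unbroken run of characters from the class [;,\s] as a single delimiter, so consecutive separators collapse and no empty field is produced. The book's regex consumes exactly one character from the class plus any whitespace after it, so "; ; " counts as two delimiters with an empty field between them.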
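To make the comparison easy to reproduce, here is a minimal sketch that runs both patterns on both lines. The names BOOK_PATTERN, COLLAPSING_PATTERN and split_fields are made up for illustration. One assumption to flag: on Python 3.7 and later, re.split also splits on empty matches, so a pattern such as [;,\s]* (which can match the empty string) would split between every character. The sketch therefore uses [;,\s]+ as a stand-in for the modified regex; it collapses runs in the same way.

```python
import re

# One delimiter character followed by optional whitespace (the book's regex).
BOOK_PATTERN = r'[;,\s]\s*'
# A whole run of delimiter characters; stand-in for [;,\s]* (assumption, see above).
COLLAPSING_PATTERN = r'[;,\s]+'


def split_fields(pattern, line):
    """Split line on pattern and return the list of fields."""
    return re.split(pattern, line)


line1 = 'asdf fjdk; afed, fjek,asdf,      foo'
line2 = 'asdf fjdk; ; afed, fjek,asdf,      foo'

for line in (line1, line2):
    print(split_fields(BOOK_PATTERN, line))
    print(split_fields(COLLAPSING_PATTERN, line))
```

Both patterns agree on the first line; only the second line, with its "; ; " sequence, shows the empty field under the book's pattern.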

1 Comment

Thank you for your input. Excellent example for using the book's regex.

The two regular expressions are not equivalent.

  • The first regex says the delimiter is a single comma, semicolon, or whitespace character, optionally followed by more whitespace.

  • The second regex says the delimiter is a run of commas, semicolons, or whitespace characters. Strictly, * means "zero or more"; "one or more" would be written with +.
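One way to see the two definitions in action is to look at what each pattern actually matches as a delimiter, rather than at the fields left over. The sketch below uses re.findall for that. The variable names are hypothetical, and [;,\s]+ is used in place of [;,\s]* as an assumption, because a pattern that can match the empty string would also report empty matches.

```python
import re

sample = 'fjdk;; afed'

# Book's regex: each match is one delimiter character plus trailing whitespace.
book_delimiters = re.findall(r'[;,\s]\s*', sample)

# Run-based regex (stand-in for [;,\s]*): the whole run is one match.
run_delimiters = re.findall(r'[;,\s]+', sample)

print(book_delimiters)  # [';', '; ']
print(run_delimiters)   # [';; ']
```

Two delimiter matches in a row leave an empty field between them when splitting; a single match over the whole run does not.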

Going by these definitions, you can see the difference by applying each regex to the following string:

line = 'asdf fjdk;; afed, fjek,asdf,      foo'

The results now differ:

>>> re.split(r'[;,\s]*', line)
['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']
>>> re.split(r'[;,\s]\s*', line)
['asdf', 'fjdk', '', 'afed', 'fjek', 'asdf', 'foo']

Which regex you want depends on the input you need to handle and on the output you expect for every acceptable test case.
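As a rough illustration of choosing by test cases, the sketch below checks each pattern against two made-up requirements: one where an empty field between delimiters must be kept, and one where repeated delimiters count as a single separator. The helper passes_all and both case lists are hypothetical, and [;,\s]+ again stands in for [;,\s]* (an assumption, since on Python 3.7 and later re.split also splits on empty matches).

```python
import re

BOOK_PATTERN = r'[;,\s]\s*'
RUN_PATTERN = r'[;,\s]+'  # stand-in for [;,\s]* (assumption)


def passes_all(pattern, cases):
    """Return True if re.split(pattern, text) equals expected for every case."""
    return all(re.split(pattern, text) == expected for text, expected in cases)


# Hypothetical requirement: an empty field between two delimiters is preserved.
keep_empty_cases = [
    ('a, b;c', ['a', 'b', 'c']),
    ('a,,b', ['a', '', 'b']),
]

# Hypothetical requirement: repeated delimiters count as one separator.
collapse_cases = [
    ('a, b;c', ['a', 'b', 'c']),
    ('a,,b', ['a', 'b']),
]

print(passes_all(BOOK_PATTERN, keep_empty_cases))  # True
print(passes_all(RUN_PATTERN, keep_empty_cases))   # False
print(passes_all(BOOK_PATTERN, collapse_cases))    # False
print(passes_all(RUN_PATTERN, collapse_cases))     # True
```

Neither pattern is wrong in general; each satisfies one of the two requirements and fails the other.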

1 Comment

Thank you for your input. Excellent example for using the book's regex.
