Split string on whitespace in Python [duplicate]

Question

I'm looking for the Python equivalent of

String str = "many   fancy word \nhello    \thi";
String whiteSpaceRegex = "\\s";
String[] words = str.split(whiteSpaceRegex);

["many", "fancy", "word", "hello", "hi"]

Sven Marnach · Accepted Answer · 2022-04-07 09:25:55Z

1227

The str.split() method without an argument splits on whitespace:

>>> "many   fancy word \nhello    \thi".split()
['many', 'fancy', 'word', 'hello', 'hi']
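
As a minimal sketch of why the no-argument form is usually what you want, compare it with splitting on a single space. The padded input string and the variable names below are illustrative assumptions, not part of the original answer:

```python
s = "  many   fancy word \nhello    \thi  "

# No argument: any run of whitespace counts as one separator,
# and leading/trailing whitespace is discarded.
words = s.split()

# Explicit " ": every single space is a separator, so empty strings
# appear and "\n" / "\t" stay inside the pieces.
pieces = s.split(" ")

print(words)
print(pieces)
```

The second list contains empty strings and entries such as '\nhello', which is rarely the desired result.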

edited Apr 7, 2022 at 9:25

user3064538

answered Nov 13, 2011 at 18:46

Sven Marnach


9 Comments

yak Over a year ago

Also good to know is that if you want the first word only (which means passing 1 as second argument), you can use None as the first argument: s.split(None, 1)

Raymond Hettinger Over a year ago

If you only want the first word, use str.partition.

user3527975 Over a year ago

@yak: Can you please edit your comment? As written, it sounds as if s.split(None, 1) returns only the first word. It actually returns a list of at most two items: the first word and the rest of the string. s.split(None, 1)[0] returns just the first word.
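
A small sketch of the two approaches mentioned in these comments, assuming an illustrative string with no leading whitespace:

```python
s = "many   fancy word"

# sep=None with maxsplit=1: split once, on the first run of whitespace.
head_and_rest = s.split(None, 1)   # ['many', 'fancy word']
first_word = head_and_rest[0]      # 'many'

# str.partition needs an explicit separator and always returns a 3-tuple;
# it splits at the first single space, so the extra spaces stay in the tail.
head, sep, tail = s.partition(" ")

print(head_and_rest, first_word)
print((head, sep, tail))
```

Note that the two are not interchangeable on every input: with leading whitespace, s.partition(" ") yields an empty head, and on an empty string s.split(None, 1)[0] raises IndexError.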

Sven Marnach Over a year ago

@galois No, it uses a custom implementation (which is faster). Also note that it handles leading and trailing whitespace differently.

Sven Marnach Over a year ago

@KishorPawar It's rather unclear to me what you are trying to achieve. Do you want to split on whitespace, but disregard whitespace inside single-quoted substrings? If so, you can look into shlex.split(), which may be what you are looking for. Otherwise I suggest asking a new question – you will get a much quicker and more detailed answer.
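
For reference, a minimal sketch of the shlex.split() suggestion; the input line is a made-up example, since the commenter's actual data is not shown:

```python
import shlex

line = "many 'fancy word' hello"

# Plain whitespace split breaks the quoted phrase apart.
plain = line.split()

# shlex.split follows shell-like quoting rules, so the quoted
# substring stays together and the quotes are removed.
quoted = shlex.split(line)

print(plain)
print(quoted)
```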


Óscar López · Accepted Answer · 2011-11-13 18:49:54Z

93

import re
s = "many   fancy word \nhello    \thi"
# Use a raw string so "\s" is not treated as an invalid string escape.
re.split(r'\s+', s)

answered Nov 13, 2011 at 18:49

Óscar López


3 Comments

Gulzar Over a year ago

This gives me an extra empty token at the end of the line. No idea why; the original line doesn't even have that. Maybe this ignores the newline?

Óscar López Over a year ago

@Gulzar do a strip() at the end
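
A sketch of what is probably happening here, assuming the line ends with whitespace such as a newline (the sample string is illustrative): re.split returns an empty string after a trailing separator, and stripping the input before splitting avoids it.

```python
import re

s = "many   fancy word \nhello    \thi \n"

# The trailing " \n" matches \s+, so an empty string follows it.
raw = re.split(r"\s+", s)

# Stripping first removes leading/trailing whitespace before the split.
clean = re.split(r"\s+", s.strip())

print(raw)
print(clean)
```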

Mark Jin Over a year ago

Note that this is usually slower than str.split if performance is an issue.
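
If you want to check the performance claim on your own machine, a rough timing sketch with timeit could look like this. The input size and repeat count are arbitrary assumptions, and the numbers will vary by Python version and hardware:

```python
import re
import timeit

# Repeat the sample to get a string large enough to time meaningfully.
s = "many   fancy word \nhello    \thi" * 1000
pattern = re.compile(r"\s+")

t_split = timeit.timeit(lambda: s.split(), number=200)
t_regex = timeit.timeit(lambda: pattern.split(s), number=200)

print(f"str.split: {t_split:.4f}s  re.split: {t_regex:.4f}s")
```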

Rob Grossman · Accepted Answer · 2017-02-21 14:25:20Z

31

Using split() is the most Pythonic way to split a string on whitespace.

It's also useful to remember that if you call split() on a string that contains no whitespace, you get back a list containing that string as its only element.

Example:

>>> "ark".split()
['ark']
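
A few related edge cases, as a small sketch (the variable names are just for illustration): empty and whitespace-only strings give an empty list with the no-argument form, but not with an explicit separator.

```python
single = "ark".split()          # one word, no whitespace
empty = "".split()              # empty string
blank = "   \n".split()         # whitespace only
empty_with_sep = "".split(" ")  # explicit separator behaves differently

print(single, empty, blank, empty_with_sep)
```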

edited Feb 21, 2017 at 14:25

answered Feb 21, 2017 at 14:18

Rob Grossman


Avinash Raj · Accepted Answer · 2017-11-29 07:39:27Z

22

Another method uses the re module. It does the reverse operation: it matches all the words instead of splitting the whole sentence on whitespace.

>>> import re
>>> s = "many   fancy word \nhello    \thi"
>>> re.findall(r'\S+', s)
['many', 'fancy', 'word', 'hello', 'hi']

The regex above matches one or more non-whitespace characters.
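
As a hedged sketch of where the matching approach can help: it produces no empty strings for leading or trailing whitespace, and re.finditer also gives the position of each word. The padded input and the variable names are illustrative assumptions:

```python
import re

s = "  many   fancy word \nhello    \thi  "

# Matching words never yields empty strings, even with padding.
by_findall = re.findall(r"\S+", s)

# finditer exposes match objects, so the start offset is available too.
with_offsets = [(m.group(), m.start()) for m in re.finditer(r"\S+", s)]

print(by_findall)
print(with_offsets[0])
```

For this input the result of re.findall is the same list that s.split() returns.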

edited Nov 29, 2017 at 7:39

answered Jun 17, 2015 at 18:33

Avinash Raj
