I am trying to remove all spaces, tabs, and newlines from a string in Python 2.7 on Linux.

I wrote this, which I expected to do the job:

myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = myString.strip(' \n\t')
print myString

output:

I want to Remove all white   spaces, new lines 
 and tabs

It seems like a simple thing to do, yet I am missing something here. Should I be importing something?

8 Answers

185

Use str.split([sep[, maxsplit]]) with no sep or sep=None:

From docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

Demo:

>>> myString.split()
['I', 'want', 'to', 'Remove', 'all', 'white', 'spaces,', 'new', 'lines', 'and', 'tabs']

Use str.join on the returned list to get this output:

>>> ' '.join(myString.split())
'I want to Remove all white spaces, new lines and tabs'
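
For context on why the attempt in the question leaves the inner whitespace alone: str.strip only trims the listed characters from the two ends of the string. Below is a minimal sketch comparing the approaches; the helper name collapse_whitespace is made up for illustration.

```python
my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# strip() only trims the listed characters from both ends of the string;
# whitespace in the middle is untouched.
stripped = my_string.strip(' \n\t')


def collapse_whitespace(text):
    """Hypothetical helper: split on whitespace runs, rejoin with one space."""
    return ' '.join(text.split())


# Collapse every whitespace run to a single space.
collapsed = collapse_whitespace(my_string)

# Join with an empty string instead to drop whitespace entirely.
no_whitespace = ''.join(my_string.split())
```

Joining with ' ' keeps the words separated, while joining with '' removes every whitespace character, which is closer to what the question literally asks for.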

86

If you want to remove multiple whitespace items and replace them with single spaces, the easiest way is with a regexp like this:

>>> import re
>>> myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
>>> re.sub(r'\s+', ' ', myString)
'I want to Remove all white spaces, new lines and tabs '

You can then remove the trailing space with .strip() if you want to.
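
Putting the two steps together might look like the sketch below. It assumes you want single spaces between words and nothing at either end; the variable names are only illustrative.

```python
import re

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# Collapse each run of whitespace to one space, then trim both ends.
normalized = re.sub(r'\s+', ' ', my_string).strip()
```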

1 Comment

This is the cleanest solution
24

Use the re library

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = re.sub(r"[\n\t\s]*", "", myString)
print myString

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

2 Comments

This is a correction of the original answer given by @TheGr8Adakron, not a duplicate
This does not preserve the spaces between the words, rendering the text useless for NLP.
14

This removes only whitespace (tabs, newlines, and spaces) and nothing else.

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
output   = re.sub(r"[\n\t\s]*", "", myString)
print output

OUTPUT:

IwanttoRemoveallwhitespaces,newlinesandtabs

Good day!

1 Comment

Thanks for the solution - I think a minor correction is needed, it should be '+' instead of '*'.
11
import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print re.sub(r"\W", "", mystr)

Output : IwanttoRemoveallwhitespacesnewlinesandtabs

1 Comment

this also removes ';'
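
To illustrate the comment above: \W matches any character that is not a letter, digit, or underscore, so punctuation is removed along with whitespace, whereas \s matches whitespace only. A small sketch with a made-up sample string:

```python
import re

sample = "a;b, c\td\ne_f"

# \W matches every non-word character, so ';' and ',' are removed too.
without_non_word = re.sub(r"\W", "", sample)

# \s matches only whitespace, so the punctuation survives.
without_whitespace = re.sub(r"\s", "", sample)
```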
11

The regex solutions above aren't ideal: this is a small task, and regex carries more overhead than it justifies.

Here's what I do:

myString = myString.replace(' ', '').replace('\t', '').replace('\n', '')

or, if you have so many things to remove that a single-line solution would be gratuitously long:

removal_list = [' ', '\t', '\n']
for s in removal_list:
    myString = myString.replace(s, '')
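
The overhead claim can be checked rather than taken on faith. Here is a rough sketch using the standard timeit module; the function names are made up for illustration, and the timings depend on the machine, input, and Python version, so no figures are shown.

```python
import re
import timeit

text = "I want to Remove all white \t spaces, new lines \n and tabs \t" * 100


def with_replace(s):
    # Chained str.replace calls, one per character to remove.
    for ch in (' ', '\t', '\n'):
        s = s.replace(ch, '')
    return s


def with_regex(s):
    # Single regex pass over the same three characters.
    return re.sub(r'[ \t\n]+', '', s)


# Both approaches should produce the same string.
same_result = with_replace(text) == with_regex(text)

replace_seconds = timeit.timeit(lambda: with_replace(text), number=1000)
regex_seconds = timeit.timeit(lambda: with_regex(text), number=1000)
print("replace: %.4fs  regex: %.4fs" % (replace_seconds, regex_seconds))
```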

1 Comment

Arguably this solution is the most readable and memorable.
3

How about a one-liner using a list comprehension within join?

>>> foobar = "aaa bbb\t\t\tccc\nddd"
>>> print(foobar)
aaa bbb                 ccc
ddd

>>> print(''.join([c for c in foobar if c not in [' ', '\t', '\n']]))
aaabbbcccddd
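
A variation on the same idea, assuming you want to drop every whitespace character and not just the three listed: str.isspace also covers carriage returns, form feeds, and vertical tabs. The sample string here is extended with a trailing "\r\n" to show that.

```python
foobar = "aaa bbb\t\t\tccc\nddd\r\n"

# Keep only the characters that are not whitespace of any kind.
compact = ''.join(c for c in foobar if not c.isspace())
```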

2

Since none of the other answers covers a more involved case, I wanted to share this, as it helped me out.

This is what I originally used:

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
print("{}".format(r.content))

Undesired Result:

b'<!DOCTYPE html>\r\n\r\n\r\n    <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive">\r\n\r\n    <head>\r\n\r\n        <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>\r\n        <link

This is what I changed it to:

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
regex = r'\s+'
print("CNT: {}".format(re.sub(regex, " ", r.content.decode('utf-8'))))

Desired Result:

<!DOCTYPE html> <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive"> <head> <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>

The precise regex that @MattH mentioned was what worked for me when fitting it into my code. Thanks!

Note: This is Python 3.
