4

How can I remove specific whitespace within a string in Python?

My input string is:

str1 = """vendor_id\t: GenuineIntel
        cpu family\t: 6
        model\t\t: 58
        model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
        stepping\t: 9
        cpu MHz\t\t: 2485.659
        cache size\t: 6144 KB
        fpu\t\t: yes
        fpu_exception\t: yes
        cpuid level\t: 5
        wp\t\t: yes"""

My required output is:

>>> print str1
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
stepping: 9
cpu MHz: 2485.659
cache size: 6144 KB
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
2
  • 1
    Are you sure your output string doesn't have any spaces in it? Commented Feb 23, 2014 at 9:46
  • 1
    I've edited your sample to a) be valid Python, and b) replace the existing tabs in it with \t characters, since that's what is important here but wasn't visible. Please do use print repr(str1) to show us what is really in there if you can. Commented Feb 23, 2014 at 10:06

5 Answers

7

Looks like you want to remove the whitespace from the start of lines, and remove all whitespace before a colon. Use regular expressions:

import re

re.sub(r'(^[ \t]+|[ \t]+(?=:))', '', str1, flags=re.M)

This matches spaces and tabs at the start of a line (^[ \t]+: ^ anchors at the start of a line, [ \t] is a space or tab, and + means one or more), or spaces and tabs immediately before a colon ([ \t]+ is one or more spaces or tabs, and (?=:) is a lookahead requiring that a : character follows without including it in the match), and replaces each match with an empty string. The flags=re.M argument makes ^ match at the start of every line rather than only at the start of the string.

Demo:

>>> import re
>>> str1 = """vendor_id\t: GenuineIntel
...         cpu family\t: 6
...         model\t\t: 58
...         model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
...         stepping\t: 9
...         cpu MHz\t\t: 2485.659
...         cache size\t: 6144 KB
...         fpu\t\t: yes
...         fpu_exception\t: yes
...         cpuid level\t: 5
...         wp\t\t: yes"""
>>> print re.sub(r'(^[ \t]+|[ \t]+(?=:))', '', str1, flags=re.M)
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
stepping: 9
cpu MHz: 2485.659
cache size: 6144 KB
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
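
The demo above uses the Python 2 print statement. As a minimal sketch for Python 3, the same substitution can be wrapped in a small function; the names clean_cpuinfo and _WS_PATTERN are made up for illustration and are not part of the original answer:

```python
import re

# Spaces/tabs at the start of any line, or spaces/tabs directly before a colon.
_WS_PATTERN = re.compile(r'^[ \t]+|[ \t]+(?=:)', flags=re.MULTILINE)


def clean_cpuinfo(text):
    """Remove line-leading whitespace and whitespace right before colons."""
    return _WS_PATTERN.sub('', text)


sample = "vendor_id\t: GenuineIntel\n        cpu family\t: 6\n        model\t\t: 58"
print(clean_cpuinfo(sample))
```

Note that the pattern removes whitespace before every colon on a line, not only the first one, so a value that itself contains " :" would be altered as well.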

If your input string does not have leading whitespace (and you just indented your sample yourself to make it look lined up), then all you want to remove is tabs:

str1 = str1.replace('\t', '')

and be done with it.

4

I don't know what you mean by "randomly", but you can remove all tabs with:

str1 = str1.replace("\t", "")

1 Comment

That won't be enough here if the leading whitespace or whitespace before the : characters contains regular spaces too.
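
To illustrate the comment above, here is a small sketch with a hypothetical line (not taken from the question) that mixes regular spaces with a tab. Removing only the tabs leaves both the leading spaces and the space before the colon in place:

```python
# Hypothetical input: four leading spaces, then a space and a tab before the colon.
line = "    cpu family \t: 6"

tabs_only = line.replace("\t", "")
print(repr(tabs_only))  # '    cpu family : 6'
```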
2

This will solve your problem:

str1 = """vendor_id\t: GenuineIntel
    cpu family\t: 6
    model\t\t: 58
    model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
    stepping\t: 9
    cpu MHz\t\t: 2485.659
    cache size\t: 6144 KB
    fpu\t\t: yes
    fpu_exception\t: yes
    cpuid level\t: 5
    wp\t\t: yes"""
# Strip leading/trailing whitespace from each line and drop the tabs before the colon.
arr = [line.strip().replace('\t', '') for line in str1.split('\n')]
for line in arr:
    print(line)

1
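def invitation_ics():
    # Continuation lines start at column 0 so that no indentation
    # ends up inside the string.
    text = f"""BEGIN:VCALENDAR
CLASS:PUBLIC
STATUS:CONFIRMED
"""
    return text

The output contains no tabs or leading spaces:

BEGIN:VCALENDAR
CLASS:PUBLIC
STATUS:CONFIRMED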
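
If the string literal should stay indented along with the surrounding code, one alternative (not part of this answer) is the standard library's textwrap.dedent, which removes the whitespace common to all lines. A minimal sketch, reusing the function name from the snippet above:

```python
import textwrap


def invitation_ics():
    # The backslash after the opening quotes suppresses the first newline,
    # so every content line shares the same eight-space indentation.
    text = textwrap.dedent("""\
        BEGIN:VCALENDAR
        CLASS:PUBLIC
        STATUS:CONFIRMED
        """)
    return text


print(invitation_ics())
```

dedent only strips indentation shared by every line, so it would leave the question's str1 unchanged as written, because its first line has no leading whitespace.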

0
str1 = str1.replace("\t", "").replace(" ", "")

This removes the tabs first and then the spaces.

1 Comment

This replaces all whitespace with nothing. The spaces after the colon and in between the words should be preserved.
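
As a sketch of a non-regex approach that keeps those spaces, each line can be split at its first colon with str.partition and the two halves trimmed separately. The helper names clean_line and clean_text are hypothetical, and the sketch assumes every key is separated from its value by the first colon on the line:

```python
def clean_line(line):
    """Trim a 'key : value' line without touching spaces inside key or value."""
    key, sep, value = line.partition(":")
    if not sep:
        # No colon on this line: only trim the ends.
        return line.strip()
    # Rejoin with exactly one space after the colon; rstrip handles empty values.
    return (key.strip() + ": " + value.strip()).rstrip()


def clean_text(text):
    return "\n".join(clean_line(line) for line in text.splitlines())


sample = "        model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz\n        fpu\t\t: yes"
print(clean_text(sample))

# For contrast, removing every tab and space also merges the words:
print(sample.splitlines()[1].replace("\t", "").replace(" ", ""))  # fpu:yes
```

Unlike the chained replace calls, this normalizes the gap after the colon to a single space instead of deleting it.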
