113

I'm a beginner with both Python and RegEx, and I would like to know how to make a string that takes symbols and replaces them with spaces. Any help is great.

For example:

how much for the maple syrup? $20.99? That's ridiculous!!!

into:

how much for the maple syrup 20 99 That s ridiculous
2
  • My advice is to read the documentation for the re library. It includes some pretty good examples. Commented May 18, 2009 at 2:00
  • 15
    Strange this is marked as a duplicate of a question asked over a year later. Commented Jan 30, 2014 at 0:52

3 Answers

212

One way, using regular expressions:

>>> import re
>>> s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
>>> re.sub(r'[^\w]', ' ', s)
'how much for the maple syrup   20 99  That s ridiculous   '
  • \w will match alphanumeric characters and underscores

  • [^\w] will match anything that's not alphanumeric or underscore
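
Note that the output above keeps one space per replaced character, so runs of symbols leave runs of spaces. If the goal is the single-spaced result shown in the question, a minimal sketch (assuming Python 3, and assuming leading and trailing spaces should be trimmed) is to match runs with `\W+` and strip the result:

```python
import re

s = "how much for the maple syrup? $20.99? That's ridiculous!!!"

# \W+ matches each run of non-word characters, so a whole run
# (for example "? $") collapses into a single space.
cleaned = re.sub(r'\W+', ' ', s).strip()

print(cleaned)  # how much for the maple syrup 20 99 That s ridiculous
```

The variable name `cleaned` is only illustrative.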


6 Comments

It should be noted that ^\w outside of brackets means 'match a word character at the beginning of the string (or of a line, in multiline mode)'. Only inside brackets ( [^\w] ) does the caret negate the set, meaning 'match any character that is not listed here'.
Instead of [^\w] you can also use \W, which is the opposite of \w.
In fact [/\W+/g] will do the magic.
Will this work for a string containing the 'é' character? Will the output retain or remove this character?
r'[^\w]' equals r'\W'
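
On the 'é' question: as far as I know, in Python 3 `\w` is Unicode-aware by default for `str` patterns, so 'é' is kept, while passing `re.ASCII` restricts `\w` to `[a-zA-Z0-9_]` and 'é' gets replaced. A small sketch, assuming Python 3 and a precomposed 'é' (U+00E9); the sample string is made up for illustration:

```python
import re

s = "caf\u00e9?"  # "café?" with a precomposed é

# Default for str patterns in Python 3: é counts as a word character.
unicode_result = re.sub(r'\W', ' ', s)

# With re.ASCII, only [a-zA-Z0-9_] are word characters, so é is replaced.
ascii_result = re.sub(r'\W', ' ', s, flags=re.ASCII)

print(repr(unicode_result))  # 'café '
print(repr(ascii_result))    # 'caf  '
```

In Python 2 the default was the other way around for byte strings, so the result there depends on the string type and on the `re.UNICODE` flag.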
36

Sometimes it takes longer to figure out the regex than to just write it out in Python:

import string
s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
for char in string.punctuation:
    s = s.replace(char, ' ')

If you need other characters you can change it to use a white-list or extend your black-list.

Sample white-list:

whitelist = string.ascii_letters + string.digits + ' '  # string.letters was removed in Python 3
new_s = ''
for char in s:
    if char in whitelist:
        new_s += char
    else:
        new_s += ' '

Sample white-list using a generator-expression:

whitelist = string.ascii_letters + string.digits + ' '
new_s = ''.join(c if c in whitelist else ' ' for c in s)
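
Another regex-free option for the black-list approach is a translation table, which replaces every punctuation character in one pass instead of one `replace` call per character. This is a sketch assuming Python 3 (`str.maketrans` does not exist on Python 2 strings); the names `table`, `spaced` and `collapsed` are just for illustration:

```python
import string

s = "how much for the maple syrup? $20.99? That's ridiculous!!!"

# Map every ASCII punctuation character to a space.
table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
spaced = s.translate(table)

# Optionally collapse runs of whitespace and trim both ends.
collapsed = ' '.join(spaced.split())

print(repr(spaced))  # 'how much for the maple syrup   20 99  That s ridiculous   '
print(collapsed)     # how much for the maple syrup 20 99 That s ridiculous
```

Like the loop over `string.punctuation`, this only covers ASCII punctuation; other symbols would need to be added to the table.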

2 Comments

I just used this whitelist method for a project I'm working on. Thanks!
+1, pythonic, love it.
12

I often just open the console and look for the solution in the object's methods. Quite often it's already there:

>>> a = "hello ' s"
>>> dir(a)
[ (....) 'partition', 'replace' (....)]
>>> a.replace("'", " ")
'hello   s'

Short answer: Use str.replace().
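
Two things worth keeping in mind with this approach: `replace` returns a new string rather than changing the original, and each call handles only one substring, so several symbols need several calls. A minimal sketch, where the symbol list `"$.?"` is an arbitrary choice for the example:

```python
a = "hello ' s"

# Strings are immutable: replace() returns a new string and leaves `a` as it was.
b = a.replace("'", " ")
print(repr(a))  # "hello ' s"
print(repr(b))  # 'hello   s'

# One call per symbol, reassigning the result each time.
cleaned = "$20.99?"
for symbol in "$.?":
    cleaned = cleaned.replace(symbol, " ")
print(repr(cleaned))  # ' 20 99 '
```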

1 Comment

I think this answer is not complete looking at the problem
