Deleting numbers in a string using regex

Question

Replacing numbers with a placeholder in a string, including decimals and percentages, using re in Python

import re

def remove_numbers(text):
    remove = re.sub(r"\W\d\S*", " [DD]", text,)
    return remove

The function works fine on this sample string: sample = "I can give you 10% of 100,000 to you. The thing went up by 10% so it costs 12.25 euros now." But if a string starts with a number, the first number does not get replaced by the placeholder.

Where did it work perfectly? Can you add that, and also add more examples of input and output?
– Devesh Kumar Singh, Commented Jun 17, 2019 at 18:47
sample = "I can give 50% of 100,000 to you in cash. it went up by 2.3% and its costly."
– user11652296, Commented Jun 17, 2019 at 18:50
It worked on that string perfectly, but if the number is at the start of the string it doesn't seem to work.
– user11652296, Commented Jun 17, 2019 at 18:50
What is the expected output for "I can give 50% of 100,000 to you in cash. it went up by 2.3% and its costly"?
– Devesh Kumar Singh, Commented Jun 17, 2019 at 18:53

Rashid 'Lee' Ibrahim · Accepted Answer · 2019-06-17 18:49:40Z

1

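Looping over the digits and calling str.replace for each one seems to be the easiest way to do this. Note that every digit is replaced individually, so 10% becomes [DD][DD]%.

def remove_numbers(text):
    # each of the ten digits, listed once
    nums = '0123456789'
    for i in nums:
        text = text.replace(i, '[DD]')

    return text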

Mike Clark · Accepted Answer · 2019-06-17 18:55:09Z

\W will not match at the start of the string, because it has to consume a non-word character and there is nothing in front of the first digit. It appears you are using \W to make sure that the number you are replacing is not part of a word, which makes sense. You can use \A to match at start-of-string instead, but you probably don't want to add a space when you are replacing there. This can be done in a single regex, but I think the code is easier to read if you do it in two steps.

import re

def remove_numbers(text):
    # replace internal numbers that are not a part of a word (adds a space)
    remove = re.sub(r"\W\d\S*", " [DD]", text,)
    # replace number at start of string (if any) (does not add a space)
    remove = re.sub(r"\A\d\S*", "[DD]", remove,)
    return remove

a = "3 foxes jumped over 3 fences"
b = remove_numbers(a)

print("before <{}>".format(a))
print("after <{}>".format(b))
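For comparison, the single-regex variant mentioned above might look roughly like this. It is only a sketch: the name remove_numbers_single is made up for illustration, and using a replacement callback is just one way to leave out the leading space at start-of-string.

```python
import re

def remove_numbers_single(text):
    # Hypothetical one-pass variant: group 1 is either the empty match at
    # start-of-string (\A) or the non-word character in front of the number.
    def placeholder(match):
        # Add the leading space only when a real character was consumed.
        return (" " if match.group(1) else "") + "[DD]"
    return re.sub(r"(\A|\W)\d\S*", placeholder, text)

print(remove_numbers_single("3 foxes jumped over 3 fences"))
# [DD] foxes jumped over [DD] fences
```

Whether one pass with a callback or two plain re.sub calls reads better is a matter of taste; both give the same result on this input.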

APerson · Accepted Answer · 2019-06-17 18:49:48Z

0

\W has to consume a character, so when the string starts with a number there is nothing in front of the digit for it to match.

Use \b instead of \W to match word boundaries:

def remove_numbers(text):
    remove = re.sub(r"\b\d\S*", "[DD]", text,)
    return remove

Or, keeping more in the spirit of your original code:

def remove_numbers(text):
    remove = re.sub(r"(\s|^)\d\S*", r"\1[DD]", text,)
    return remove

Note that \S* already consumes any digits that follow the first one, so switching \d to \d+ only matters if you replace \S* with something narrower.
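The two patterns are not quite equivalent, which the following sketch tries to illustrate. The function names and the sample string are made up for the comparison; the patterns are the ones from this answer.

```python
import re

def with_word_boundary(text):
    # \b matches between a non-word and a word character, so "$5" qualifies.
    return re.sub(r"\b\d\S*", "[DD]", text)

def with_whitespace_or_start(text):
    # (\s|^) requires whitespace or start-of-string in front of the number.
    return re.sub(r"(\s|^)\d\S*", r"\1[DD]", text)

sample = "3 foxes cost $5 each, see item2"
print(with_word_boundary(sample))        # [DD] foxes cost $[DD] each, see item2
print(with_whitespace_or_start(sample))  # [DD] foxes cost $5 each, see item2
```

Neither version touches the digit in item2, since there is no word boundary or whitespace directly in front of it.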

Perplexabot · Accepted Answer · 2019-06-17 18:49:57Z

0

Do this:

import re
def remove_numbers(text):
    remove = re.sub(r"\W?\d\S*", " [DD]", text,)
    return remove.strip()

print(remove_numbers("3 foxes jumped over 3 fences"))

The ? means zero or one of the previous pattern, so the \W in front of the number becomes optional.
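One side effect may be worth checking before using this: because the leading \W is optional, a match can also start at a digit inside a word. The sketch below uses a made-up function name and made-up input strings to show what that looks like; whether it matters depends on your data.

```python
import re

def remove_numbers_optional(text):
    # Same pattern as in the answer above; \W? may match nothing at all.
    return re.sub(r"\W?\d\S*", " [DD]", text).strip()

print(remove_numbers_optional("3 foxes"))        # [DD] foxes
print(remove_numbers_optional("item2 costs 3"))  # item [DD] costs [DD]
```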

Omer Tekbiyik · Accepted Answer · 2019-06-17 18:50:37Z

0

Change your regex to:

    remove = re.sub(r"^\d+\s|\s\d+\s|\s\d+$", " [DD] ", text)

Full code:

import re
def remove_numbers(text):
    # raw string so \d and \s reach the regex engine unchanged
    s = re.sub(r"^\d+\s|\s\d+\s|\s\d+$", " [DD] ", text)
    # the replacement pads with spaces, so trim the ones left at the edges
    return s.strip()

t1 = "3 foxes jumped over 3 fences"
print(remove_numbers(t1))

Output:

[DD] foxes jumped over [DD] fences
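The alternation above only matches bare integers surrounded by whitespace, so tokens such as 10% or 12.25 from the question are left alone. A minimal sketch of one way to cover them, assuming numbers always appear as whitespace-delimited tokens; the pattern and the names NUMBER_TOKEN and replace_number_tokens are illustrative, not part of this answer.

```python
import re

# Hypothetical pattern, not from the answer above: a digit followed by more
# digits, dots or commas, then an optional percent sign, delimited by
# whitespace or the string edges on both sides.
NUMBER_TOKEN = re.compile(r"(?<!\S)\d[\d.,]*%?(?!\S)")

def replace_number_tokens(text):
    # Lookarounds consume no whitespace, so the spacing stays untouched.
    return NUMBER_TOKEN.sub("[DD]", text)

print(replace_number_tokens("3 foxes jumped over 3 fences"))
# [DD] foxes jumped over [DD] fences
print(replace_number_tokens("It went up by 10% so it costs 12.25 euros now."))
# It went up by [DD] so it costs [DD] euros now.
```

A token such as 3rd is left unchanged by this sketch, because the lookahead requires whitespace or end-of-string right after the number.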
