How can I remove a substring from the following string, 'AA.01.001 Hello', with a regular expression in Python?

I tried the following, but it did not work.

array= ['AB.01.001 Hello','BA.10.004','CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\\.[0-9]{2}\\.[0-9]{3}')
filtered = filter(lambda i: not regex.search(i), array)

Edit:

Expected output: ['Hello', 'Good bye']
  • I think the pattern works right? Just add \s* after it to account for the whitespaces and use a single backslash. regex101.com/r/Jkht46/1 See Python demo ideone.com/rZfGLP Commented Jul 7, 2019 at 13:31
  • Can you add some more examples? Will the string to replace always be in this format, 'AA.01.001'? Commented Jul 7, 2019 at 13:31
  • I think the issue is just the double backslashes. You are using a raw string. So you only need one backslash. Commented Jul 7, 2019 at 13:36

1 Answer

You may use re.sub:

import re
array= ['AB.01.001 Hello','BA.10.004','CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\s*')
filtered = filter(None, map(lambda i: regex.sub('', i), array))
print(list(filtered))
# => ['Hello', 'Good bye']

See the Python demo.

Note that I used only one \ to escape the literal dots (you are using a raw string literal to define the regex, so a single backslash is enough) and added \s* to also remove any whitespace that follows your pattern.
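To make the backslash point concrete, here is a minimal sketch comparing the two patterns on one of the sample strings. The variable names (double_escaped, single_escaped, sample) are only illustrative and are not part of the original code.

```python
import re

# In a raw string, '\\.' reaches the regex engine as two backslashes and a dot,
# which it reads as "a literal backslash, then any character".
double_escaped = re.compile(r'[A-Z]{2,3}\\.[0-9]{2}\\.[0-9]{3}')

# A single backslash escapes the dot, so it matches a literal '.'.
single_escaped = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\s*')

sample = 'AB.01.001 Hello'

# The sample contains no backslash, so the double-escaped pattern finds nothing.
no_match = double_escaped.search(sample)

# The single-escaped pattern matches the code plus the trailing space.
cleaned = single_escaped.sub('', sample)

print(no_match, repr(cleaned))
```

This is why the original attempt never matched: the pattern was looking for backslash characters that do not occur in the input.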

Details

  • map(lambda i: regex.sub('', i), array) - iterates over the array items and removes the matches with re.sub
  • filter(None, ...) removes the empty items resulting from the substitution when the pattern matches the whole string.
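If you find map and filter with a lambda hard to read, the same two steps can be sketched with list comprehensions. This is just an alternative spelling of the approach above, not a different technique; the names stripped and result are illustrative.

```python
import re

array = ['AB.01.001 Hello', 'BA.10.004', 'CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\s*')

# Step 1: remove the code (and any whitespace after it) from every item.
stripped = [regex.sub('', item) for item in array]

# Step 2: drop items that became empty because the pattern matched the whole string.
result = [s for s in stripped if s]

print(result)
```

Unlike filter, which returns a lazy iterator in Python 3, the comprehension builds the list directly, so no list(...) call is needed.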