I am having trouble writing a regex that removes from a string every character other than a-z, A-Z, 0-9 and |.

So far, I have:

 re.sub(r'[\W_]+', '', s)

This works fine for alphanumeric characters, but it also removes the pipe ("|"), which I need to keep. Does anyone know how to do this?

For example, if I have:

name&|surna.me|ag,e

I need the result to be:

name|surname|age
1 Answer

"anything different from a-z, A-Z, 0-9 and |"

You need a negated character class:

s = re.sub(r'[^a-zA-Z0-9|]+', '', s)

A [ starts the character class, ^ tells the regex engine that it should match anything other than what is defined inside the class, and the rest are your ranges/chars. ] closes the class and + makes it match 1 or more occurrences of the required chars.
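
To see what the negated class actually matches before anything is removed, a small sketch with re.findall can help. The variable name removed is only illustrative; the input string is the one from the question.

```python
import re

s = 'name&|surna.me|ag,e'

# Each match is a run of one or more characters that are NOT
# a-z, A-Z, 0-9 or the pipe, i.e. exactly what re.sub would delete.
removed = re.findall(r'[^a-zA-Z0-9|]+', s)
print(removed)
# => ['&', '.', ',']
```

Inside a character class the pipe is a literal character, not alternation, so it does not need escaping.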

Here is a Python demo:

import re
s = 'name&|surna.me|ag,e'
s = re.sub(r'[^a-zA-Z0-9|]+', '', s)
print(s)
# => name|surname|age
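
One difference from the original [\W_]+ pattern is worth noting: with str patterns in Python 3, \w is Unicode-aware, so [\W_]+ keeps accented letters, while the explicit a-zA-Z0-9 ranges remove them. If keeping such letters matters, a possible variant (an assumption about the requirements, not something the question asks for) is to negate \w together with the pipe and handle the underscore separately. The sample string below is made up for illustration.

```python
import re

# Hypothetical input containing non-ASCII letters (é, ï), an underscore and "!".
s = 'caf\u00e9_1|na\u00efve!'

# ASCII-only: everything except a-z, A-Z, 0-9 and | is removed.
ascii_only = re.sub(r'[^a-zA-Z0-9|]+', '', s)

# Unicode-aware: keep any word character and |, but still drop "_",
# which \w would otherwise keep.
unicode_aware = re.sub(r'(?:[^\w|]|_)+', '', s)

print(ascii_only)     # accented letters removed
print(unicode_aware)  # accented letters kept
```

For the ASCII-only requirement stated in the question, the explicit ranges remain the simpler choice.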