import pandas as pd
df = pd.read_csv('911.csv')
df['desc'].str.replace('[^a-zA-Z0-9]','').head()
0    REINDEER CT & DEAD END;  NEW HANOVER; Station ...
1    BRIAR PATH & WHITEMARSH LN;  HATFIELD TOWNSHIP...
2    HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-St...
3    AIRY ST & SWEDE ST;  NORRISTOWN; Station 308A;...
4    CHERRYWOOD CT & DEAD END;  LOWER POTTSGROVE; S...
Name: desc, dtype: object

I'm trying to remove all non-alphanumeric characters from the desc column, but as the output above shows, nothing is removed. I tried the same code on other columns, and it doesn't seem to work there either.

  • Keep in mind that [^a-zA-Z0-9] also matches non-ASCII alphabetic characters, so it would turn, for example, Düsseldorf into Dsseldorf. If you want to preserve non-ASCII letters, consider using \w rather than a-zA-Z (note that \w also matches digits and the underscore). Commented Jul 24 at 12:15
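To illustrate the point in the comment, here is a minimal sketch using Python's built-in re module, which pandas relies on for regex replacement. The sample string is made up for illustration and is not taken from 911.csv.

```python
import re

# Hypothetical sample value, chosen to include a non-ASCII letter and an underscore.
name = "Düsseldorf_2015"

# ASCII-only class: the negated set matches "ü" and "_", so both are removed.
ascii_only = re.sub(r"[^a-zA-Z0-9]", "", name)

# \w is Unicode-aware for str patterns in Python 3, so "ü" survives.
# It also keeps the underscore, because "_" counts as a word character.
unicode_aware = re.sub(r"[^\w]", "", name)

# To keep Unicode letters and digits but still drop underscores,
# remove anything that is a non-word character or an underscore.
no_underscore = re.sub(r"[\W_]", "", name)

print(ascii_only)     # Dsseldorf2015
print(unicode_aware)  # Düsseldorf_2015
print(no_underscore)  # Düsseldorf2015
```

The same patterns should behave the same way when passed to Series.str.replace with regex=True.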

1 Answer


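It needs regex=True, that's all. Without it, the pattern is treated as a literal string, so nothing in the column matches and nothing is replaced. (Since pandas 2.0, regex defaults to False in Series.str.replace.) Doc: Series.str.replace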

df['desc'].str.replace('[^a-zA-Z0-9]', '', regex=True)

BTW: this also removes the spaces between words, so you may want to add a space to the character class: '[^a-zA-Z0-9 ]'.
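A minimal sketch of the difference, assuming a reasonably recent pandas. The two sample strings are shortened from the visible part of the question's output and stand in for the real desc column. The final whitespace-collapsing step is an optional extra, not part of the answer above.

```python
import pandas as pd

# Illustrative stand-in for df['desc']; not the full values from 911.csv.
s = pd.Series([
    "REINDEER CT & DEAD END;  NEW HANOVER",
    "HAWS AVE; NORRISTOWN",
])

# regex=True makes pandas treat the pattern as a regular expression.
# Without a space in the class, the words run together.
no_spaces = s.str.replace(r"[^a-zA-Z0-9]", "", regex=True)

# Adding a space to the class keeps the word boundaries.
keep_spaces = s.str.replace(r"[^a-zA-Z0-9 ]", "", regex=True)

# Optional: collapse the runs of spaces left behind by removed punctuation.
tidy = keep_spaces.str.replace(r"\s+", " ", regex=True).str.strip()

print(no_spaces.tolist())
print(keep_spaces.tolist())
print(tidy.tolist())
```

Note that str.replace returns a new Series; assign the result back (for example, df['desc'] = ...) to keep it.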
