Say I've got a column in my pandas DataFrame that looks like this:

s = pd.Series(["ab-cd.", "abc", "abc-def/", "ab.cde", "abcd-"])

I would like to use this column for fuzzy matching, so I want to remove the characters '.', '/' and '-', but only at the end of each string, so that it looks like this:

s = pd.Series(["ab-cd", "abc", "abc-def", "ab.cde", "abcd"])

To start simple, instead of building a list of the characters I want removed, I just repeated the same commands for each character, like this:

if s.str[-1] == '.':
  s.str[-1].replace('.', '')

But this simply produces an error. How do I get the result I want, that is, strings with those characters removed only from the end (the same characters elsewhere in the string need to be preserved)?

4 Answers

Series.replace with a regex gives you the desired output:

s.replace(r'[./-]$','',regex=True)

or, if you are looking for an alternative, use apply:

s.apply(lambda x: x[:-1] if x.endswith(('.', '-', '/')) else x)
0      ab-cd
1        abc
2    abc-def
3     ab.cde
4       abcd
dtype: object
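
As a quick check, the two approaches above can be run side by side. This is a minimal sketch on the sample data from the question; the variable names by_regex and by_apply are only illustrative.

```python
import pandas as pd

s = pd.Series(["ab-cd.", "abc", "abc-def/", "ab.cde", "abcd-"])

# Regex approach: remove one '.', '/' or '-' only when it is the last character.
by_regex = s.replace(r"[./-]$", "", regex=True)

# Plain-Python approach: str.endswith accepts a tuple of suffixes,
# so a single call covers all three characters (and empty strings are safe).
by_apply = s.apply(lambda x: x[:-1] if x.endswith((".", "/", "-")) else x)

print(by_regex.tolist())  # ['ab-cd', 'abc', 'abc-def', 'ab.cde', 'abcd']
print(by_regex.equals(by_apply))  # True
```

Note that a condition such as x[-1] is '.' or '-' or '/' is always truthy, because the non-empty strings '-' and '/' are evaluated on their own; that is why the sketch uses endswith with a tuple.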
1 Comment

Glad to help, @MichielV. If my answer was helpful, don't forget to accept it: click on the check mark (✓) beside the answer to toggle it from greyed out to filled in. Happy learning.
You can use str.replace with a regex:

>>> s = pd.Series(["ab-cd.", "abc", "abc-def/", "ab.cde", "abcd-"])
>>> s.str.replace(r"\.$|/$|-$", "", regex=True)  # regex=True is required in pandas 2.0+
0      ab-cd
1        abc
2    abc-def
3     ab.cde
4       abcd
dtype: object
>>> 

which can be reduced to this:

>>> s.str.replace(r"[./-]$", "", regex=True)
0      ab-cd
1        abc
2    abc-def
3     ab.cde
4       abcd
dtype: object
>>> 

1 Comment

Many thanks to you MedAli, I can now continue my project!
You can use str.replace with a regular expression

s.str.replace(r'[./-]$', '', regex=True)

Put any characters you want to remove inside the brackets [./-]; keep the hyphen last (or escape it) so it is not read as a range. $ means the match must be at the end of the string.

To modify the Series in place, use Series.replace with inplace=True:

s.replace(r'[./-]$','', inplace=True, regex=True)
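
Since the question mentions eventually working from a list of characters, the character class can also be built programmatically. The following is only a sketch: strip_trailing_char is a hypothetical helper name, not part of pandas, and it assumes the list of characters is non-empty.

```python
import re

import pandas as pd


def strip_trailing_char(series, chars):
    """Remove one trailing character if it is in `chars` (hypothetical helper)."""
    # re.escape keeps characters such as '-', ']' or '^' literal inside the brackets.
    char_class = "[" + "".join(re.escape(c) for c in chars) + "]"
    # '$' anchors the match at the end of each string.
    return series.str.replace(char_class + "$", "", regex=True)


s = pd.Series(["ab-cd.", "abc", "abc-def/", "ab.cde", "abcd-"])
cleaned = strip_trailing_char(s, [".", "/", "-"])
print(cleaned.tolist())  # ['ab-cd', 'abc', 'abc-def', 'ab.cde', 'abcd']
```

Escaping each character means the order of the list does not matter, so the hyphen does not have to come last.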

1 Comment

Thank you for your quick reply, this solved my problem right away!
I was able to remove characters from the end of strings in a column in a pandas DataFrame with the following line of code:

s.replace(r'[./-]$','',regex=True)

where the characters between the brackets ([./-]) are the ones to be removed, and $ indicates that they should be removed only at the end of the string.
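
The pattern [./-]$ removes exactly one trailing character. If the data might end in several of these characters (an assumption; the sample in the question does not), a + quantifier or str.rstrip removes the whole trailing run. A small sketch with made-up sample values:

```python
import pandas as pd

# Made-up values with more than one trailing character, for illustration only.
s = pd.Series(["abc-/", "ab.cd..", "abc"])

one = s.str.replace(r"[./-]$", "", regex=True)   # removes only the final character
run = s.str.replace(r"[./-]+$", "", regex=True)  # removes the whole trailing run
stripped = s.str.rstrip("./-")                   # same result as `run`, without a regex

print(one.tolist())       # ['abc-', 'ab.cd.', 'abc']
print(run.tolist())       # ['abc', 'ab.cd', 'abc']
print(stripped.tolist())  # ['abc', 'ab.cd', 'abc']
```

In all three cases the same characters elsewhere in the string are left untouched.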
