How to replace text in a string column of a Pandas dataframe?

Question

I have a column in my dataframe like this:

range
"(2,30)"
"(50,290)"
"(400,1000)"
...

and I want to replace the comma (,) with a dash (-). I'm currently using this method, but nothing changes.

org_info_exc['range'].replace(',', '-', inplace=True)

Can anybody help?

smci · Accepted Answer · 2021-12-22 08:20:22Z

510

Use the vectorised str method replace:

df['range'] = df['range'].str.replace(',','-')

df
      range
0    (2-30)
1  (50-290)

EDIT: so if we look at what you tried and why it didn't work:

df['range'].replace(',','-',inplace=True)

from the docs we see this description:

str or regex: str: string exactly matching to_replace will be replaced with value

Because none of the string values is exactly equal to ',', no replacement occurs. Compare with the following:

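import pandas as pd
df = pd.DataFrame({'range':['(2,30)',',']})
# assign the result back; a chained inplace=True may not update df in recent pandas versions (Copy-on-Write)
df['range'] = df['range'].replace(',','-')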

df['range']

0    (2,30)
1         -
Name: range, dtype: object

Here we get an exact match on the second row, so the replacement occurs there and only there.
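To make the distinction concrete, here is a minimal sketch comparing the three calls side by side. The sample values are made up for illustration; only the exact-match versus substring behaviour is the point.

```python
import pandas as pd

s = pd.Series(["(2,30)", "(50,290)", ","], name="range")

# Series.replace without regex only swaps values that equal "," exactly.
whole_value = s.replace(",", "-")

# Series.str.replace substitutes inside each string; regex=False treats "," literally.
substring = s.str.replace(",", "-", regex=False)

# Series.replace with regex=True also substitutes inside each string.
via_regex = s.replace(",", "-", regex=True)

print(whole_value.tolist())  # ['(2,30)', '(50,290)', '-']
print(substring.tolist())    # ['(2-30)', '(50-290)', '-']
print(via_regex.tolist())    # ['(2-30)', '(50-290)', '-']
```

All three calls return a new Series, so the result has to be assigned back to the column to keep it.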

edited Dec 22, 2021 at 8:20

smci

answered Mar 11, 2015 at 12:22

EdChum

2 Comments

PV8 Over a year ago

I wonder why the warning occurs: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: pandas.pydata.org/pandas-docs/stable/user_guide/…

Adam Burke Over a year ago

It's because of changes to Copy-On-Write semantics in pandas 2, which will be the only copy mechanic in pandas 3 pandas.pydata.org/docs/user_guide/…

wjandrea · Accepted Answer · 2025-05-26 17:08:57Z

134

For anyone else arriving here from a Google search on how to do a string replacement on all columns (for example, if one has multiple columns like the OP's 'range' column): Pandas has a built-in replace method available on a dataframe object. It returns a new dataframe, so assign the result if you want to keep it.

df.replace(',', '-', regex=True)
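A short sketch of the dataframe-wide call, using a made-up frame (the column names label and count are hypothetical). It also shows one thing to watch for: with regex=True the pattern is a regular expression, so characters such as "(" need escaping when they are meant literally.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "range": ["(2,30)", "(50,290)"],
    "label": ["a,b", "c"],
    "count": [1, 2],  # non-string column, left untouched
})

# DataFrame.replace returns a new frame; assign it to keep the result.
replaced = df.replace(",", "-", regex=True)

# Escape regex metacharacters that should be matched literally.
no_parens = df.replace(re.escape("("), "", regex=True)

print(replaced["range"].tolist())   # ['(2-30)', '(50-290)']
print(replaced["label"].tolist())   # ['a-b', 'c']
print(no_parens["range"].tolist())  # ['2,30)', '50,290)']
```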

edited May 26 at 17:08

wjandrea

answered Nov 30, 2018 at 22:31

kevcisme

1 Comment

therobyouknow Over a year ago

Me too. In my example, price_df['Mid-price (p)'].replace(',','',regex=True,inplace=True) worked. Without regex=True, it didn't.

Nancy K · Accepted Answer · 2020-11-03 16:40:25Z

29

If you only need to replace characters in one specific column, and regex=True and inplace=True both failed for some reason, I think this approach will work:

data["column_name"] = data["column_name"].apply(lambda x: x.replace("characters_need_to_replace", "new_characters"))

The lambda here is an anonymous function that apply calls once per entry, so it behaves much like a for loop over the column. x represents each entry in the current column in turn.

The only thing you need to do is to change the "column_name", "characters_need_to_replace" and "new_characters".
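One caveat with this approach is that x.replace only exists on strings, so the lambda raises an error if the column also holds missing values or numbers. A minimal sketch of a guarded version, using made-up data and a hypothetical output column named cleaned:

```python
import pandas as pd

data = pd.DataFrame({"column_name": ["1,5", None, 7]})

# A plain x.replace(...) would fail on None and 7, which are not strings.
# Guard on the type so that only str entries are modified.
data["cleaned"] = data["column_name"].apply(
    lambda x: x.replace(",", "-") if isinstance(x, str) else x
)

print(data["cleaned"].iloc[0])  # 1-5
```

Non-string entries pass through unchanged here; converting them with str(x) instead is another option if everything should end up as text.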

answered Nov 3, 2020 at 16:40

Nancy K

2 Comments

Chris Over a year ago

This is what I ended up doing to replace, but I get the "A value is trying to be set on a copy of a slice from a DataFrame." warning. The resulting dataframe is exactly what I want, but the warning is off-putting. Rather than just turning the warning off, I'm in search of a non-warning-inducing way.

charles Over a year ago

Thanks! You may also need to convert the lambda's argument, for example casting each x to a string with str(x).
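A possible way to avoid the "set on a copy of a slice" warning mentioned in the comments, sketched under the assumption that the frame being modified was itself produced by filtering another frame (a common trigger, though not the only one). The names raw, keep, and subset are hypothetical.

```python
import pandas as pd

raw = pd.DataFrame({
    "range": ["(2,30)", "(50,290)", "(400,1000)"],
    "keep": [True, False, True],
})

# Taking an explicit copy makes `subset` an independent frame, so the
# assignment below does not target a slice of `raw`.
subset = raw[raw["keep"]].copy()
subset["range"] = subset["range"].str.replace(",", "-", regex=False)

print(subset["range"].tolist())  # ['(2-30)', '(400-1000)']
print(raw["range"].tolist())     # ['(2,30)', '(50,290)', '(400,1000)']
```

The original frame is left untouched, which is usually what is wanted when working on a filtered subset.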

adiga · Accepted Answer · 2019-06-20 06:28:11Z

10

Replace all spaces with underscores in the column names:

data.columns= data.columns.str.replace(' ','_',regex=True)
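A minimal runnable sketch of the line above, which operates on spaces in the column labels. The column names are made up, and regex=False is used here because a space needs no pattern matching.

```python
import pandas as pd

data = pd.DataFrame({"first name": [1], "last name": [2]})

# Index.str.replace works on the column labels rather than on the cell values.
data.columns = data.columns.str.replace(" ", "_", regex=False)

print(list(data.columns))  # ['first_name', 'last_name']
```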

edited Jun 20, 2019 at 6:28

adiga

answered Jun 20, 2019 at 6:15

Rameez Ahmad

cdutra · Accepted Answer · 2020-07-28 17:46:05Z

8

In addition, for those looking to replace more than one character in a column, you can do it using regular expressions:

import re

chars_to_remove = ['.', '-', '(', ')']
# re.escape keeps each character literal inside the [...] character class
regular_expression = '[' + re.escape(''.join(chars_to_remove)) + ']'

df['string_col'] = df['string_col'].str.replace(regular_expression, '', regex=True)
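For illustration, the same idea applied to made-up phone-number strings. Here a space is added to the list of characters to strip, which is an assumption about the data rather than part of the answer above.

```python
import re
import pandas as pd

df = pd.DataFrame({"string_col": ["(555) 123-4567", "555.123.4567"]})

chars_to_remove = [".", "-", "(", ")", " "]
# Build a character class such as [\.\-\(\)\ ] that matches any one of them.
pattern = "[" + re.escape("".join(chars_to_remove)) + "]"

df["string_col"] = df["string_col"].str.replace(pattern, "", regex=True)

print(df["string_col"].tolist())  # ['5551234567', '5551234567']
```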

answered Jul 28, 2020 at 17:46

cdutra

Freddie · Accepted Answer · 2021-10-01 08:58:36Z

5

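Similar to the answer by Nancy K, this works for me when the selection is a DataFrame (note the double brackets). apply on a DataFrame passes each column to the lambda as a Series, so x.str.replace is available; on a single column, each x would be a plain string with no .str accessor:

data[["column_name"]] = data[["column_name"]].apply(lambda x: x.str.replace("characters_need_to_replace", "new_characters"))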

answered Oct 1, 2021 at 8:58

Freddie

smci · Accepted Answer · 2021-12-22 08:14:49Z

2

If you want to remove two or more characters from a string, for example the characters '$' and ',':

Column_Name
===========
$100,000
$1,100,000

... then use:

data.Column_Name.str.replace("[$,]", "", regex=True)

=> [ 100000, 1100000 ]
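A small runnable sketch of the same call on the sample data. Note that the values remain strings after the replacement; the conversion with pd.to_numeric is an extra step added here for illustration, in case numbers are wanted.

```python
import pandas as pd

data = pd.DataFrame({"Column_Name": ["$100,000", "$1,100,000"]})

# Inside [...] the "$" is literal, so "[$,]" matches either character.
stripped = data["Column_Name"].str.replace("[$,]", "", regex=True)

# The result is still text; convert explicitly if numbers are needed.
amounts = pd.to_numeric(stripped)

print(stripped.tolist())  # ['100000', '1100000']
print(amounts.tolist())   # [100000, 1100000]
```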

edited Dec 22, 2021 at 8:14

smci

answered Aug 4, 2021 at 17:55

Leandro Marotti
