I have a few columns in my base file that hold numerical data with commas (e.g. the number is stored as '4,200' and hence is not read as a number). To be able to process the data, I need to remove these commas from multiple columns.

import pandas as pd
import numpy as np
df = {'INR': ['4,200','5,000',0,'4,353','6,000',1],
'USD':['4,100','3,000','1,000','4,353','6,000',1]}
df = pd.DataFrame(df)

If I write the following line of code it works:

df['INR']=df['INR'].replace(',','').astype(int)

But the following line of code doesn't:

df[['INR','USD']]=df[['INR','USD']].replace(',','').astype(int)

It would be great if someone could help me understand why.

    Use df[['INR','USD']].replace(',', '', regex=True) Commented Aug 22, 2020 at 7:43

1 Answer

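df[['INR','USD']] is a pd.DataFrame. Its .replace method (like Series.replace) matches whole cell values by default, so replace(',', '') leaves '4,200' unchanged and astype(int) then cannot parse it. Substring replacement is done through the .str accessor, which only exists on a pd.Series: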

df['INR'].astype(str).str.replace(",", "").astype(int)

Don't forget the .str.
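
To make the difference concrete, here is a minimal sketch on a small Series (the variable names are only illustrative). It compares the default whole-value replace, replace with regex=True, and the .str accessor:

```python
import pandas as pd

s = pd.Series(['4,200', '5,000', 0, '4,353'])

# Default replace: only cells that are exactly ',' would be changed,
# so every value stays as it was.
whole_value = s.replace(',', '')

# regex=True: the pattern is matched inside each string value.
substring = s.replace(',', '', regex=True)

# .str.replace also works on substrings, but needs string values,
# hence the astype(str) before it.
via_str = s.astype(str).str.replace(',', '', regex=False).astype(int)
```

Only the last two variants actually strip the commas, and only via_str ends up as integers.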

In your case, use the applymap method like this (note that applymap is deprecated in favor of DataFrame.map since pandas 2.1):

df[['INR','USD']] = df[['INR','USD']].applymap(lambda x: int(str(x).replace(",","")))

You can find more information about applymap here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.applymap.html
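
If you would rather not depend on applymap, a column-wise variant is possible with DataFrame.apply, which passes each column to the function as a Series. This is only a sketch; strip_commas is a hypothetical helper name, not a pandas function:

```python
import pandas as pd

df = pd.DataFrame({'INR': ['4,200', '5,000', 0, '4,353', '6,000', 1],
                   'USD': ['4,100', '3,000', '1,000', '4,353', '6,000', 1]})

def strip_commas(col: pd.Series) -> pd.Series:
    # Cast to str so the .str accessor also accepts the plain ints,
    # remove the commas literally, then convert to int.
    return col.astype(str).str.replace(',', '', regex=False).astype(int)

cols = ['INR', 'USD']
df[cols] = df[cols].apply(strip_commas)
```

Working per column keeps the string handling vectorised instead of calling a Python lambda for every cell.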

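Shortcut method using a regex replacement on the whole DataFrame (I prefer the explicit version, because it is clearer what it is doing). The replace only removes the commas, so astype(int) is still needed to get integers:

df[['INR','USD']] = df[['INR','USD']].replace(',', '', regex=True).astype(int)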
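
If the base file is a CSV (an assumption, the question does not say), the commas can also be handled at load time with the thousands parameter of pd.read_csv. The sketch below uses an in-memory string with made-up rows in place of the real file:

```python
import io
import pandas as pd

# Stand-in for the real file; values with commas must be quoted in CSV.
csv_text = 'INR,USD\n"4,200","4,100"\n0,"1,000"\n1,1\n'

# thousands=',' tells the parser to treat ',' inside numbers as a separator.
df = pd.read_csv(io.StringIO(csv_text), thousands=',')
```

With this, the columns arrive as integers and no clean-up step is needed afterwards.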