1

I hope you can help me with this question. I have a column with numeric values stored as strings. Since the data comes from different countries, some values use different formats and contain characters such as "," and "$". I'm trying to convert the Series to numbers, but I'm having trouble with the values that contain "," and "$".

import pandas as pd

data={"valores":[1,1,3,"4","5.00","1,000","$5,700"]}
df=pd.DataFrame(data)
df

    valores
0   1
1   1
2   3
3   4
4   5.00
5   1,000
6   $5,700

I've tried the following:

df["valores"].replace(",","")

but it does not change anything, since "," is only part of the string, not the whole cell value.

pd.to_numeric(df["valores"])

But I receive the error "ValueError: Unable to parse string "1,000" at position 5".

valores=[i.replace(",","") for i in df["valores"].values]

But I receive the error "AttributeError: 'int' object has no attribute 'replace'".

So, at last, I tried with this:

valores=[i.replace(",","") for i in df["valores"].values if type(i)==str]
valores
['4', '5.00', '1000', '$5700']

But it skipped the first three values since they are not strings.

I think I could handle it with a regex, but I simply don't understand how to work with them.

I hope you can help me, since I've been struggling with this for about 7 hours.

4 Answers

1

You should convert each value to a string first, for example:

valores=[str(i).replace(",","") for i in df["valores"].values]
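
To finish the conversion, the "$" also has to go and the cleaned strings still need to be turned into numbers. A minimal sketch of that, reusing the data from the question and assuming "," is always a thousands separator (never a decimal comma):

```python
import pandas as pd

data = {"valores": [1, 1, 3, "4", "5.00", "1,000", "$5,700"]}
df = pd.DataFrame(data)

# str(i) makes .replace available on the int entries as well
valores = [str(i).replace(",", "").replace("$", "") for i in df["valores"]]

# Every cleaned string now parses as a number
df["valores"] = pd.to_numeric(valores)
print(df)
```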

1

You can try this:

df['valores'] = df['valores'].replace(to_replace=r'[,$]',value='',regex=True).astype(float)

0

.replace by default matches whole cell values. Since you want to replace part of a string, you need .str.replace or replace(..., regex=True):

df['valores'] = df["valores"].replace(",","", regex=True)

Or:

df['valores'] = df["valores"].astype(str).str.replace(",","")
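
The difference between whole-cell and substring replacement is easier to see on a small example. This is only a sketch; the Series `s` is made up for illustration:

```python
import pandas as pd

# Hypothetical sample: one cell that is exactly ",", one that contains ",", one int
s = pd.Series([",", "1,000", 3])

# Without regex=True, only a cell whose entire value equals "," is replaced
whole = s.replace(",", "")

# With regex=True, every "," inside a string cell is removed; the int is left alone
part = s.replace(",", "", regex=True)

print(whole.tolist())
print(part.tolist())
```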

0

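You need to cast the values in the valores column to string using .astype(str), then remove every $ and , using .str.replace('[,$]', '', regex=True), and then you can convert all the data to numeric using pd.to_numeric. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default:

>>> pd.to_numeric(df["valores"].astype(str).str.replace("[,$]","",regex=True))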
0       1.0
1       1.0
2       3.0
3       4.0
4       5.0
5    1000.0
6    5700.0
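
If the column may also contain entries that are not numbers at all, the same steps can be wrapped in a small helper. This is a sketch under assumptions: `to_number` is a hypothetical name, the sample Series is invented, and "," is assumed to be a thousands separator only. `errors="coerce"` turns anything that still cannot be parsed into NaN instead of raising:

```python
import pandas as pd

def to_number(series: pd.Series) -> pd.Series:
    """Strip ',' and '$' from every value, then parse the result as numbers."""
    # astype(str) lets the .str accessor handle ints and floats too
    cleaned = series.astype(str).str.replace(r"[,$]", "", regex=True)
    # Unparseable leftovers become NaN rather than raising ValueError
    return pd.to_numeric(cleaned, errors="coerce")

# Invented sample data, including one value that is not a number
s = pd.Series([1, "5.00", "1,000", "$5,700", "n/a"])
result = to_number(s)
print(result)
```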
