Converting string variable with double commas into float?

Question

I have a column of strings that use commas as both the thousands separator and the decimal separator, and I need to convert these strings to floats. How can I do it?

I first tried replacing all the commas with dots:

df['min'] = df['min'].str.replace(',', '.')

and then tried converting the column to float:

df['min']= df['min'].astype(float)

but it raised the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-29-5716d326493c> in <module>
----> 1 df['min']= df['min'].astype(float)
      2 #df['mcom']= df['mcom'].astype(float)
      3 #df['max']= df['max'].astype(float)

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5544         else:
   5545             # else, only a single dtype is given
-> 5546             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors,)
   5547             return self._constructor(new_data).__finalize__(self, method="astype")
   5548 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    593         self, dtype, copy: bool = False, errors: str = "raise"
    594     ) -> "BlockManager":
--> 595         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    596 
    597     def convert(

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, align_keys, **kwargs)
    404                 applied = b.apply(f, **kwargs)
    405             else:
--> 406                 applied = getattr(b, f)(**kwargs)
    407             result_blocks = _extend_blocks(applied, result_blocks)
    408 

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    593             vals1d = values.ravel()
    594             try:
--> 595                 values = astype_nansafe(vals1d, dtype, copy=True)
    596             except (ValueError, TypeError):
    597                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    993     if copy or is_object_dtype(arr) or is_object_dtype(dtype):
    994         # Explicit copy, or required since NumPy can't view from / to object.
--> 995         return arr.astype(dtype, copy=True)
    996 
    997     return arr.view(dtype)

ValueError: could not convert string to float: '1.199.75'

If possible, I would like to remove all dots and commas, then insert a dot before the last two characters of each value before converting to float.

Input:

df['min'].head()

Expected output:

So you want to remove all dots and add a dot before the last two characters?
– demetere._, Commented Jun 15, 2022 at 13:21
Can you please add an example input and expected output to assist in answering?
– user157545, Commented Jun 15, 2022 at 13:24
@mozway the dataframe originally uses commas as both thousands and decimal separators; this command didn't work.
– Vinícius Franca, Commented Jun 15, 2022 at 13:25

mozway · Accepted Answer · 2022-06-15 13:32:33Z

1

If you always have 2 decimal digits:

df['min'] = pd.to_numeric(df['min'].str.replace('.', '', regex=False)).div(100)

output (as new column min2 for clarity):

        min     min2
0      9.50     9.50
1     10.00    10.00
2      3.45     3.45
3  1.095.50  1095.50
4     13.25    13.25
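
A self-contained sketch of the same idea, for readers who want to run it. The sample values are hypothetical and assume the column still holds the original comma-separated strings, so the comma is stripped instead of the dot. As in the answer, this only holds if every value has exactly two decimal digits.

```python
import pandas as pd

# Hypothetical sample data: commas act as both thousands and decimal separators.
df = pd.DataFrame({"min": ["9,50", "10,00", "3,45", "1,095,50", "13,25"]})

# Remove every comma so only digits remain, e.g. "1,095,50" -> "109550".
digits = df["min"].str.replace(",", "", regex=False)

# Parse the digit strings as numbers, then divide by 100 to restore two decimals.
df["min2"] = pd.to_numeric(digits).div(100)

print(df)
```

If the commas have already been replaced with dots, as in the question, swap `","` for `"."` in the `str.replace` call.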

answered Jun 15, 2022 at 13:32

mozway


Sunderam Dubey · Accepted Answer · 2022-06-16 02:51:02Z

1

Try this:

df['min'] = df['min'].str.replace(',', '')
df['min'] = df['min'].str[:-2] + '.' + df['min'].str[-2:]

df['min']= df['min'].astype(float)
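
The same steps as a runnable sketch on hypothetical sample values. It assumes each string ends in exactly two decimal digits, and it passes `regex=False` so the comma is treated as a literal character regardless of pandas version.

```python
import pandas as pd

# Hypothetical sample values using commas for both thousands and decimals.
df = pd.DataFrame({"min": ["9,50", "1,199,75"]})

# Drop all commas, leaving only digits: "1,199,75" -> "119975".
digits = df["min"].str.replace(",", "", regex=False)

# Re-insert a dot before the last two characters, then convert to float.
df["min"] = (digits.str[:-2] + "." + digits.str[-2:]).astype(float)

print(df)
```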

edited Jun 16, 2022 at 2:51

Sunderam Dubey


answered Jun 15, 2022 at 13:27

demetere._


Léa Gris · Accepted Answer · 2022-06-15 14:03:42Z

0

I have some strings in a column which originally uses commas as separators from thousands and from decimals and I need to convert this string into a float

So let's produce a reproducible data source that conforms to your description:

df = {'min': '0123,456,78'}

Then split this on "," into a list:

split_str = df['min'].split(',')

Collect the integer and decimal parts separately:

int_str = ''.join(split_str[:-1])
dec_str = split_str[-1]

And finally, reconstruct a valid float string and convert it to an actual float:

float_number = float(f"{int_str}.{dec_str}")
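
The steps above work on a single string stored in a dict. One possible way to apply them to a whole pandas column is to wrap them in a function and map it over the values. The function name `comma_string_to_float` and the sample data are hypothetical, and the sketch assumes every value contains at least one comma (a string with no comma would be read as a pure fraction).

```python
import pandas as pd


def comma_string_to_float(text: str) -> float:
    """Treat the last comma as the decimal mark and drop all earlier commas."""
    # Everything before the last comma is the integer part; the rest is decimals.
    *int_parts, dec_str = text.split(",")
    int_str = "".join(int_parts)
    return float(f"{int_str}.{dec_str}")


# Hypothetical sample data matching the description in the question.
df = pd.DataFrame({"min": ["0123,456,78", "9,50"]})

# Apply the conversion to each string in the column.
df["min"] = df["min"].map(comma_string_to_float)

print(df)
```

Unlike the divide-by-100 approach, this does not depend on the number of decimal digits, at the cost of a Python-level call per row.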

edited Jun 15, 2022 at 14:03

answered Jun 15, 2022 at 13:53

Léa Gris
