
I want to convert all the string values in a Pandas DataFrame to float. I could write a short function to do this, but that doesn't feel like the Pythonic way. My DataFrame looks like this:

>>> df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
>>> df
   0  1  2
0  1  2  3
1  4  5  6
>>> df.dtypes
0    object
1    object
2    object
dtype: object
>>> type(df[0][0])
<type 'str'>

Is there a built-in Pandas DataFrame function that converts all the string values to float? If it is covered in the Pandas docs, please post the link.

2 Answers


Assuming all values can be correctly converted to float, you can use the DataFrame.astype() function to convert the whole DataFrame to float. Example -

df = df.astype(float)

Demo -

In [5]: df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))

In [6]: df.astype(float)
Out[6]:
   0  1  2
0  1  2  3
1  4  5  6

In [7]: df = df.astype(float)

In [8]: df.dtypes
Out[8]:
0    float64
1    float64
2    float64
dtype: object

The .astype() function also has a raise_on_error argument (which defaults to True); you can set it to False to make it ignore conversion errors. In that case, the original values are kept in the DataFrame. (Since pandas 0.20.0 this argument has been replaced by errors.) -

In [10]: df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'],['blah','bloh','bleh']])

In [11]: df.astype(float,raise_on_error=False)
Out[11]:
      0     1     2
0     1     2     3
1     4     5     6
2  blah  bloh  bleh
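
On pandas 0.20.0 and later the same behaviour is requested through the errors keyword rather than raise_on_error. A minimal sketch, assuming a version of pandas that accepts errors='ignore' in astype(); the name unchanged is only illustrative:

```python
import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'], ['blah', 'bloh', 'bleh']])

# errors='ignore' takes the place of raise_on_error=False:
# if the conversion fails, the original data is returned
# instead of a ValueError being raised.
unchanged = df.astype(float, errors='ignore')
print(unchanged.dtypes)
```

Because every column here contains a non-numeric string, nothing is converted and the result has the same values as the input.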

To convert just a single Series/column to float, again assuming all values can be converted, you can use Series.astype(). Example -

df['somecol'] = df['somecol'].astype(<type>)
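
As a concrete illustration of the single-column case, here is a small sketch with float as the target type. The column names somecol and other and the sample values are made up for the example:

```python
import pandas as pd

# Hypothetical frame: one numeric-looking string column, one text column.
df = pd.DataFrame({'somecol': ['1.5', '2', '3'], 'other': ['a', 'b', 'c']})

# Convert only 'somecol'; 'other' is left untouched.
df['somecol'] = df['somecol'].astype(float)
print(df.dtypes)
```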

1 Comment

Since version 0.20.0, raise_on_error has been replaced by errors.

Another option is to use df.convert_objects(convert_numeric=True). It attempts to convert numeric strings to numbers, with unconvertible values becoming NaN:

import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])
df = df.convert_objects(convert_numeric=True)
print(df)

yields

    0   1   2
0   1   2   3
1   4   5 NaN
2 NaN NaN NaN

In contrast, df.astype(float) would raise ValueError: could not convert string to float: quux, since some strings in the above DataFrame (such as 'quux') are not numeric.

Note: in future versions of pandas (after 0.16.2) the function argument will be numeric=True instead of convert_numeric=True.
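
In later pandas releases convert_objects was deprecated (starting with 0.17.0) and eventually removed, with pd.to_numeric suggested as the replacement for numeric conversion. A minimal sketch of the same coercion on such versions, applying pd.to_numeric column by column; the name converted is only illustrative:

```python
import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])

# DataFrame.apply calls pd.to_numeric once per column and forwards
# errors='coerce', so unparseable strings become NaN.
converted = df.apply(pd.to_numeric, errors='coerce')
print(converted)
```

As with convert_objects, the four non-numeric strings end up as NaN, so the columns come back as floating-point.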
