
I'm getting "OverflowError: long int too large to convert" when putting a table directly into a pandas DataFrame. This seems to be caused by the very large numbers in the table, though the same code runs without error on https://www.pythonanywhere.com/try-ipython/. As a workaround, I convert the values to float and THEN create the DataFrame:

import pandas as pd
table = [{'two': 2, 'one': 1}, {'two': 22, 'one': 11},
         {'two': 222, 'one': 1111111111111111111111111111111111111111111111111111111111111111111111}]

# workaround for the overflow error: cast the big ints to float
for row in table:
    row['one'] = float(row['one'])

df = pd.DataFrame(table)

Is there a better way to do this? Others have pointed out that they do not get an overflow error. This is Python 2.7.

1 Answer


By default, pandas inspects your data and converts it to what it considers an appropriate dtype. In your case it tried to store the values in a fixed-width numeric column (np.int64), but your third value is far too large to fit in 64 bits, hence the OverflowError.

One workaround is to specify dtype=object when creating the DataFrame, which keeps the values as arbitrary-precision Python ints:

df = pd.DataFrame(table, dtype='object')
df
                                                 one  two
0                                                  1    2
1                                                 11   22
2  1111111111111111111111111111111111111111111111...  222 

Note that doing this sacrifices speed and memory efficiency, since object columns are much slower to work with than native numeric dtypes. I assume you are prepared for that, given the nature of your data.
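Since dtype in the constructor applies to the whole frame, a possible compromise (a sketch, not an official API; the value 10**70 here just stands in for the long literal in the question) is to load everything as object and then convert back only the columns whose values fit a fast numeric dtype:

```python
import pandas as pd

big = 10**70  # placeholder for the oversized integer from the question

table = [{'two': 2, 'one': 1}, {'two': 22, 'one': 11},
         {'two': 222, 'one': big}]

# Load the whole frame as object so nothing is coerced to int64...
df = pd.DataFrame(table, dtype='object')

# ...then convert back only the columns with small values,
# keeping the arbitrary-precision column as object.
df['two'] = df['two'].astype('int64')
```

This keeps 'two' fast to operate on while 'one' retains full precision as Python ints.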


1 Comment

Thanks! Ideally we could specify dtype for specific columns only, but that still appears to be an open issue: github.com/pandas-dev/pandas/issues/4464
