I am trying to read the IMF statistics into a pandas dataframe:

import pandas as pd
df = pd.read_table("http://www.imf.org/external/pubs/ft/weo/2013/02/weodata/WEOOct2013all.xls",
                   na_values=['n/a','--'],thousands=',')

All the columns, except one, have dtype object:

In [5]: df
Out[5]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8318 entries, 0 to 8317
Data columns (total 49 columns):
...
dtypes: float64(1), object(48)

I inspected the file manually and, in most columns, could not find any value that is neither numeric nor one of the NaN markers passed in na_values.

I am using Python 2.7.5, numpy 1.7.1, pandas 0.11.0 on Anaconda 1.5.0 on Wakari.io.

  • Try writing code that iterates over the values in each column and calls float on each, and see where (or whether) an exception is raised. Commented Dec 18, 2013 at 19:51
  • I can't, because the values contain thousands separators. read_table should remove them (via its thousands argument). Commented Dec 18, 2013 at 19:57
  • Okay, then write code that removes the thousands separators and then calls float. Alternatively, trim down your data file gradually until it starts working, then zero in on the difference that made it stop working. Commented Dec 18, 2013 at 19:59
  • IIRC this is a bug that was fixed in 0.12, though I can't remember the reference at the moment. Commented Dec 18, 2013 at 20:02
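
The check suggested in the comments above (strip the thousands separators, then call float on each value) could be sketched roughly as follows. This is only an illustration: the helper name find_non_numeric and the sample values are hypothetical, and the NaN markers are assumed to be the ones passed to na_values in the question.

```python
def find_non_numeric(values, thousands=",", na_values=("n/a", "--", "")):
    """Return (position, value) pairs that cannot be parsed as float.

    Hypothetical helper: strips the thousands separator first and
    skips the markers that are meant to be read as missing values.
    """
    bad = []
    for i, raw in enumerate(values):
        text = str(raw).strip()
        if text in na_values:
            continue  # treated as missing, not as an error
        try:
            float(text.replace(thousands, ""))
        except ValueError:
            bad.append((i, raw))
    return bad


# Made-up sample column, not taken from the IMF file.
sample = ["1,234.5", "n/a", "--", "42", "12 345", "abc"]
print(find_non_numeric(sample))
```

Running this over each column of the raw file would point at the first value, if any, that prevents numeric conversion.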

1 Answer

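As Jeff mentioned in the comments, this was a bug in pandas <= 0.12 that is fixed in 0.13. On a 0.13 development build, a column containing a thousands separator is parsed as numeric: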

In [10]: from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

In [11]: s = '''A;B
1;2,000
3;4'''

In [12]: pd.read_csv(StringIO(s), sep=';', thousands=',')
Out[12]: 
   A     B
0  1  2000
1  3     4

[2 rows x 2 columns]

In [13]: pd.version.version
Out[13]: '0.13.0rc1-82-g66934c2'
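
If some columns still come back as object, one possible fallback is to read the values as strings and convert them afterwards. The sketch below is an assumption rather than part of the original answer: it presumes a reasonably recent pandas that provides pd.to_numeric, the helper name to_number is hypothetical, and the sample data is made up.

```python
from io import StringIO

import pandas as pd

# Made-up data in the same shape as the example above, plus an "n/a" marker.
raw = "A;B\n1;2,000\n3;4\n5;n/a"

# Read everything as text so nothing is converted implicitly.
df = pd.read_csv(StringIO(raw), sep=";", dtype=str, keep_default_na=False)


def to_number(column, thousands=","):
    """Strip the thousands separator, then convert to numbers.

    errors="coerce" turns anything unparseable (here "n/a") into NaN,
    which also hides genuinely malformed values, so use it with care.
    """
    cleaned = column.str.replace(thousands, "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")


df["B"] = to_number(df["B"])
print(df.dtypes)
```

Because coercion silently turns bad values into NaN, it is worth comparing the count of missing values before and after the conversion.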