After reading my CSV file with read_csv() in pandas, I want to convert some of the columns to float64 for further processing, since they are currently read as object dtype. When I pass the dtype argument to read_csv, I get a ValueError. Here is the code:
import pandas as pd
file_ = pd.read_csv("/home/rahul/yearly_data_no_ecb.csv", dtype = {"DAX":"float64"})
Following is the full trace for the error:
ValueError Traceback (most recent call last)
<ipython-input-14-554c18573267> in <module>()
----> 1 file_ = pd.read_csv("/home/rahul/yearly_data_no_ecb.csv", dtype = {"DAX":"float64"})
2 #file1 = pd.to_numeric(file_)
3 file_.values
4 file_.dtypes
/home/rahul/anaconda/lib/python2.7/site-packages/pandas/io/parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
703 skip_blank_lines=skip_blank_lines)
704
--> 705 return _read(filepath_or_buffer, kwds)
706
707 parser_f.__name__ = name
/home/rahul/anaconda/lib/python2.7/site-packages/pandas/io/parsers.pyc in _read(filepath_or_buffer, kwds)
449
450 try:
--> 451 data = parser.read(nrows)
452 finally:
453 parser.close()
/home/rahul/anaconda/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
1063 raise ValueError('skipfooter not supported for iteration')
1064
-> 1065 ret = self._engine.read(nrows)
1066
1067 if self.options.get('as_recarray'):
/home/rahul/anaconda/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
1826 def read(self, nrows=None):
1827 try:
-> 1828 data = self._reader.read(nrows)
1829 except StopIteration:
1830 if self._first_chunk:
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()
ValueError: invalid literal for float(): 11,535,309,570.00
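The literal in the last line of the traceback, 11,535,309,570.00, contains comma thousands separators, which is presumably why the float conversion fails. As a minimal sketch, assuming the affected columns all use commas this way, read_csv accepts a thousands argument. The inline data below is made up for illustration and is not the contents of the real file:

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for the real CSV file.
sample = io.StringIO(
    'Year,DAX\n'
    '2015,"11,535,309,570.00"\n'
    '2016,"10,743,010,000.50"\n'
)

# thousands="," asks the parser to ignore the separators when it
# infers numeric columns, so DAX should come back as float64.
df = pd.read_csv(sample, thousands=",")
print(df.dtypes)
```

Whether this works on the full file depends on every value in those columns following the same format.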
How do I convert the dtype of the columns which have numeric data to float64?
If I only read in the csv, and check the dtype of the columns,
file_ = pd.read_csv("/home/rahul/yearly_data_no_ecb.csv")
file_.dtypes
I get this:
Year                                                   int64
City                                                   object
Return office city center                              float64
Average return logistics                               float64
Inverse return houses                                  float64
DAX                                                    object
MFI Interest Rate Germany                              float64
Inflation Rate                                         float64
GDP (EUR)                                              object
Size of City (km square)                               object
Total Population (Number)                              object
Population under 15 (Number)                           object
Population 15 to under 65 (Number)                     object
Population above 65 (Number)                           object
Total private households (Number)                      object
1 Person households (Number)                           object
2 Person households (Number)                           object
3 Person households (Number)                           object
4 Person households (Number)                           object
5 and more person households (Number)                  object
Total unemployment rate (Rate)                         float64
Total employment (Number)                              object
Available income per inhabitant (Eur)                  object
Total residential building (Number)                    object
Total Apartments (Number)                              object
Total new residential building approvals (Number)      object
Total new residential building completions (Number)    object
Total Migration                                        object
Returns                                                float64
Class                                                  float64
dtype: object
Basically, I want to convert the following columns to float64: DAX, everything from GDP (EUR) through 5 and more person households (Number), and everything from Total employment (Number) through Total Migration.
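A sketch of one way to do that after the file has been read, assuming the object columns hold strings with comma thousands separators. The small frame below is made up and uses only a few of the column names listed above; the label slices rely on the columns appearing in the order shown in the dtypes output.

```python
import pandas as pd

# Hypothetical miniature frame; the real one has many more columns.
df = pd.DataFrame({
    "Year": [2015, 2016],
    "DAX": ["11,535,309,570.00", "10,743,010,000.50"],
    "GDP (EUR)": ["3,000,000.00", "3,100,000.00"],
    "5 and more person households (Number)": ["1,200", "1,250"],
    "Total unemployment rate (Rate)": [6.4, 6.1],
    "Total employment (Number)": ["43,000", "43,500"],
    "Total Migration": ["1,100", "-500"],
})

# Collect the target columns: DAX plus two label-based ranges.
# Label slices with .loc include both endpoints.
cols = ["DAX"]
cols += list(df.loc[:, "GDP (EUR)":"5 and more person households (Number)"].columns)
cols += list(df.loc[:, "Total employment (Number)":"Total Migration"].columns)

# Strip the separators, then convert each column to float64.
for col in cols:
    cleaned = df[col].str.replace(",", "", regex=False)
    df[col] = pd.to_numeric(cleaned).astype("float64")

print(df.dtypes)
```

If some cells hold text that is not numeric at all, pd.to_numeric will raise. Its errors argument (for example errors="coerce") turns such cells into NaN instead.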
Thanks.