
I am trying to read a CSV file using pandas. This is how the data looks in the CSV file:

Freq Level
2412 -84
2412 -85 
2412 -90
2412 -83
2412 -83

Here is my code:

import pandas as pd

x_data = pd.read_csv(data_path, encoding='utf7', dtype=float)
print(x_data)

Then I get the error "cannot safely convert passed user dtype of float64 for object dtyped data":

~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in read(self, nrows)
   2057     def read(self, nrows=None):
   2058         try:
-> 2059             data = self._reader.read(nrows)
   2060         except StopIteration:
   2061             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

ValueError: cannot safely convert passed user dtype of float64 for object dtyped data in column 1

But if I drop dtype=float from the code:

import pandas as pd

x_data = pd.read_csv(data_path, encoding='utf7')
print(x_data)

then I get the data, but with quotes (' ') around the values in the third column:

[[37710 2432 '-72']
 [931 2412 '-73']
 [10936 2412 '-66']
 ...
 [48037 2437 '-73']
 [84317 2467 '-67']
 [33201 2427 '-79']]

I guess the ' ' is what causes the error, because it means the third column is read as object (str) data that the pandas reader then cannot convert to float.

How can I get rid of the ' ' in the third column, have the data converted to float automatically, and avoid the error?

  • Actually, can you figure out what value is causing the problem? If not, you can always coerce the column to numeric with pd.to_numeric. Commented Feb 22, 2020 at 14:46
  • Thanks, I just tried. An error came up saying "arg must be a list, tuple, 1-d array, or Series". Any ideas? The values causing the problem are, I am sure, the third column: '-72', '-73', '-66', ..., '-73', '-67'... Commented Feb 22, 2020 at 14:55
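
The "arg must be a list, tuple, 1-d array, or Series" error is what pd.to_numeric raises when it is given a whole DataFrame instead of a single column. Here is a minimal sketch of coercing one column and then finding the values that failed to parse. The frame, the column names and the "bad" entry are made up for illustration; they are not the asker's actual data.

```python
import pandas as pd

# Hypothetical frame standing in for the question's data: the Level
# column was read as strings, and one entry is not a valid number.
df = pd.DataFrame({"Freq": [2432, 2412, 2412], "Level": ["-72", "-73", "bad"]})

# pd.to_numeric takes a 1-D input, so pass one column, not the whole frame.
level = pd.to_numeric(df["Level"], errors="coerce")

# Entries that could not be parsed become NaN; use that to list the offenders.
print(df.loc[level.isna(), "Level"])

# Store the converted column back once the bad values have been inspected.
df["Level"] = level
```

With errors="coerce" the bad entries are silently turned into NaN, so it is worth printing them as above before overwriting the column.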

1 Answer

The encoding='utf7' looks redundant to me, but it's not the cause of your error. You are reading space-separated data, while read_csv expects comma-separated data by default.

Add sep=r'\s+' (a raw string, so the backslash is not treated as an escape sequence):

x_data = pd.read_csv(data_path, encoding='utf7', dtype=float, sep=r'\s+')
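
To try this without the original file, here is a small self-contained sketch. It assumes the file really looks like the sample in the question (two whitespace-separated columns with a header row); io.StringIO stands in for data_path, and the encoding argument is left out because the sample is already a Python string.

```python
import io
import pandas as pd

# Rows modeled on the sample shown in the question; StringIO replaces data_path.
sample = io.StringIO(
    "Freq Level\n"
    "2412 -84\n"
    "2412 -85\n"
    "2412 -90\n"
    "2412 -83\n"
    "2412 -83\n"
)

# Split on runs of whitespace and parse every column as float.
x_data = pd.read_csv(sample, sep=r"\s+", dtype=float)
print(x_data.dtypes)
```

If the real file contains commas as well, splitting on whitespace alone will leave them inside the values, so it is worth checking the raw file in a text editor first.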

1 Comment

Thanks, and here is what I got: [[37710 '2432,-72'] [931 '2412,-73'] [10936 '2412,-66'] ... [48037 '2437,-73'] [84317 '2467,-67'] [33201 '2427,-79']]. I thought the ' and , were supposed to disappear, though.
