19

I'm trying to read a csv file with pandas.

This file actually has only one row but it causes an error whenever I try to read it.

Something seems to go wrong at line 8, but I can't find an 8th line, since the file clearly has only one row.

This is what I run:

import codecs
import pandas as pd

with codecs.open("path_to_file", "rU", "Shift-JIS", "ignore") as file:
    df = pd.read_csv(file, header=None, sep="\t")
df

Then I get:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 3

I don't understand what's going on, so any advice would be appreciated.

5 Answers

25

I struggled with this for almost half a day. When I opened the CSV in Notepad, I noticed that the separator is a tab, not a comma, and then tried the combination below.

df = pd.read_csv('C:\\myfile.csv',sep='\t', lineterminator='\r')
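Before picking sep and lineterminator, it can help to check what the file really contains, because a file that looks like one row in an editor may use bare carriage returns as line breaks. The following is only a rough sketch: describe_separators is a hypothetical helper, and the sample bytes are made up for illustration.

```python
def describe_separators(raw: bytes) -> dict:
    """Count candidate line terminators and delimiters in raw file bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr_only": raw.count(b"\r") - crlf,  # bare carriage returns
        "lf_only": raw.count(b"\n") - crlf,  # bare line feeds
        "tabs": raw.count(b"\t"),
        "commas": raw.count(b","),
    }


# Made-up sample: two tab-separated rows terminated by bare "\r".
sample = b"a\tb\tc\r1\t2\t3\r"
counts = describe_separators(sample)
print(counts)

# For a real file, read it in binary mode first:
# with open("path_to_file", "rb") as f:
#     print(describe_separators(f.read()))
```

If cr_only is high and lf_only is zero, lineterminator='\r' is a plausible choice; if tabs far outnumber commas, sep='\t' probably is too.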

7

Try df = pd.read_csv(file, header=None, error_bad_lines=False)
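Note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0; on newer versions the equivalent is on_bad_lines="skip". The sketch below assumes pandas 1.3 or later and uses a made-up in-memory sample instead of the asker's file.

```python
import io

import pandas as pd

# Made-up sample: three one-field lines and one line with three tab-separated fields.
sample = "a\nb\nc\td\te\nf\n"

# Default behaviour: the line with extra fields raises a ParserError.
try:
    pd.read_csv(io.StringIO(sample), header=None, sep="\t")
    message = ""
except pd.errors.ParserError as exc:
    message = str(exc)

# pandas >= 1.3: skip lines with too many fields instead of raising.
df = pd.read_csv(io.StringIO(sample), header=None, sep="\t", on_bad_lines="skip")

print(message)
print(df)
```

Keep in mind that skipped lines are silently dropped, so this hides the problem rather than fixing the underlying separator or line-ending mismatch.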

3 Comments

Thanks so much for your comment, Po Xin. I've tried that and got another error: ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
Also, how can I keep these errors from showing in the terminal?
4

The existing answer will not include these additional lines in your dataframe. If you'd like your dataframe to be as wide as its widest point, you can use the following:

delimiter = ','
with open(path_name, 'r') as f:
    # Fields on the widest line = number of delimiters + 1
    max_columns = max(line.count(delimiter) for line in f) + 1
df = pd.read_csv(path_name, header=None, skiprows=1, names=list(range(max_columns)))

Set skiprows = 1 only if the file actually has a header row; you can always retrieve the header column names later. You can also identify rows that have more columns populated than there are column names in the original header.
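As a self-contained illustration of the same idea, here is a minimal sketch on made-up in-memory data. It differs from the snippet above in two ways that are my own assumptions: it counts fields with the standard csv module (so delimiters inside quoted fields are not miscounted), and it has no header row, so skiprows is omitted.

```python
import csv
import io

import pandas as pd

# Made-up ragged data: the rows have 2, 3, and 1 fields.
sample = "a,b\n1,2,3\n4\n"

# Let the csv module split each row, then take the widest one.
max_columns = max(len(row) for row in csv.reader(io.StringIO(sample)))

# Supplying one name per column makes pandas pad short rows with NaN.
df = pd.read_csv(io.StringIO(sample), header=None, names=list(range(max_columns)))

print(df)
```

Short rows end up with NaN in the trailing columns, which also makes it easy to spot which rows were narrower than the widest one.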


0

A quick and dirty solution that may help some people: copy and paste the values of your data into a new Excel file and save it as CSV. That can sometimes remove invisible stray characters from the file.


0

For me, the file had a different number of columns at a certain line. Passing the names argument when reading the CSV helped, because it fixes the column count.
