1

The csv file has the following structure:

a,b,c
a,b,c,d,e,f,g
a,b,c,d
a,b,c

If I use file = pd.read_csv('Desktop/export.csv', delimiter=','), it throws a tokenizing error like this: pandas.errors.ParserError: Error tokenizing data. C error: Expected 9 fields in line 3, saw 10

I do NOT want to skip bad lines. I want to read the csv with all columns and create a dataframe that looks like:

unnamed column1, unnamed column2, ....... unnamed column 7
a,b,c
a,b,c,d,e,f,g
a,b,c,d
a,b,c

How can I load the bad lines of the CSV file instead of skipping them?

  • Re stackoverflow.com/q/75242879: drop database `b'MavenFuzzyFactory'`;. Just enclose the identifier that contains characters not normally allowed in backticks. Commented Jan 27, 2023 at 4:03
  • If that doesn't work, there are likely other characters in the name that you aren't seeing; run select SCHEMA_NAME,hex(SCHEMA_NAME) from information_schema.SCHEMATA; to see what they might be. The name you report would just have 62274D6176656E46757A7A79466163746F727927. Commented Jan 27, 2023 at 5:34

1 Answer

0

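You can set error_bad_lines to False. Note that this skips the malformed lines rather than loading them, and that the parameter was deprecated in pandas 1.3 in favor of on_bad_lines='skip' and removed in pandas 2.0.

import pandas as pd

# Skips rows with too many fields instead of raising ParserError.
# On pandas >= 1.3, pass on_bad_lines='skip' instead of error_bad_lines=False.
file = pd.read_csv('Desktop/export.csv', delimiter=',', error_bad_lines=False)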
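Since the question asks to keep the long rows rather than skip them, here is a minimal sketch of one possible approach: scan the file once to find the widest row, then pass that many explicit column names to read_csv so that shorter rows are padded with NaN. The helper name read_ragged_csv and the in-memory sample are assumptions made for illustration; they are not part of the original post.

```python
import csv
import io

import pandas as pd

# Sample rows copied from the question. For a real file, open it with
# open(path, newline='') and pass the handle instead of a StringIO.
sample = "a,b,c\na,b,c,d,e,f,g\na,b,c,d\na,b,c\n"


def read_ragged_csv(handle):
    """Hypothetical helper: load a CSV whose rows have differing field counts."""
    # First pass: find the number of fields in the widest row.
    width = max(len(row) for row in csv.reader(handle))
    handle.seek(0)
    # Second pass: with explicit column names, pandas pads short rows with NaN
    # instead of raising ParserError on the long ones.
    return pd.read_csv(handle, header=None, names=range(width))


df = read_ragged_csv(io.StringIO(sample))
print(df)
```

The columns come out labelled 0 through 6; they can be renamed afterwards if descriptive headers are needed. The file is read twice, which may matter for very large inputs.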