I have a CSV file that has no header row, and each line is a variable-length record.

Each record can have up to 398 fields, but I want to keep only 256 fields in my DataFrame, since those are the only fields I need to process.

Below is a slim version of the file.

1,2,3,4,5,6
12,34,45,65
34,34,24

In the above, I would like to keep only 3 fields (analogous to the 256 above) from each line when calling read_csv.

I tried the following:

import pandas as pd
df = pd.read_csv('sample.csv',header=None)

I get the following error, since pandas seems to use the first line to determine the number of columns.

  File "pandas/_libs/parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 4, saw 10

The only solution I can think of is passing

names = ['column1','column2','column3','column4','column5','column6']

while creating the data frame.

But for the real files, which can be up to 50 MB, I don't want to do that because it takes a lot of memory, and I am running this on AWS Lambda, where that will incur more cost. I also have to process a large number of files daily.

My question is: can I create a DataFrame with just the slimmer 256 fields while reading the CSV? Can that be my step one?

I am very new to pandas, so kindly bear with my ignorance. I looked for a solution for a long time but couldn't find one.
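To make the names workaround above concrete, here is a minimal sketch on an in-memory sample. The sample string is made up for illustration, with the first line deliberately shorter than a later one, and integer names from range(6) stand in for the 'column1'...'column6' list.

```python
import io

import pandas as pd

# Made-up sample: the first line has fewer fields than the second.
data = "34,34,24\n1,2,3,4,5,6\n12,34,45,65\n"

# Supplying six column names up front tells the parser how wide the
# widest row is; shorter rows are padded with NaN.
df = pd.read_csv(io.StringIO(data), header=None, names=range(6))
print(df.shape)  # (3, 6)
```

This still builds every column in memory, which is exactly the cost I would like to avoid.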

  • Try using usecols (read more in the docs). Since CSVs are just text files, pandas still has to load and read the full file to identify the columns; usecols only controls what is parsed into the DataFrame. Commented Sep 2, 2020 at 20:41
  • Consider using .to_hdf for quick columnar access with a binary file. Commented Sep 2, 2020 at 20:43

1 Answer

import pandas as pd

# keep only the first 3 columns of every line
df = pd.read_csv('sample.csv', header=None, usecols=range(3))
print(df)
#     0   1   2
# 0   1   2   3
# 1  12  34  45
# 2  34  34  24

So just change the range value, e.g. range(256) for the real files.
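
If you would rather not depend on how pandas infers the column count from the first line (for example, when the first line may have fewer fields than you ask for), one alternative is to trim each record before pandas sees it. The following is only a sketch using the standard csv module; the helper name read_first_fields and its fill parameter are hypothetical, not part of pandas.

```python
import csv
import io


def read_first_fields(lines, n_fields, fill=None):
    """Yield each CSV record cut down (or padded) to exactly n_fields values."""
    for record in csv.reader(lines):
        if not record:
            continue  # skip blank lines
        trimmed = record[:n_fields]
        # Pad records that are shorter than n_fields.
        trimmed += [fill] * (n_fields - len(trimmed))
        yield trimmed


# Made-up sample: the first line is shorter than the requested width.
sample = io.StringIO("1,2\n12,34,45,65,7,8\n34,34,24\n")
rows = list(read_first_fields(sample, n_fields=3))
print(rows)
# [['1', '2', None], ['12', '34', '45'], ['34', '34', '24']]
```

The resulting rows can be passed to pd.DataFrame(rows). Note that the values are still strings at that point, so a conversion step (for example pd.to_numeric per column) would be needed for numeric work.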
