Python csv skip first two empty rows

Question

Before anyone marks this as a duplicate: I have already tried isspace, startswith, the itertools filter functions, and readlines()[2:]. I have a Python script that searches hundreds of CSV files and prints the row with the matching string (in this case a unique ID) in the eighth column from the left.

import csv
import glob

csvfiles = glob.glob('20??-??-??.csv')
for filename in csvfiles:
    reader = csv.reader(open(csvfiles))
    for row in reader:
        col8 = str(row[8])
        if col8 == '36862210':
            print row

The code works with test .csv files. However, the real .csv files I'm working with all have two blank rows at the top, and I get this error message:

IndexError: list index out of range

Here's my latest code:

import csv
import glob

csvfiles = glob.glob('20??-??-??.csv')
for filename in csvfiles:
    reader = csv.reader(open(csvfiles))
    for row in reader:
        if not row:
            continue
        col8 = str(row[8])
        if col8 == '36862210':
            print row

You might want to use row.strip() == '' to test for an empty line rather than not row.
– William, Commented Aug 26, 2015 at 0:39
Do you want to skip the first two rows, regardless of their content? Or do you want to skip all empty rows, wherever they appear?
– Robᵩ, Commented Aug 26, 2015 at 1:01
Just the first two rows... the batch of .csv files just happens to have no data in the first two rows. Thanks.
– adelaide01, Commented Aug 26, 2015 at 2:16
When I use if row.strip() == ' ' the error message reads AttributeError: 'list' object has no attribute 'strip'
– adelaide01, Commented Aug 26, 2015 at 2:19

Gurupad Hegde · Accepted Answer · 2015-08-26 03:54:01Z

3

Try skipping the first two rows with next() instead:

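import csv
import glob

csvfiles = glob.glob('20??-??-??.csv')
for filename in csvfiles:
    # Open each matched file by name (not the whole csvfiles list)
    with open(filename) as csvfile:
        reader = csv.reader(csvfile)
        # Discard the two blank rows at the top of the file
        next(reader)
        next(reader)
        for row in reader:
            col8 = str(row[8])
            if col8 == '36862210':
                print row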
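
A variation on the same idea, sketched for Python 3 (the snippet above is Python 2): itertools.islice can drop the first two lines before csv.reader ever sees them. The find_rows helper, its parameters, and the sample data are made up for illustration and are not part of the original script.

```python
import csv
import io
from itertools import islice


def find_rows(lines, target, column=8, skip=2):
    """Return rows whose field at `column` equals `target`, ignoring the first `skip` lines."""
    # islice discards the leading lines; csv.reader accepts any iterable of strings
    reader = csv.reader(islice(lines, skip, None))
    return [row for row in reader if row[column] == target]


# Hypothetical sample: two blank lines, then two data rows of nine fields each
sample = io.StringIO(
    "\n"
    "\n"
    "a,b,c,d,e,f,g,h,36862210\n"
    "a,b,c,d,e,f,g,h,999\n"
)
matches = find_rows(sample, "36862210")
print(matches)
```

Unlike two bare next() calls, islice does not raise StopIteration when a file has fewer than two lines. With a real file, the same helper could be called as find_rows(open(filename, newline=''), '36862210'), ideally inside a with block.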

edited Aug 26, 2015 at 3:54

Gurupad Hegde


answered Aug 26, 2015 at 1:00

amirouche


1 Comment

adelaide01 Over a year ago

I'm getting an error message. reader = csv.reader(open(csvfiles)). TypeError: coercing to Unicode: need string or buffer, list found

Gurupad Hegde · Accepted Answer · 2015-08-26 03:53:42Z

0

A csv reader takes an iterable, which can be a file object but need not be.

You can create a generator that removes all blank lines from a file like so:

csvfile = open(filename)
filtered_csv = (line for line in csvfile if not line.isspace())

This filtered_csv generator will lazily pull one line at a time from your file object, and skip to the next one if the line is entirely whitespace.

You should be able to write your code like:

for filename in csvfiles:
    csvfile = open(filename)
    filtered_csv = (line for line in csvfile if not line.isspace())
    reader = csv.reader(filtered_csv)
    for row in reader:
        col8 = str(row[8])
        if col8 == '36862210':
            print row

Assuming the non-blank rows are well formed, i.e., each one has an index 8 (at least nine fields), you should not get an IndexError.
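
To see what the filter does and does not remove, here is a small Python 3 sketch using made-up input lines (none of them come from the real files). It compares what csv.reader yields with and without the isspace() filter.

```python
import csv

# Hypothetical input: an empty line, a spaces-only line, a commas-only line, a data line
lines = ["\n", "   \n", ",,,\n", "a,b,c\n"]

# Without filtering, blank lines become [] and a spaces-only line becomes a one-field row
unfiltered = list(csv.reader(lines))
print(unfiltered)  # [[], ['   '], ['', '', '', ''], ['a', 'b', 'c']]

# With the filter, only whitespace-only lines are dropped
filtered = list(csv.reader(line for line in lines if not line.isspace()))
print(filtered)  # [['', '', '', ''], ['a', 'b', 'c']]
```

A line made only of commas is not whitespace, so it passes the filter and parses as a short row of empty strings. If the "blank" rows in the real files look like that (an assumption, since the files are not shown), row[8] would still raise IndexError.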

EDIT: If you're still encountering an IndexError, it probably is not caused by a line consisting only of whitespace. Catch the exception and print the row to examine the output from the CSV reader that's actually causing the error:

try:
    col8 = str(row[8])
    if col8 == '36862210':
        print row
except IndexError:
    # Show the row that is too short to have an index 8
    print row

If the row is an object that doesn't print its contents, use print list(row) instead.
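
An alternative to catching the exception is to check the row length before indexing and collect the short rows for inspection. This is only a sketch in Python 3; matching_rows, its parameters, and the sample lines are hypothetical names and data chosen for the example.

```python
import csv


def matching_rows(lines, target, column=8):
    """Split parsed rows into matches and rows too short to have `column`."""
    matches, short_rows = [], []
    reader = csv.reader(lines)
    for row in reader:
        if len(row) <= column:
            # reader.line_num is the number of source lines read so far
            short_rows.append((reader.line_num, row))
            continue
        if row[column] == target:
            matches.append(row)
    return matches, short_rows


# Hypothetical sample: an empty line, a commas-only line, then one full data row
sample = ["\n", ",,\n", "a,b,c,d,e,f,g,h,36862210\n"]
matches, short_rows = matching_rows(sample, "36862210")
print(matches)
print(short_rows)
```

Printing short_rows shows exactly which lines lack an index 8 and what the reader parsed them into, which is the information needed to decide how to skip them.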

edited Aug 26, 2015 at 3:53

Gurupad Hegde


answered Aug 26, 2015 at 1:26

Matt Anderson


1 Comment

adelaide01 Over a year ago

My original script worked even if there were blank rows in the body of the .csv file. However, having the two blank rows at the top seems to be the problem. When I tried your script, I get this error. col8 = str(row[8]) IndexError: list index out of range
