2

everybody.

I can't find a Pythonic way to ignore "blank" lines in a CSV. I use quotes because I mean rows whose fields are all empty strings, i.e. rows that look like '','','','',''. Here is a sample CSV (the blank lines can appear anywhere):

id,name,age
1,alex,22
3,tiff,42
,,
,,
4,john,24

Here is the code:

import csv

def getDataFromCsv(path):
    dataSet = []
    with open(unicode(path), 'r') as stream:
        reader = csv.reader(stream, delimiter=',')
        reader.next() # ignoring header
        for rowdata in reader:
            # how to check here?
            dataSet.append(rowdata)
    return dataSet

Here is a similar question that I've been reading, although it differs from this particular case: python csv reader ignore blank row

4
  • 2
    You can use if any(x for x in rowdata): dataSet.append(rowdata) Commented Jan 16, 2018 at 22:56
  • BTW, change to next(reader) to be compatible with Python3 also Commented Jan 16, 2018 at 23:01
  • Together with schwobaseggl's, @dekim's solution works exactly as expected. Thanks Commented Jan 17, 2018 at 1:41
  • @tonypdmtr good call, I was only thinking of 2.7.x, but I'll keep that in mind from now on. Commented Jan 17, 2018 at 1:43

5 Answers 5

13

You can use any to check if any column in the row contains data:

for rowdata in reader:
    # keep the row only if at least one cell is non-blank after stripping whitespace
    if any(x.strip() for x in rowdata):
        dataSet.append(rowdata)
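
As a self-contained illustration of the check above, here is a minimal Python 3 sketch that runs it against the sample CSV from the question. The names SAMPLE and read_non_blank_rows are made up for this example, and io.StringIO stands in for the real file, so treat it as a sketch rather than a drop-in replacement:

```python
import csv
import io

# Sample data from the question; io.StringIO stands in for an open file.
SAMPLE = "id,name,age\n1,alex,22\n3,tiff,42\n,,\n,,\n4,john,24\n"


def read_non_blank_rows(stream):
    """Return all rows after the header that have at least one non-blank cell."""
    reader = csv.reader(stream, delimiter=",")
    next(reader)  # skip the header row
    return [row for row in reader if any(cell.strip() for cell in row)]


rows = read_non_blank_rows(io.StringIO(SAMPLE))
print(rows)
```

Because each cell is stripped before it is tested, a row that contains only spaces is dropped as well.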

4 Comments

Works as expected, it's clear and it taught me something. Thanks!
Useful! If I understand correctly, this line if any(x.strip() for x in rowdata): can be read as: "if there are any values left after stripping all string values in the row, then the row has data and should be added to the dataSet"
For those trying to scrub out empty rows when using csv.DictReader: remember that x has to come from the list of values in your row dictionary, i.e. use this line: if any(x.strip() for x in list(row.values())):
@grego "if there are any values left after stripping all string values in the row" -- that's almost correct: 1. ... any values left that aren't the empty string ... 2. any stops at the first truthy element, i.e. not necessarily all elements get stripped, but only as many as it takes.
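
The csv.DictReader variant mentioned in these comments could look roughly like the sketch below. It assumes Python 3 and that no row has more fields than the header; SAMPLE and records are illustrative names. The (value or "") guard is an addition, there because DictReader fills missing trailing fields with None by default:

```python
import csv
import io

# Shortened sample data; io.StringIO stands in for an open file.
SAMPLE = "id,name,age\n1,alex,22\n,,\n4,john,24\n"

reader = csv.DictReader(io.StringIO(SAMPLE))
# (value or "") turns a None from a short row into an empty string before stripping.
records = [
    row for row in reader
    if any((value or "").strip() for value in row.values())
]
print([record["name"] for record in records])
```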
0

Danger zone.. Maybe reviving an old thread..

Why not use a filter? Then there are no memory issues for large csv files, I think.

Something like:

for data in filter(any, reader):
    print(data)
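
Two caveats, shown in the sketch below. In Python 3, filter returns a lazy iterator, so rows are handled one at a time; in Python 2 it builds a full list. Also, filter(any, reader) keeps a row whose only content is whitespace, because a string such as ' ' is truthy. The generator non_blank_rows is a hypothetical helper that stays lazy and strips each cell first:

```python
import csv
import io

# The third data line contains a single space in its second cell.
SAMPLE = "id,name,age\n1,alex,22\n,,\n, ,\n4,john,24\n"


def non_blank_rows(reader):
    """Lazily yield rows that have at least one non-whitespace cell."""
    return (row for row in reader if any(cell.strip() for cell in row))


plain = list(filter(any, csv.reader(io.StringIO(SAMPLE))))
stripped = list(non_blank_rows(csv.reader(io.StringIO(SAMPLE))))

print(len(plain))     # counts the header and the whitespace-only row
print(len(stripped))  # counts the header but not the whitespace-only row
```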


-1

What about:

if len(rowdata) > 0:
    dataSet.append(rowdata)

Or am I missing a part of your question?

1 Comment

This won't work, as these "empty" rows still have length 3. You must test whether all 3 strings in the row are empty.
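
To see the difference described in this comment, the following minimal Python 3 sketch parses one row of empty fields followed by one truly empty line (the input string is made up for the illustration):

```python
import csv
import io

# One row of empty fields followed by one truly empty line.
rows = list(csv.reader(io.StringIO(",,\n\n")))
print(rows)

for row in rows:
    # len() only catches the truly empty line; any() rejects both rows.
    print(len(row) > 0, any(row))
```

The row parsed from ",," is a list of three empty strings, so a length check keeps it, while any() rejects it.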
-1

You can use the built-in function any:

for rowdata in reader:
    # skip rows in which every cell is empty
    if not any(rowdata):
        continue
    dataSet.append(rowdata)


-1
with open(fn, 'r') as csvfile:
    reader = csv.reader(csvfile)
    data = [row for row in reader if any(col for col in row)]
  • open CSV file
  • instantiate csv.reader() object
  • use a list comprehension to:
    • iterate over CSV rows
    • iterate over columns in the row
    • check if any column in the row has a value and if so, add to the list

2 Comments

Hi, welcome to Stack Overflow. When answering a question that already has many answers, please be sure to add some additional insight into why the response you're providing is substantive and not simply echoing what's already been vetted by the original poster. This is especially important in "code-only" answers such as the one you've provided.
While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
