4

If I'm using csv.DictReader to read in a CSV, how would I go about having it ignore certain columns in the CSV?

For example,

"id","name","address","number","created"
"123456","someName","someAddress","someNumber","2003-5-0294"

I want to get just the id and name using the reader, discarding and ignoring the rest. I tried using fieldnames, but it still reads the other columns in and sets them as "None". I noticed that csv.DictWriter has an 'ignore' option, but it seems DictReader does not. I was hoping there was a more elegant way to do this than reading the file, writing only the columns I want to another CSV, and then reading that CSV with DictReader for further processing.

Thanks guys!

4 Answers

6

Read in each row, then create a list of dicts with just the keys you want.

[{'id':r['id'], 'name':r['name']} for r in mydictreader]
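
For a self-contained illustration, here is a minimal sketch of that approach in Python 3, using the sample rows from the question. The io.StringIO wrapper just stands in for an open file, and the names sample and rows are made up for the example.

```python
import csv
import io

# Sample data from the question; io.StringIO stands in for a real file object.
sample = (
    '"id","name","address","number","created"\n'
    '"123456","someName","someAddress","someNumber","2003-5-0294"\n'
)

mydictreader = csv.DictReader(io.StringIO(sample))

# Keep only the wanted keys from each fully populated row dict.
rows = [{'id': r['id'], 'name': r['name']} for r in mydictreader]
print(rows)
```

The reader still parses every column; the comprehension only controls what you keep afterwards.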

5

This simple generator will do it.

def dict_filter(it, *keys):
    for d in it:
        yield dict((k, d[k]) for k in keys)

Use it like this:

dreader = [{'id':1, 'name':'Bob', 'other_stuff':'xy'},
           {'id':2, 'name':'Jen', 'other_stuff':'xx'}]

for d in dict_filter(dreader, 'id', 'name'):
    print(d)

gives:

{'id': 1, 'name': 'Bob'}
{'id': 2, 'name': 'Jen'}

5

The other posted solutions build new smaller dicts from the larger fully populated dicts returned by DictReader.

Something like this will be necessary because the DictReader API was intentionally designed not to skip fields. Here is an excerpt from the source:

    # unlike the basic reader, we prefer not to return blanks,
    # because we will typically wind up with a dict full of None
    # values
    while row == []:
        row = self.reader.next()
    d = dict(zip(self.fieldnames, row))

You can see that every fieldname gets assigned to the dictionary without filtering.
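
This is likely also where the "None" in the question comes from. As a rough sketch (Python 3, with the question's sample data inlined through io.StringIO, and variable names invented for the example), passing a shorter fieldnames list does not drop the extra columns: they are collected in a list under the restkey key, which defaults to None. Note that when fieldnames is supplied, the header line is read as an ordinary data row.

```python
import csv
import io

# Sample data from the question; io.StringIO stands in for a real file object.
sample = (
    '"id","name","address","number","created"\n'
    '"123456","someName","someAddress","someNumber","2003-5-0294"\n'
)

# Only two fieldnames are given for five columns.
reader = csv.DictReader(io.StringIO(sample), fieldnames=['id', 'name'])
rows = list(reader)

# The header line is treated as data here, so the real record is rows[1].
record = rows[1]
print(record['id'], record['name'])

# The leftover columns are still present, stored under the default restkey (None).
print(record[None])
```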

FWIW, it is not hard to make your own variant of DictReader with the desired behavior. Model it after the existing csv source.
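
One way such a variant might look in Python 3 is sketched here. Rather than copying the csv source, it subclasses csv.DictReader and filters each row after the parent class has built it. The class name FilteredDictReader and its keep parameter are invented for this example and are not part of the csv module.

```python
import csv
import io


class FilteredDictReader(csv.DictReader):
    """Hypothetical DictReader variant that returns only the columns in `keep`."""

    def __init__(self, f, keep, *args, **kwargs):
        super().__init__(f, *args, **kwargs)
        self.keep = tuple(keep)

    def __next__(self):
        # Let the parent build the full row, then keep only the wanted keys.
        row = super().__next__()
        return {k: row[k] for k in self.keep}


# Sample data from the question; io.StringIO stands in for a real file object.
sample = (
    '"id","name","address","number","created"\n'
    '"123456","someName","someAddress","someNumber","2003-5-0294"\n'
)

filtered = list(FilteredDictReader(io.StringIO(sample), keep=('id', 'name')))
print(filtered)
```

This is a convenience more than an optimization, since the full dict is still built for every row. A name in keep that is not a column in the file raises KeyError.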

2

from operator import itemgetter

cols=('name', 'id') #Tuple of keys you want to keep
valuesfor=itemgetter(*cols)

for d in dictreader_input:
    print(dict(zip(cols, valuesfor(d)))) # dict from zipping cols and values
