How to search CSV line for string in certain column, print entire line to file if found

Question

Sorry, very much a beginner with Python and could really use some help.

I have a large CSV file, items separated by commas, that I'm trying to go through with Python. Here is an example of a line in the CSV.

123123,JOHN SMITH,SMITH FARMS,A,N,N,12345 123 AVE,CITY,NE,68355,US,12345 123 AVE,CITY,NE,68355,US,(123) 555-5555,(321) 555-5555,[email protected],15-JUL-16,11111,2013,22-DEC-93,NE,2,1

I'd like my code to scan each line and look at only the 9th item (the state). For every line whose 9th item matches my query, I'd like that entire line to be written to a new CSV.

The problem I have is that my code will find every occurrence of my query throughout the entire line, instead of just the 9th item. For example, if I scan looking for "NE", it will write the above line in my CSV, but also one that contains the string "NEARY ROAD."

Sorry if my terminology is off, again, I'm a beginner. Any help would be greatly appreciated.

I've listed my code below:

import csv

with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for line in f:
        if "NE" in line:
             print ('Found: []'.format(line))
             writer.writerow([line])

Mac · Accepted Answer · 2016-10-06 03:50:38Z

3

You're not actually using your reader to read the input CSV, you're just reading the raw lines from the file itself.

A fixed version looks like the following (untested):

import csv

with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        if row[8] == 'NE':
            print('Found: {}'.format(row))
            writer.writerow(row)

The changes are as follows:

Instead of iterating over the input file's lines, we iterate over the rows parsed by the reader (each of which is a list of the values in that row).
We check to see if the 9th item in the row (i.e. row[8]) is equal to "NE".
If so, we output that row to the output file by passing it in, as-is, to the writer's writerow method.
I also fixed a typo in your print statement - the format method uses braces (not square brackets) to mark replacement locations.
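The 'rb' and 'wb' modes above suit Python 2. On Python 3 the csv module expects files opened in text mode with newline=''. Here is a minimal sketch of the same column check for Python 3; the function name filter_rows_by_column and its parameters are my own invention for illustration, not part of the original code:

```python
import csv


def filter_rows_by_column(src_path, dst_path, column_index, wanted):
    """Copy rows whose field at column_index equals wanted; return the count.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    matches = 0
    # Python 3: open CSV files in text mode with newline='' so the csv
    # module handles line endings itself.
    with open(src_path, newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            # Skip rows too short to contain the column (e.g. blank lines).
            if len(row) > column_index and row[column_index] == wanted:
                writer.writerow(row)
                matches += 1
    return matches
```

With the file names from the question, the call would be filter_rows_by_column('Sample.csv', 'NE_Sample.csv', 8, 'NE'). The length check is there so that a blank or truncated line does not raise an IndexError.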

edited Oct 6, 2016 at 3:50

answered Oct 6, 2016 at 3:43

Mac


1 Comment

Mac Over a year ago

@J.Folkens: not a problem!

Zixian Cai · Accepted Answer · 2016-10-06 04:08:21Z

0

This snippet should solve the substring-matching problem:

import csv

with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        if "NE" in row:
            print ('Found: {}'.format(row))
            writer.writerow(row)

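In your code, if "NE" in line tests whether "NE" is a substring of the string line, because the lines you iterate over are the raw text lines of your input file. That is why a line containing "NEARY ROAD" also matches.

If you use if "NE" in row:, where row is a parsed line of your input file (a list of fields), you are doing exact element matching: the test succeeds only when some field is exactly "NE". Note that this checks every field, not just the 9th, so a row with "NE" in another column (such as the second state field in your example line) also matches. To restrict the test to the 9th item, compare row[8] == "NE" instead.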
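To make the difference concrete, here is a small sketch comparing a substring test on the raw line, a membership test on the parsed row, and an equality test on the 9th field only. The three-row sample is made up for illustration, and it is written for Python 3:

```python
import csv
import io

# Hypothetical sample data; the state is the 9th field (index 8).
sample = (
    "1,A,B,C,D,E,12 NEARY ROAD,CITY,IA,50001,US\n"
    "2,A,B,C,D,E,34 MAIN ST,CITY,IA,50001,NE\n"
    "3,A,B,C,D,E,56 OAK AVE,CITY,NE,68355,US\n"
)

# Substring test on raw lines: also matches "NEARY ROAD".
substring_hits = [line.split(",")[0]
                  for line in sample.splitlines() if "NE" in line]

rows = list(csv.reader(io.StringIO(sample)))

# Membership test on parsed rows: some field must equal "NE" exactly,
# but it may be any column.
any_column_hits = [row[0] for row in rows if "NE" in row]

# Equality test on index 8: only the 9th field is considered.
ninth_column_hits = [row[0] for row in rows if row[8] == "NE"]

print(substring_hits)     # ['1', '2', '3']
print(any_column_hits)    # ['2', '3']
print(ninth_column_hits)  # ['3']
```

Only the last check limits the match to the state column, which is what the question asks for.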

answered Oct 6, 2016 at 4:08

Zixian Cai
