Please see the question below the code -

import csv

MY_FILE = "../data/sample_sfpd_incident_all.csv"

def parse(raw_file, delimiter):
    opened_file = open(raw_file)
    csv_data = csv.reader(opened_file, delimiter=delimiter)

    parsed_data = []
    fields = csv_data.next()
    for row in csv_data:
        parsed_data.append(dict(zip(fields, row)))
    opened_file.close()
    return parsed_data

def main():
    new_data = parse(MY_FILE, ",")
    print new_data

if __name__ == "__main__":
    main()

The question is, if I want to save this parsed data to another file in csv format, how do I do it? I tried a few things, but it kinda screwed up the code at one point.

Can anyone help?

  • Why did you not use csv.DictReader() instead? It'd have handled the fields and dictionary creation automatically for you. Commented Nov 19, 2013 at 10:53
  • Do you have to retain the dictionaries-as-rows here? Otherwise you either have to specify a field order manually, or sort the field names (which may be a different order from the input), or return the fieldnames as well as the data. Commented Nov 19, 2013 at 10:55
  • I basically have to copy the data exactly as is into another file. csv format. Commented Nov 19, 2013 at 11:05
  • Then I'd drop the dict(zip(fields, row)) and go straight for the list rows. Commented Nov 19, 2013 at 11:07

1 Answer

You can use csv.writer() to write rows of data to another CSV file, or use csv.DictWriter() to write dictionaries with row data. The latter does require that you specify fieldnames up front.

Copying CSV rows across is best done directly while reading:

# Python 2: open CSV files in binary mode ('rb' / 'wb')
with open(inputfilename, 'rb') as ifh, open(outputfilename, 'wb') as ofh:
    reader = csv.reader(ifh)
    writer = csv.writer(ofh)  # the writer wraps the output file

    writer.writerows(reader)

Here writer.writerows() takes an iterable of rows; a reader object is itself an iterable, so this copies the data straight across. You can pass additional dialect parameters to the csv.reader() and csv.writer() constructors to change the format; for example, read with delimiter='\t' (tab-separated) and write out the default comma-separated format.
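
As a rough illustration of the delimiter change just described, here is a minimal sketch. It assumes Python 3 (unlike the Python 2 snippets in this answer), uses in-memory io.StringIO objects in place of real files, and the helper name convert_delimiter and the sample data are made up for the example.

```python
import csv
import io


def convert_delimiter(src, dst, in_delimiter="\t", out_delimiter=","):
    """Copy CSV rows from src to dst, changing the delimiter on the way.

    Hypothetical helper for illustration; src and dst are open text files
    (or file-like objects).
    """
    reader = csv.reader(src, delimiter=in_delimiter)
    writer = csv.writer(dst, delimiter=out_delimiter)
    writer.writerows(reader)


# Made-up sample data. With files on disk in Python 3, open both files in
# text mode with newline="" as the csv module documentation recommends.
tab_separated = io.StringIO("id\tnote\n1\thello, world\n", newline="")
comma_separated = io.StringIO(newline="")
convert_delimiter(tab_separated, comma_separated)
print(comma_separated.getvalue())
```

Note that the writer quotes the field containing a comma, so the output can still be read back unambiguously.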

If you want to process each row in between, use a loop:

for row in reader:
    # do something with row
    writer.writerow(row)

The writer.writerow() (singular) method takes one row at a time to write out.
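
To make the "do something with row" step concrete, here is a small sketch that strips surrounding whitespace from every field before writing it out. It assumes Python 3 and in-memory file objects; the function name copy_stripped and the sample data are invented for the example, and stripping whitespace is just one possible transformation.

```python
import csv
import io


def copy_stripped(src, dst):
    """Copy rows from src to dst, stripping whitespace around each field.

    Hypothetical helper for illustration only.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    for row in reader:
        # Each row is a list of strings; build the cleaned row and write it.
        writer.writerow([field.strip() for field in row])


# Made-up sample input with stray spaces around the values.
messy = io.StringIO("name , city\n Ann,Oslo \n", newline="")
clean = io.StringIO(newline="")
copy_stripped(messy, clean)
print(clean.getvalue())
```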

To copy dictionary rows across, you need to specify the fields. Use a csv.DictReader() object to produce the dictionaries; it will take the fieldnames from the first CSV row for you:

with open(inputfilename, 'rb') as ifh, open(outputfilename, 'wb') as ofh:
    reader = csv.DictReader(ifh)
    writer = csv.DictWriter(ofh, fieldnames=reader.fieldnames)

    writer.writeheader()  # DictWriter does not write the header row by itself
    writer.writerows(reader)

Here, the fieldnames parameter is taken straight from the csv.DictReader().fieldnames attribute. Again, process rows in a loop as required, same as with a regular csv.reader() / csv.writer() pair.
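
If you do want to keep a parse() step that returns dictionaries, as in the question, one possible arrangement is to return the field names together with the rows and hand both to a separate save step. The sketch below assumes Python 3 and takes open file objects rather than file names; the save() function, the changed parse() signature, and the sample data are assumptions made for this example, not part of the original code.

```python
import csv
import io


def parse(src, delimiter=","):
    """Return (fieldnames, rows); each row is a dict keyed by field name."""
    reader = csv.DictReader(src, delimiter=delimiter)
    rows = list(reader)
    # fieldnames is populated from the first row once reading has started.
    return reader.fieldnames, rows


def save(dst, fieldnames, rows, delimiter=","):
    """Write the header row followed by the dict rows in fieldnames order."""
    writer = csv.DictWriter(dst, fieldnames=fieldnames, delimiter=delimiter)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data standing in for the real input file.
source = io.StringIO("id,note\r\n1,first\r\n2,second\r\n", newline="")
target = io.StringIO(newline="")
fields, data = parse(source)
save(target, fields, data)
print(target.getvalue())
```

Passing the field names along explicitly keeps the output columns in the same order as the input, which a plain dictionary per row does not guarantee on older Python versions.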

1 Comment

Thank you for such a detailed answer. I am gonna implement this and come back later.
