How to sort a file alphabetically by named column, python, csv

Question

I have three CSV files, each with three named columns: 'Genus', 'Species', and 'Source'. I merged the files into a new document, and now I need to alphabetize the rows, first by genus and then by species. I figured I could do this by first alphabetizing by species and then by genus, which should leave them in the proper order, but I haven't been able to find anything online that addresses how to sort named columns of strings. I tried lots of different ways of sorting, but each one either didn't change anything or replaced all the strings in the first column with the last string.

Here's my code for merging the files:

import csv, sys

with open('Footit_aphid_list_mod.csv', 'r') as inny:
    reader = csv.DictReader(inny)

    with open('Favret_aphid_list_mod.csv', 'r') as inny:
        reader1 = csv.DictReader(inny)

        with open ('output_al_vonDohlen.csv', 'r') as inny:
            reader2 = csv.DictReader(inny)

            with open('aphid_list_complete.csv', 'w') as outty:
                fieldnames = ['Genus', 'Species', 'Source']
                writer = csv.DictWriter(outty, fieldnames = fieldnames)
                writer.writeheader() 

                for record in reader:
                    writer.writerow(record)
                for record in reader1:
                    writer.writerow(record)
                for record in reader2:
                    writer.writerow(record)

                for record in reader:
                    g = record['Genus']
                    g = sorted(g)
                    writer.writerow(record)

inny.closed
outty.closed

First store all the data in a list of rows, then sort, then write back to file.
– Jean-François Fabre ♦, Commented Nov 21, 2017 at 21:35
You may find this page useful: stackoverflow.com/questions/4233476/…
– pault, Commented Nov 21, 2017 at 21:39

Mark Tolonen · Accepted Answer · 2017-11-22 14:55:35Z

2

If your files aren't insanely large, read all the rows into a single list, sort it, then write it back:

#!python2
import csv

rows = []

with open('Footit_aphid_list_mod.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

with open('Favret_aphid_list_mod.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

with open('output_al_vonDohlen.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

rows.sort(key=lambda d: (d['Genus'],d['Species']))

with open('aphid_list_complete.csv','wb') as outty:
    fieldnames = ['Genus','Species','Source']
    writer = csv.DictWriter(outty,fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
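
The sort key above returns a (Genus, Species) tuple. Python compares tuples element by element, so rows are ordered by genus first and ties are broken by species. As a minimal sketch with made-up rows (the values below are illustrative, not taken from the actual files), this also shows that the two-pass idea from the question, sorting by species and then by genus, gives the same order, because Python's sort is stable:

```python
# Illustrative rows only; the real data comes from the CSV files.
rows = [
    {'Genus': 'Myzus', 'Species': 'persicae', 'Source': 'a'},
    {'Genus': 'Aphis', 'Species': 'gossypii', 'Source': 'b'},
    {'Genus': 'Myzus', 'Species': 'cerasi', 'Source': 'c'},
    {'Genus': 'Aphis', 'Species': 'fabae', 'Source': 'a'},
]

# One pass: tuples compare element by element, so Genus is the
# primary key and Species breaks ties.
one_pass = sorted(rows, key=lambda d: (d['Genus'], d['Species']))

# Two passes: sort by the secondary key first, then by the primary key.
# The sort is stable, so the Species order survives within each Genus.
two_pass = sorted(rows, key=lambda d: d['Species'])
two_pass = sorted(two_pass, key=lambda d: d['Genus'])

names = [(d['Genus'], d['Species']) for d in one_pass]
print(names)
print(one_pass == two_pass)
```

By contrast, calling sorted() on a single field, as in sorted(record['Genus']), only returns a list of that string's characters in order. It does not reorder the rows.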

edited Nov 22, 2017 at 14:55

answered Nov 22, 2017 at 7:38


2 Comments

birdoptera Over a year ago

This worked! The only thing is that because I'm using 2.7, I had to remove all the 'newline=' arguments from 'open', but everything was just fine without them.

Mark Tolonen Over a year ago

@birdoptera Updated. Note use of binary mode instead of newline='' for Python 2 per csv documentation.
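
For readers on Python 3, the files would instead be opened in text mode with newline='', as the comment above notes. The following is a sketch of the same merge-and-sort under that assumption. The helper name merge_sorted and the FIELDNAMES constant are made up for illustration, and the input files are assumed to contain exactly the three columns named in the question:

```python
import csv

FIELDNAMES = ['Genus', 'Species', 'Source']

def merge_sorted(in_paths, out_path):
    """Hypothetical helper: merge CSV files, sort by Genus then Species."""
    rows = []
    for path in in_paths:
        # Python 3: text mode with newline='' instead of binary mode.
        with open(path, 'r', newline='') as infile:
            rows.extend(csv.DictReader(infile))

    # Same key as in the answer: Genus first, Species as tie-breaker.
    rows.sort(key=lambda d: (d['Genus'], d['Species']))

    with open(out_path, 'w', newline='') as outfile:
        writer = csv.DictWriter(outfile, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return rows

# Example call with the file names from the question:
# merge_sorted(['Footit_aphid_list_mod.csv',
#               'Favret_aphid_list_mod.csv',
#               'output_al_vonDohlen.csv'],
#              'aphid_list_complete.csv')
```

Looping over a list of paths replaces the three near-identical with blocks, so adding a fourth input file only means adding one more entry to the list.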
