42

For example, my csv has columns as below:

ID, ID2, Date, Job No, Code

I need to write the columns back in the same order. The dict jumbles the order immediately, so I believe it's more of a problem with the reader.

6 Answers 6

66

Python's dicts do NOT maintain order prior to 3.6 (but, regardless, in that version the csv.DictReader class was modified to return OrderedDicts).

However, the instance of csv.DictReader that you're using (after you've read the first row!-) does have a .fieldnames list of strings, which IS in order.

So,

for rowdict in myReader:
  print ['%s:%s' % (f, rowdict[f]) for f in myReader.fieldnames]

will show you that the order is indeed maintained (in .fieldnames of course, NEVER in the dict -- that's intrinsically impossible in Python!-).

So, suppose you want to read a.csv and write b.csv with the same column order. Using plain reader and writer is too easy, so you want to use the Dict varieties instead;-). Well, one way is...:

import csv

a = open('a.csv', 'r')
b = open('b.csv', 'w')
ra = csv.DictReader(a)
wb = csv.DictWriter(b, None)

for d in ra:

  if wb.fieldnames is None:
    # initialize and write b's headers
    dh = dict((h, h) for h in ra.fieldnames)
    wb.fieldnames = ra.fieldnames
    wb.writerow(dh)

  wb.writerow(d)

b.close()
a.close()

assuming you have headers in a.csv (otherewise you can't use a DictReader on it) and want just the same headers in b.csv.

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks Alex Martelli! You've gone over and above the call of duty here :) and... it is appreciated! I didn't realise there was fieldnames, but re-reading the API docs I can see it now. Thanks for the alternative, but since my DictReader is working well now I'll stick with it.
@RaffiKhatchadourian a is a.csv, b is b.csv ra presumably stands for read from file a, wb presumably for write to file b, d for dictionary, dh for dictionary headers.... Alex Thanks for the awesome explanation!
any reason why you went through the whole dh = dict(... ...wb.writerow(dh) when you could have done wb.fieldnames = ra.fieldnames; wb.writeheader()?
Looks good, but how you can add quotes to all fileds in all lines except the header?
10

Make an OrderedDict from each row dict sorted by DictReader.fieldnames.

import csv
from collections import OrderedDict

reader = csv.DictReader(open("file.csv"))
for row in reader:
    sorted_row = OrderedDict(sorted(row.items(),
          key=lambda item: reader.fieldnames.index(item[0])))

3 Comments

would you be able to explain how the lambda function here is working? I see similar examples listed in the official docs, however its not clear to me how this is producing the correct order (e.g. what the actual output of the lambda function looks like)
The last line serializes dict row into a (key,value) tuple array. Then resorts the array by the correct key order and repackages it on an OrderedDict. The lambda takes a tuple item, retrives it's key item[0] and looks for its index on reader.fieldnames. That index is used by sorted() to sort the tuples.
This is my favourite solution, also arrived at something like this after reading the OrderedDict docs. It has one obvious inefficiency which is that a sort is done for each row again even though the order will be the same for each row. I think however that is as good as it gets for Python 3.5, the price some of us who are forced to live in the past must pay. Probably only notice it if your csv was millions of rows long.
9
from csv import DictReader, DictWriter

with open("input.csv", 'r') as input_file:
    reader = DictReader(f=input_file)
    with open("output.csv", 'w') as output_file:
        writer = DictWriter(f=output_file, fieldnames=reader.fieldnames)
        for row in reader:
            writer.writerow(row)

Comments

2

I know this question is old...but if you use DictReader, you can pass it an ordered list with the fieldnames to the fieldnames param

Comments

2

Edit: as of python 3.6 dicts are ordered by insertion order, essentially making all dicts in python OrderedDicts by default. That being said the docs say dont rely on this behaviour because it may change. I will challenge that, lets see if it ever changes back :)


Unfortunatley the default DictReader does not allow for overriding the dict class, a custom DictReader would do the trick though

import csv

class DictReader(csv.DictReader):
    def __init__(self, *args, **kwargs):
        self.dict_class = kwargs.pop(dict_class, dict)
        super(DictReader, self).__init__(*args, **kwargs)

    def __next__(self):
        ''' copied from python source '''
        if self.line_num == 0:
            # Used only for its side effect.
            self.fieldnames
        row = next(self.reader)
        self.line_num = self.reader.line_num

        # unlike the basic reader, we prefer not to return blanks,
        # because we will typically wind up with a dict full of None
        # values
        while row == []:
            row = next(self.reader)
        # using the customized dict_class
        d = self.dict_class(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)
        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval
        return d

use it like so

import collections

csv_reader = DictReader(f, dict_class=collections.OrderedDict)
# ...

3 Comments

Note that from Python 3.6 onwards, the rows returned from csv.DictReaders are now OrderedDicts, not regular ones as they were previously (so doing something like this is no longer necessary). The good news, I guess, is that using this approach won't hurt, it'll only be redundant.
@martineau in Python 3.8, the rows returned from csv.DictReaders are changed to dict.
@lucemia: I know, that because dicts now maintain order, but still what I said before applies (i.e. that while using the approach in this answer will produce the correct results, it's superfluous).
0

I wrote a little tool to sort the order of CSV columns: I don't claim that it's great I know little of Python, but it does the job:

import csv
import sys

with open(sys.argv[1], 'r') as infile:
    csvReader = csv.DictReader(infile)
    sorted_fieldnames = sorted(csvReader.fieldnames)
    writer = csv.DictWriter(sys.stdout, fieldnames=sorted_fieldnames)
    # reorder the header first
    writer.writeheader()
    for row in csvReader:
        # writes the reordered rows to the new file
        writer.writerow(row)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.