Using CSV module to append multiple files while removing appended headers

Question

I would like to use the Python csv module to open a CSV file for appending. Then, from a list of CSV files, I would like to read each file and write its rows to the output file. My script works, except that I cannot find a way to remove the headers from all but the first CSV file being read. I am fairly sure that my else block is not executing properly. Perhaps the syntax of my if/else code is the problem? Any thoughts would be appreciated.

writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
    for files in lstFiles:
        readFile = open(input_file,'rU')
        reader = csv.reader(readFile,dialect='excel')
        for i in range(0,len(lstFiles)):
            if i == 0:
                oldHeader = readFile.readline() 
                newHeader = writeFile.write(oldHeader) 
                for row in reader: 
                    writer.writerow(row)
            else:
                reader.next()
                for row in reader:
                    row = readFile.readlines()
                    writer.writerow(row)
        readFile.close()
writeFile.close()

Peter DeGlopper · Accepted Answer · 2013-05-15 17:56:37Z

1

You're effectively iterating over lstFiles twice. For each file in your list, you're running your inner for loop up from 0, so the i == 0 branch runs again for every file and each file's header is written. You want something like:

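import csv

# Python 2 file modes: binary append for the csv writer, universal newlines for reading.
writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
headers_needed = True
for input_file in lstFiles:
    readFile = open(input_file,'rU')
    reader = csv.reader(readFile,dialect='excel')
    # Always consume the header row so it never reaches the data loop.
    oldHeader = next(reader)
    if headers_needed:
        # Only the first file's header is written to the output.
        writer.writerow(oldHeader)
        headers_needed = False
    for row in reader:
        writer.writerow(row)
    readFile.close()
writeFile.close()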

You could also use enumerate over the lstFiles to iterate over tuples containing the iteration count and the filename, but I think the boolean shows the logic more clearly.
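For reference, the enumerate variant might look like the sketch below. It is written for Python 3 rather than the Python 2 used in this thread, so the files are opened in text mode with newline='' as the csv documentation recommends (the 'U' mode flag was removed in Python 3.11). The function name append_csv_files is made up for illustration, and the sketch assumes every input file has the same header row.

```python
import csv

def append_csv_files(append_file, lstFiles):
    """Append the rows of each file in lstFiles to append_file.

    Hypothetical helper: only the header of the first input file is written.
    """
    # newline='' lets the csv module control line endings itself.
    with open(append_file, 'a', newline='') as writeFile:
        writer = csv.writer(writeFile, dialect='excel')
        for index, input_file in enumerate(lstFiles):
            with open(input_file, newline='') as readFile:
                reader = csv.reader(readFile, dialect='excel')
                # Consume the header row; None means the file was empty.
                header = next(reader, None)
                if index == 0 and header is not None:
                    writer.writerow(header)
                # Copy the remaining rows unchanged.
                writer.writerows(reader)
```

One drawback of keying on index == 0: if the first file happens to be empty, no header is written at all, whereas the boolean flag can simply stay set until a header is actually seen.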

You probably do not want to mix iterating over the csv reader and directly calling readline on the underlying file.
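One concrete way the two can disagree (my illustration, not something stated in the question) is a header containing a quoted field with an embedded newline. The sketch below is Python 3 and uses an in-memory file with made-up sample data: readline returns only part of the header, while the csv reader returns the whole row.

```python
import csv
import io

# Made-up sample: the second header field contains a newline inside quotes.
data = 'id,"long\nname"\n1,x\n'
f = io.StringIO(data, newline='')

# readline stops at the first physical line break, splitting the header.
raw_first_line = f.readline()

# The csv reader understands the quoting and returns the full header row.
f.seek(0)
reader = csv.reader(f)
header = next(reader)
first_row = next(reader)

print(repr(raw_first_line))
print(header)
print(first_row)
```

Taking the header from the reader with next(reader) keeps the header and the data rows going through the same parser.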

edited May 15, 2013 at 17:56

answered May 15, 2013 at 17:38

Peter DeGlopper

1 Comment

user2386858 Over a year ago

Thanks, Peter. I had a problem with the line assigning the oldHeader. I think maybe it works better with oldHeader = readFile.readline(). I used a combination of both your and Henry's answers. - Britta

Henry Keiter · Accepted Answer · 2013-05-15 19:43:36Z

0

I think you're iterating too many times (over various things: both your list of files and the files themselves). You've definitely got some consistency problems; it's a little hard to be sure since we can't see your variable initializations. This is what I think you want:

with open(append_file,'a+b') as writeFile:
    need_headers = True
    for input_file in lstFiles:
        with open(input_file,'rU') as readFile:
            headers = readFile.readline()
            if need_headers:
                # Write the headers only if we need them
                writeFile.write(headers)
                need_headers = False
            # Now write the rest of the input file.
            for line in readFile:
                writeFile.write(line)

I took out all the csv-specific stuff since there's no reason to use it for this operation. I also cleaned the code up considerably to make it easier to follow, using the files as context managers and a well-named boolean instead of the "magic" i == 0 check. The result is a much nicer block of code that (hopefully) won't have you jumping through hoops to understand what's going on.
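A rough Python 3 sketch of the same line-based idea follows, since 'a+b' yields a bytes file in Python 3 and the 'U' flag was removed in Python 3.11. The function name concat_files is hypothetical. It also adds one guard that is not in the code above: if an input file does not end with a line break, a newline is written so the next file's first row does not get glued onto the previous one.

```python
def concat_files(append_file, lstFiles):
    """Append each file in lstFiles to append_file, keeping one header line.

    Hypothetical helper; assumes each header fits on a single physical line.
    """
    # newline='' leaves line endings exactly as they are in the inputs.
    with open(append_file, 'a', newline='') as writeFile:
        need_headers = True
        for input_file in lstFiles:
            with open(input_file, newline='') as readFile:
                last_written = ''
                headers = readFile.readline()
                if need_headers and headers:
                    # Write the header only the first time we see one.
                    writeFile.write(headers)
                    last_written = headers
                    need_headers = False
                # Now write the rest of the input file.
                for line in readFile:
                    writeFile.write(line)
                    last_written = line
                # Guard (my addition): terminate an unterminated final line.
                if last_written and not last_written.endswith(('\n', '\r')):
                    writeFile.write('\n')
```

Like the original, this treats the first physical line as the header, so it is only appropriate when headers never contain quoted line breaks.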

edited May 15, 2013 at 19:43

answered May 15, 2013 at 17:43

Henry Keiter


2 Comments

user2386858 Over a year ago

Thanks Henry. I'm sticking with CSV module, but this worked great! You people really are amazing.

Henry Keiter Over a year ago

@user2386858 Glad I could help. The csv module is great, but there's no sense using it when you don't actually need to see the data (e.g. concatenating a bunch of files).
