I would like to use the Python csv module to open a CSV file for appending. Then, for each file in a list of CSV files, I would like to read it and write its rows to that output file. My script works well, except that I cannot find a way to remove the headers from all but the first CSV file being read. I am fairly sure that my else block is not executing properly. Perhaps the syntax of my if/else is the problem? Any thoughts would be appreciated.

writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
for files in lstFiles:
    readFile = open(input_file,'rU')
    reader = csv.reader(readFile,dialect='excel')
    for i in range(0,len(lstFiles)):
        if i == 0:
            oldHeader = readFile.readline()
            newHeader = writeFile.write(oldHeader)
            for row in reader:
                writer.writerow(row)
        else:
            reader.next()
            for row in reader:
                row = readFile.readlines()
                writer.writerow(row)
    readFile.close()
writeFile.close() 

2 Answers

You're effectively iterating over lstFiles twice. For each file in your list, the inner for loop starts again from 0, so the i == 0 branch (the one that writes the header) runs for every file, not just the first. You want something like:

import csv

# File modes are the Python 2 ones used in the question.
writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
headers_needed = True
for input_file in lstFiles:
    readFile = open(input_file,'rU')
    reader = csv.reader(readFile,dialect='excel')
    # Always consume the header row so it never reaches the loop below.
    oldHeader = next(reader)
    if headers_needed:
        # Only the first file's header is written out.
        writer.writerow(oldHeader)
        headers_needed = False
    for row in reader:
        writer.writerow(row)
    readFile.close()
writeFile.close()

You could also use enumerate over the lstFiles to iterate over tuples containing the iteration count and the filename, but I think the boolean shows the logic more clearly.
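For comparison, the enumerate variant might look roughly like this. It is only a sketch, and it assumes Python 3 (text mode with newline="" for csv files, and next(reader, None) so an empty input does not raise). The function name append_csv_files and its parameter names are made up for illustration; they are not from the question.

```python
import csv


def append_csv_files(input_paths, append_path):
    """Append the rows of each input CSV to append_path, keeping one header.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Python 3: csv files are opened in text mode with newline="" so the
    # csv module handles line endings itself.
    with open(append_path, "a", newline="") as write_file:
        writer = csv.writer(write_file, dialect="excel")
        for index, input_path in enumerate(input_paths):
            with open(input_path, newline="") as read_file:
                reader = csv.reader(read_file, dialect="excel")
                # Consume the header of every file; None means the file is empty.
                header = next(reader, None)
                # Only the first file in the list contributes its header.
                if index == 0 and header is not None:
                    writer.writerow(header)
                writer.writerows(reader)
```

One difference from the boolean version: if the first file in the list happens to be empty, no header is written at all, because the check is tied to the position in the list rather than to whether a header has been written yet.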

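You probably do not want to mix iterating over the csv reader and directly calling readline on the underlying file. The reader pulls lines through the file's own iterator, which in Python 2 uses a read-ahead buffer, so a direct readline or readlines call may not pick up where the reader left off.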

1 Comment

Thanks, Peter. I had a problem with the line assigning the oldHeader. I think maybe it works better with oldHeader = readFile.readline(). I used a combination of both your and Henry's answers. - Britta

I think you're iterating too many times (over various things: both your list of files and the files themselves). You've definitely got some consistency problems; it's a little hard to be sure since we can't see your variable initializations. This is what I think you want:

with open(append_file,'a+b') as writeFile:
    need_headers = True
    for input_file in lstFiles:
        with open(input_file,'rU') as readFile:
            headers = readFile.readline()
            if need_headers:
                # Write the headers only if we need them
                writeFile.write(headers)
                need_headers = False
            # Now write the rest of the input file.
            for line in readFile:
                writeFile.write(line)

I took out all the csv-specific stuff since there's no reason to use it for this operation. I also cleaned the code up considerably to make it easier to follow, using the files as context managers and a well-named boolean instead of the "magic" i == 0 check. The result is a much nicer block of code that (hopefully) won't have you jumping through hoops to understand what's going on.
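If you are on Python 3, the same line-based idea could be sketched as below. This is an assumption-laden sketch, not a drop-in replacement: it opens both files in binary mode so bytes are copied unchanged, and it adds one guard that is not in the code above, namely appending a newline when an input file's last line lacks one, so the next file's first row is not glued onto it. The function name concatenate_with_single_header is hypothetical.

```python
def concatenate_with_single_header(input_paths, append_path):
    """Append input files line by line, writing the header line only once.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with open(append_path, "ab") as write_file:
        need_headers = True
        for input_path in input_paths:
            with open(input_path, "rb") as read_file:
                last_written = b""
                # The first line of every file is treated as its header.
                headers = read_file.readline()
                if need_headers and headers:
                    write_file.write(headers)
                    last_written = headers
                    need_headers = False
                # Copy the remaining lines unchanged.
                for line in read_file:
                    write_file.write(line)
                    last_written = line
                # Added guard (not in the original answer): terminate an
                # unterminated final line with a bare "\n".
                if last_written and not last_written.endswith(b"\n"):
                    write_file.write(b"\n")
```

Like any line-based approach, this assumes the header occupies exactly one physical line. A header with a quoted field containing a line break would need the csv module after all.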

2 Comments

Thanks Henry. I'm sticking with CSV module, but this worked great! You people really are amazing.
@user2386858 Glad I could help. The csv module is great, but there's no sense using it when you don't actually need to see the data (e.g. concatenating a bunch of files).
