Combine csv in Python with skipping header row Error

Question

I've successfully combined all the CSV files in a directory, but I'm struggling to skip the first row (header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including not using [open(thefile).read()], but I still can't get it working. Here is my code:

 import glob
 files = glob.glob( '*.csv' )
 output="combined.csv"

 with open(output, 'w' ) as result:
     for thefile in files:
         f = [open(thefile).read()]
         next(f)   ## this line is causing the error 'list' object is not an iterator

         for line in f:
             result.write( line )
 message = 'file created'
 print (message)

You should close each file after reading it, either explicitly or by using 'with', as you did when opening the file you are writing to.
– Fred Mitchell, Commented Mar 13, 2015 at 2:11

Avinash Raj · Accepted Answer · 2015-03-13 02:16:26Z

1

Use readlines() instead of read(). It returns a list of lines, so you can skip the first one with a slice.

f = open(thefile)
m = f.readlines()
for line in m[1:]:
    # write the line as-is; rstrip() would remove the newline and run the rows together
    result.write(line)
f.close()

OR

with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:
        # write the line as-is; rstrip() would remove the newline and run the rows together
        result.write(line)

You don't need to close the file object explicitly if the file was opened with a with statement.
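For reference, here is a minimal end-to-end sketch of the readlines() approach. The function name combine_skip_headers, the sample file names, and the temporary directory are made up for illustration and are not part of the question; the output is given a non-.csv name here so the glob cannot pick it up as an input.

```python
import glob
import os
import tempfile


def combine_skip_headers(paths, output_path):
    """Write every line except the first from each file in paths to output_path."""
    with open(output_path, "w") as result:
        for path in paths:
            with open(path) as f:
                lines = f.readlines()
            # lines[1:] drops the header; each remaining line keeps its trailing "\n"
            result.writelines(lines[1:])


# Small demonstration in a temporary directory (hypothetical sample files)
workdir = tempfile.mkdtemp()
for name, rows in [("a.csv", "id,name\n1,x\n2,y\n"), ("b.csv", "id,name\n3,z\n")]:
    with open(os.path.join(workdir, name), "w") as f:
        f.write(rows)

inputs = sorted(glob.glob(os.path.join(workdir, "*.csv")))
combined_path = os.path.join(workdir, "combined.out")
combine_skip_headers(inputs, combined_path)

with open(combined_path) as f:
    combined_text = f.read()
print(combined_text)
```

Because the lines are written unchanged, each row stays on its own line in the combined file.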

edited Mar 13, 2015 at 2:16

answered Mar 13, 2015 at 2:09

Avinash Raj


4 Comments

jKraut Over a year ago

@Avinash Raj it's telling me "invalid syntax" at m[1:]

Avinash Raj Over a year ago

did you put the colon after m[1:] ?

jKraut Over a year ago

When I use the exact example you gave it works, but it does not keep the same format as the original files, with each row of data on its own line.

Avinash Raj Over a year ago

try result.write(line)

mhawke · Accepted Answer · 2015-03-13 03:02:24Z

Here's an alternative using the oft-forgotten fileinput.input() function:

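import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
output = 'combined.csv'

# Build the input list before creating the output file, and leave the output
# out of it in case it already exists from an earlier run.
files = [name for name in glob(FILE_PATTERN) if name != output]

with open(output, 'w') as result:
    for line in fileinput.input(files):
        # isfirstline() is true for the first line of each input file
        if not fileinput.isfirstline():
            result.write(line)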

fileinput chains all the files into a single stream of lines, so there is no nested loop over the files and no input file handles to manage by hand.

Note that the code in your question was not far off working. You just need to change

f = [open(thefile).read()]

to

f = open(thefile)

but I suggest that using with would be better still because it will automatically close the input files:

with open(output, 'w' ) as result:
    for thefile in files:
        with open(thefile) as f:
            next(f)
            for line in f:
                result.write( line )
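
One caveat with next(f): it raises StopIteration if a file is completely empty. The sketch below guards against that by passing a default to next(). The helper name append_without_header and the sample files are hypothetical, chosen only to make the example runnable.

```python
import os
import tempfile


def append_without_header(source_path, result):
    """Copy source_path into the open file `result`, skipping its first line."""
    with open(source_path) as f:
        # The default None keeps next() from raising StopIteration on an empty file
        next(f, None)
        for line in f:
            result.write(line)


# Hypothetical sample files, one of them empty
workdir = tempfile.mkdtemp()
samples = {"a.csv": "h1,h2\n1,2\n", "empty.csv": "", "b.csv": "h1,h2\n3,4\n"}
for name, text in samples.items():
    with open(os.path.join(workdir, name), "w") as f:
        f.write(text)

out_path = os.path.join(workdir, "combined.out")
with open(out_path, "w") as result:
    for name in ["a.csv", "empty.csv", "b.csv"]:
        append_without_header(os.path.join(workdir, name), result)

with open(out_path) as f:
    merged = f.read()
print(merged)
```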

Fred Mitchell · Accepted Answer · 2015-03-13 02:16:52Z

0

>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

I am not sure why you wrapped the read() call in brackets, but the example above shows what is happening: next() needs an iterator, and a list is iterable without being an iterator.

There is already a good answer. This is just an example of how you might look at the problem. Also, I would recommend getting what you want to work with just a single file. After that is working, import glob and work on using your mini-solution in the bigger problem.
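
To make the distinction concrete, here is a small sketch with made-up data. It shows that iter() turns a list into an iterator that next() accepts, and why the bracketed read() would still not help: it produces a one-element list containing the whole file as a single string.

```python
rows = ["header", "row 1", "row 2"]

# A list is iterable, but next() needs an iterator, which iter() provides
it = iter(rows)
skipped = next(it)      # consumes "header"
remaining = list(it)    # the items left after the first one

# [open(thefile).read()] builds a one-element list holding the entire file text,
# so skipping one item leaves nothing to write
whole_file = ["header\nrow 1\nrow 2\n"]
it2 = iter(whole_file)
next(it2)
leftover = list(it2)

print(skipped, remaining, leftover)
```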

answered Mar 13, 2015 at 2:16

Fred Mitchell

