0

I've successfully combined all the CSV files in a directory, but I'm struggling to skip the first row (the header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including not using [open(thefile).read()], but I still can't get it working. Here is my code:

 import glob
 files = glob.glob( '*.csv' )
 output="combined.csv"

 with open(output, 'w' ) as result:
     for thefile in files:
         f = [open(thefile).read()]
         next(f)   ## this line is causing the error 'list' object is not an iterator

         for line in f:
             result.write( line )
 message = 'file created'
 print (message)  
2
  • You should close each file after reading it, either explicitly, or using 'with' as you did opening the file to which you are writing. Commented Mar 13, 2015 at 2:11
  • You might find this answer helpful. Commented Mar 13, 2015 at 2:16

3 Answers

1

Use the readlines() method instead of read(), so that you can easily skip the first line.

f = open(thefile)
m = f.readlines()
for line in m[1:]:
    result.write(line.rstrip())
f.close()

OR

with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:
        result.write(line.rstrip())

You don't need to close the file object explicitly if the file was opened with a with statement.
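As a minimal sketch of the same readlines() idea, the snippet below wraps it in a helper and writes each line unchanged. readlines() keeps the trailing newline on every element, so calling rstrip() before writing would run all rows together on one line. The helper name append_without_header and the sample data are made up for illustration.

```python
import os
import tempfile


def append_without_header(src_path, result):
    """Append every line of src_path except the first to the open file `result`."""
    with open(src_path) as f:
        lines = f.readlines()   # each element keeps its trailing newline
    for line in lines[1:]:      # the slice drops the header row
        result.write(line)      # write unchanged so rows stay on separate lines


# Small demonstration with a temporary file (hypothetical data).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "a.csv")
with open(src, "w") as f:
    f.write("id,name\n1,x\n2,y\n")

out = os.path.join(tmpdir, "out.txt")
with open(out, "w") as result:
    append_without_header(src, result)

with open(out) as f:
    combined_text = f.read()
```

Note that readlines() loads the whole file into memory, which is fine for small CSV files but may matter for very large ones.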

4 Comments

@Avinash Raj it's telling me "invalid syntax" at m[1:]
Did you put the colon after m[1:]?
When I use the exact example you gave, it works, but it does not keep the same format as the original files, with each row of data on its own line.
Try result.write(line) instead. rstrip() removes the trailing newline from each line, so the rows end up joined together.
1

Here's an alternative using the often-forgotten fileinput.input() function:

import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
output = 'combined.csv'

with open(output, 'w') as result:
    for line in fileinput.input(glob(FILE_PATTERN)):
        if not fileinput.isfirstline():
            result.write(line)

It's quite a bit cleaner than many other solutions.

Note that the code in your question was not far from working. You just need to change

f = [open(thefile).read()]

to

f = open(thefile)

but I suggest that using with would be better still because it will automatically close the input files:

with open(output, 'w' ) as result:
    for thefile in files:
        with open(thefile) as f:
            next(f)
            for line in f:
                result.write( line )
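One possible extension of the loop above, offered as a sketch rather than as part of the original answer: keep the header from the first file only, and leave the output file out of the input list. The second point matters because 'combined.csv' itself matches '*.csv' once it exists in the same directory. The function name combine_csv and the keep_header parameter are hypothetical.

```python
import glob
import os


def combine_csv(pattern, output, keep_header=True):
    """Concatenate the files matching `pattern` into `output`, skipping each header.

    If keep_header is true, the header of the first input file is written once.
    Returns the list of input files that were combined.
    """
    output_abs = os.path.abspath(output)
    # Build the input list before opening the output, and exclude the output itself.
    files = sorted(
        p for p in glob.glob(pattern) if os.path.abspath(p) != output_abs
    )
    with open(output, 'w') as result:
        for index, thefile in enumerate(files):
            with open(thefile) as f:
                header = next(f, None)  # None when the file is empty
                if header is not None and keep_header and index == 0:
                    result.write(header)
                for line in f:
                    result.write(line)
    return files
```

Passing a default to next() avoids a StopIteration on empty files. This sketch assumes every input file ends with a newline; a file that does not would have its last row joined to the first row of the next file.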

0
>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list object is not an iterator

I am not sure why you chose to bracket the read, but you should recognize what is happening from the example above.
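To make the distinction concrete, here is a small sketch. A list is iterable but is not itself an iterator, so next() rejects it until it has been wrapped with iter(). A file object is its own iterator, which is why next(f) works on it directly. io.StringIO stands in for an open file here so that the example runs without touching the disk; the variable names are illustrative only.

```python
import io

rows = ["header\n", "row 1\n", "row 2\n"]

# next() on a plain list raises TypeError, as in the question.
try:
    next(rows)
except TypeError as exc:
    error_message = str(exc)

# Wrapping the list in iter() gives an iterator that next() accepts.
it = iter(rows)
skipped = next(it)        # consumes the first element
remaining = list(it)      # everything after the first element

# A file-like object is already an iterator over its lines.
f = io.StringIO("header\nrow 1\nrow 2\n")
is_own_iterator = iter(f) is f
first_line = next(f)      # skips the header line
rest = list(f)            # the remaining lines, newlines intact
```

In the question's code, [open(thefile).read()] is a list holding one string with the entire file contents, so even iter() on it would yield the whole file as a single item rather than individual lines.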

There is already a good answer. This is just an example of how you might look at the problem. Also, I would recommend getting what you want to work with just a single file. After that is working, import glob and work on using your mini-solution in the bigger problem.
