Python: Import Multiple Files into a Single .csv File

Question

I have 125 data files containing two columns and 21 rows of data and I'd like to import them into a single .csv file (as 125 pairs of columns and only 21 rows). This is what my data files look like:

[Image: sample of a data file]

I am fairly new to Python, but I have come up with the following code:

import glob
Results = glob.glob('./*.data')
fout='c:/Results/res.csv'
fout=open ("res.csv", 'w')
for file in Results:
    g = open( file, "r" )
    fout.write(g.read())
    g.close()
fout.close()

The problem with the above code is that all the data are copied into only two columns with 125*21 rows.

Any help is very much appreciated!

There is a Python Paste, but that's not what I'm talking about.
– Ignacio Vazquez-Abrams, Commented Apr 23, 2012 at 1:25
Is there a way to do it in Python rather than with the Python Paste? I am fairly new to Python, let alone the Python Paste.
– Esan, Commented Apr 23, 2012 at 1:28

SudoNhim · Accepted Answer · 2012-12-20 06:57:10Z

1

This should work:

import glob

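files = [open(f) for f in glob.glob('./*.data')]  # make a list of open files

with open("res.csv", 'w') as fout:
    for row in range(21):
        # strip() removes the trailing newline; join() puts a comma
        # between files without leaving one at the end of the row
        fout.write(','.join(f.readline().strip() for f in files))
        fout.write('\n')

# close the input files once all rows have been written
for f in files:
    f.close()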

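Note that this method keeps every input file open at the same time, so it will probably fail if you try a large number of files. The limit on open files comes from the operating system rather than from Python itself; I believe the default on some systems is as low as 256.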
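One way around the open-file limit, offered as an untested-against-your-data sketch, is to read the files one at a time and keep only their lines in memory. The function name merge_side_by_side and its n_rows parameter are hypothetical, chosen just for this example.

```python
def merge_side_by_side(paths, out_path, n_rows=21):
    """Write line i of every input file, comma-joined, as row i of out_path."""
    columns = []
    for path in paths:
        # Only one input file is open at any moment.
        with open(path) as f:
            lines = [line.strip() for line in f]
        # Pad or trim to n_rows so a short file does not shift later rows.
        lines = (lines + [''] * n_rows)[:n_rows]
        columns.append(lines)

    with open(out_path, 'w') as fout:
        # zip(*columns) yields, for each row index, that line from every file.
        for row in zip(*columns):
            fout.write(','.join(row) + '\n')
```

It could be called as merge_side_by_side(sorted(glob.glob('./*.data')), 'res.csv'); the sorted() call is only there to make the column order reproducible, since glob returns files in arbitrary order.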

edited Dec 20, 2012 at 6:57

answered Apr 23, 2012 at 1:27



3 Comments

SudoNhim Over a year ago

Sorry, I forgot to include the comma between concatenated lines. It should hopefully be good now.

Esan Over a year ago

Thank you for the code, but there is a slight problem with the formatting: there are only 125 columns (i.e. each pair of columns is joined together when the file is opened in Excel).

SudoNhim Over a year ago

Sorry, I fixed that error about 1 minute after I posted it. Try re copy-pasting it if you haven't fixed it already :)

davesnitty · Accepted Answer · 2012-04-23 01:38:38Z

1

You may want to try the Python csv module (http://docs.python.org/library/csv.html), which provides very useful methods for reading and writing CSV files. Since you stated that you want only 21 rows with 250 columns of data, I would suggest creating 21 Python lists as your rows and then appending data to each row as you loop through your files.

Something like:

import csv

rows = []
for i in range(21):
    rows.append([])  # one empty list per output row

# Not sure about the structure of your input files or how they are
# delimited, but for each one, as you have it open and iterate through
# its rows, you would want to append the values in each row to the end
# of the corresponding list contained within the rows list.

# Then write each row to the new csv. In Python 3 the file is opened in
# text mode with newline=''; Python 2 would use 'wb' instead.
with open('output.csv', 'w', newline='') as fout:
    writer = csv.writer(fout, delimiter=',')
    for row in rows:
        writer.writerow(row)
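The reading step that the comment above leaves open might look like the following sketch. It assumes the data files are whitespace-delimited, which the question does not confirm, and the function name append_columns is hypothetical.

```python
def append_columns(rows, path):
    """Append the fields of line i in `path` to rows[i]."""
    with open(path) as f:
        for i, line in enumerate(f):
            if i >= len(rows):
                break  # ignore lines beyond the expected row count
            # str.split() with no argument splits on any run of whitespace
            # (assumed delimiter; use line.split(',') for comma-separated input).
            rows[i].extend(line.split())
```

Calling append_columns(rows, fname) once for each file returned by glob.glob('./*.data') would fill rows before the writing step, giving each value its own column in the output.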

answered Apr 23, 2012 at 1:38


1 Comment

Esan Over a year ago

Thank you for this. Please see the picture I have now included in the question.

pepr · Accepted Answer · 2012-04-23 14:34:28Z

(Sorry, I cannot add comments, yet.)

[Edited later: the following statement is wrong!] "davesnitty's loop that generates the rows can be replaced by rows = [[]] * 21." It is wrong because the result only looks like a list of 21 empty lists. Every element of the outer list would refer to one and the same empty list, so appending to any row would change all of them.
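The shared-list problem can be seen in a few lines. This is only a minimal illustration; the variable names are made up for the example.

```python
# [[]] * 3 repeats a reference to ONE inner list three times.
shared = [[]] * 3
shared[0].append('x')
print(shared)       # [['x'], ['x'], ['x']]

# A list comprehension creates a new inner list on each iteration.
independent = [[] for _ in range(3)]
independent[0].append('x')
print(independent)  # [['x'], [], []]
```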

My +1 to using the standard csv module. But the file should always be closed, especially when you open that many of them. Also, there is a bug. The row read from the file via the -- even though you only write the result here. The solution is actually missing. Basically, the row read from the file should be appended to the sublist that corresponds to its line number. The line number can be obtained via enumerate(reader), where reader is csv.reader(fin, ...).

[Added later] Try the following code, and fix the paths for your purpose:

import csv
import glob
import os

datapath = './data'
resultpath = './result'
if not os.path.isdir(resultpath):
    os.makedirs(resultpath)

# Start with no rows. They are added on demand, so the code does not
# check (or need to know) how many rows are in the files.
rows = []

# Read data from the files to the above matrix.
for fname in glob.glob(os.path.join(datapath, '*.data')):
    # Python 3: text mode with newline='' (Python 2 used 'rb' here).
    with open(fname, newline='') as f:
        reader = csv.reader(f)
        for n, row in enumerate(reader):
            if len(rows) < n + 1:
                rows.append([])  # add another row
            rows[n].extend(row)  # append the elements from the file

# Write the data from memory to the result file.
fname = os.path.join(resultpath, 'result.csv')
# Python 3: text mode with newline='' (Python 2 used 'wb' here).
with open(fname, 'w', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)
