
An answer to this question may already exist, but I could not find a proper solution, so I am asking here. Suppose I have multiple CSV files (around 1500), each with a single column of time series data (10,000 rows). The column header is the same in all CSV files. Suppose the CSV files look like this:

aa1.csv      aa2.csv      aa3.csv    ...  aa1500.csv
datavalue    datavalue    datavalue       datavalue
4            1            1               2
2            3            6               4
3            3            3               8
4            4            8               9


I want the output like this:


datavalue,datavalue,datavalue,...,datavalue
4,1,1,...,2
2,3,6,...,4
3,3,3,...,8
4,4,8,...,9

My code does not work; it produces something else:

import pandas as pd
import csv
import glob
import os
path = 'F:/Work/'
files_in_dir = [f for f in os.listdir(path) if f.endswith('csv')]
for filenames in files_in_dir:
    df = pd.read_csv(filenames)
    df.to_csv('out.csv', mode='a')

Can someone help with this?

  • Does every CSV file contain the same number of rows? Commented Jun 20, 2018 at 6:00
  • Yes, every CSV file has the same number of rows. Commented Jun 20, 2018 at 6:03

3 Answers

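You can try it the following way, with a little help from NumPy:

import os
import numpy as np

path = 'F:/Work/'
# Collect the CSV file names (os.listdir returns them in arbitrary order)
files_in_dir = [f for f in os.listdir(path) if f.endswith('.csv')]
temp_data = []
for filename in files_in_dir:
    # Read every line (header included) as strings; join with path so the file is found
    temp_data.append(np.loadtxt(os.path.join(path, filename), dtype='str'))

# Each file is one row at this point, so transpose to get one column per file
temp_data = np.array(temp_data)
np.savetxt('out.csv', temp_data.transpose(), fmt='%s', delimiter=',')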
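
Because os.listdir returns names in arbitrary order, and a plain lexicographic sort places aa10.csv before aa2.csv, the columns may not come out in file-number order. The following is a minimal sketch of a natural sort key, assuming the aa<N>.csv naming from the question; `natural_key` is a hypothetical helper name, not part of any library.

```python
import re

def natural_key(filename):
    """Split a name into text and integer chunks so 'aa2.csv' sorts before 'aa10.csv'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r'(\d+)', filename)]

# Sample names only; in practice this would be the list built from os.listdir
names = ['aa10.csv', 'aa2.csv', 'aa1.csv', 'aa1500.csv']
ordered = sorted(names, key=natural_key)
print(ordered)
```

Passing the same key when building the file list, for example `sorted(files_in_dir, key=natural_key)`, should keep column N aligned with file N.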

2 Comments

It is dumping the data, but the appended values are not the same.
The values in the list are different from what is being dumped? That is not possible.

Use the pandas concat function with axis=1, which places the DataFrames side by side:

import pandas as pd
dfs = []
for filenum in range(1,1501):
    dfs.append( pd.read_csv('aa{}.csv'.format(filenum)) )
print(pd.concat(dfs,axis=1).to_csv(index=False))
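
To write the merged result to a file rather than printing it, something like the sketch below may work. It assumes every file has the same number of rows (as confirmed in the comments on the question); `merge_columns` is a hypothetical helper name, and the demo builds three tiny sample files in a temporary directory so that it runs anywhere.

```python
import os
import tempfile
import pandas as pd

def merge_columns(paths, out_path):
    """Read each single-column CSV and write them side by side to out_path."""
    frames = [pd.read_csv(p) for p in paths]
    merged = pd.concat(frames, axis=1)   # axis=1 adds columns instead of rows
    merged.to_csv(out_path, index=False)
    return merged

# Demo on three small sample files taken from the question's example values
with tempfile.TemporaryDirectory() as tmp:
    samples = {'aa1.csv': [4, 2, 3, 4],
               'aa2.csv': [1, 3, 3, 4],
               'aa3.csv': [1, 6, 3, 8]}
    paths = []
    for name, values in samples.items():
        p = os.path.join(tmp, name)
        pd.DataFrame({'datavalue': values}).to_csv(p, index=False)
        paths.append(p)
    out_path = os.path.join(tmp, 'out.csv')
    merged = merge_columns(paths, out_path)
    with open(out_path) as fh:
        out_text = fh.read()

print(out_text)
```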


One of the ways to achieve this is by creating another CSV file by merging data from existing CSV files (assuming you have the CSV files in the format aa##.csv)...

contents = []

# range(2) merges aa1.csv and aa2.csv; widen the range to cover all files
for filenum in range(2):
    with open('aa{}.csv'.format(filenum + 1), 'r') as f:
        lines = f.readlines()

    # On the first file, create one empty list per row
    if contents == []:
        contents = [[] for a in range(len(lines))]

    # Append this file's value for each row, without the trailing newline
    for row in range(len(lines)):
        contents[row].append(lines[row].rstrip('\n'))

# Write each row as comma-separated values
with open('aa_new.csv', 'w') as f:
    for row in range(len(contents)):
        f.write(','.join(contents[row]) + '\n')

You can then open & display this file as you wish using pandas.
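
As a rough sketch of that last step: when pandas reads a CSV whose header repeats the same name, it renames the duplicates by default (datavalue, datavalue.1, and so on). The example below reads a small in-memory sample instead of the real file, and the replacement labels aa1, aa2, aa3 are hypothetical names chosen for illustration.

```python
import io
import pandas as pd

# Small stand-in for the merged file, using the question's sample values
merged_text = "datavalue,datavalue,datavalue\n4,1,1\n2,3,6\n3,3,3\n4,4,8\n"

df = pd.read_csv(io.StringIO(merged_text))
mangled = df.columns.tolist()   # duplicate header names get numeric suffixes
print(mangled)

# Optionally relabel the columns after the files they came from (hypothetical labels)
df.columns = ['aa1', 'aa2', 'aa3']
print(df)
```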

5 Comments

After using line.strip().split I am getting another error: can only concatenate list (not "str") to list
Getting this: '22.283026666666643\n' '21.499415555555537\n' '20.142722222222197\n' '0.0\n' '13.923109213483146\n' '9.08471160493827\n' '12.864911460674154\n' '2.9649911054637865\n'
Where I have null values it gives "'0.0\n'", and where I have values it gives "'8.649526419753085\n'".
Thank you @Melvin
Anytime @VishalSingh
