How to merge multiple text files into one csv file in Python

Question

I am trying to convert 200 text files into CSV. The code below runs, but it does not produce any CSV file. Could anyone suggest an easy and fast way to do this? Many thanks.

dirpath = 'C:\Files\Code\Analysis\Input\qobs_RR1\\'
output = 'C:\Files\Code\Analysis\output\qobs_CSV.csv'
csvout = pd.DataFrame()
files = os.listdir(dirpath)

for filename in files:
    data = pd.read_csv(filename, sep=':', index_col=0, header=None)
    csvout = csvout.append(data)

csvout.to_csv(output)

Why don't you use Python's open function: open the file and write it out with a new extension? You can also use pathlib's with_suffix method to change the extension from txt to csv. I don't know why you are changing it, but try out my suggestions, as they offer more granularity and would even be faster, since it is just a change of suffix. As an aside, in your program it does not look like the code after the for loop is indented.
– sammywemmy, Commented Mar 30, 2020 at 8:21
Do you need a single csv file or 200 separate csv files? Your code doesn't match your question title.
– taras, Commented Mar 30, 2020 at 8:42

taras · Accepted Answer · 2020-03-31 07:44:45Z

2

The problem is that os.listdir returns only the names of the files inside dirpath, not their full paths, so pd.read_csv looks for them in the current working directory. You can build each full path by joining dirpath and the filename with the os.path.join function.

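import os
import pandas as pd

# Raw strings keep the Windows backslashes literal; os.path.join adds the separator.
dirpath = r'C:\Files\Code\Analysis\Input\qobs_RR1'
output = r'C:\Files\Code\Analysis\output\qobs_CSV.csv'
csvout_lst = []
# os.listdir returns bare names, so join each one with dirpath to get a full path.
files = [os.path.join(dirpath, fname) for fname in os.listdir(dirpath)]

# Read the files in sorted order so the rows in the output follow the filenames.
for filename in sorted(files):
    data = pd.read_csv(filename, sep=':', index_col=0, header=None)
    csvout_lst.append(data)

# Concatenate all frames once at the end and write a single CSV.
pd.concat(csvout_lst).to_csv(output)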

Edit: this can be done with a one-liner:

pd.concat(
    pd.read_csv(os.path.join(dirpath, fname), sep=':', index_col=0, header=None)
    for fname in sorted(os.listdir(dirpath))
).to_csv(output)

Edit 2: updated the answer, so the list of files is sorted alphabetically.
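
One caveat on the alphabetical sort: strings are compared character by character, so it matches numeric order only while all names have the same number of digits (105, 120, 144 are fine, but 99 would sort after 120). A minimal sketch of a numeric sort key follows, assuming the files are named with a number plus an extension such as 105.txt; the helper name numeric_sort_key and the sample names are made up for illustration.

```python
import os


def numeric_sort_key(fname):
    """Sort numeric file stems by integer value; other names go last, alphabetically."""
    stem = os.path.splitext(fname)[0]
    # Assumption: data files look like '105.txt'. Anything else sorts after them.
    if stem.isdecimal():
        return (0, int(stem), "")
    return (1, 0, stem)


names = ["120.txt", "99.txt", "105.txt", "1000.txt"]  # hypothetical file names

print(sorted(names))
# ['1000.txt', '105.txt', '120.txt', '99.txt']
print(sorted(names, key=numeric_sort_key))
# ['99.txt', '105.txt', '120.txt', '1000.txt']
```

In the code above this would be used as sorted(os.listdir(dirpath), key=numeric_sort_key), joining each name with dirpath afterwards.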

edited Mar 31, 2020 at 7:44

answered Mar 30, 2020 at 8:34


6 Comments

arnaud Over a year ago

I'd just add here that using glob is a killer way to navigate in your directories and retrieve files. Especially here looking only at csvs: import glob; filenames = glob.glob(dirpath+"*.csv")

taras Over a year ago

@Arnaud, agree - I didn't include it because from my observations people on Windows platform tend to avoid it in their code.

arnaud Over a year ago

Right, I was telling myself the same thing.

Sam Over a year ago

@taras Thanks, it does work, but csvout_lst ends up in a random order. I actually want the list of DataFrames in the same sequence as the files in the input folder, so that I can recognize each specific file. E.g. I have text files named 105, 120, 144, etc., and I want the same sequence in csvout_lst. Is there any way?

taras Over a year ago

@SumraMushtaq, sure - you just need to sort the list of files. See the updated answer
