I am trying to convert 200 text files into CSV files. I am using the code below. It runs, but it does not produce any CSV files. Could anyone suggest an easy and fast way to do this? Many thanks.

dirpath = 'C:\Files\Code\Analysis\Input\qobs_RR1\\'
output = 'C:\Files\Code\Analysis\output\qobs_CSV.csv'
csvout = pd.DataFrame()
files = os.listdir(dirpath)

for filename in files:
    data = pd.read_csv(filename, sep=':', index_col=0, header=None)
    csvout = csvout.append(data)

csvout.to_csv(output)
  • Why don't you use Python's open function: open each file and write it out with a new extension. You can also use pathlib's with_suffix method to change the extension from txt to csv. I don't know why you are changing it, but try out my suggestions, as they offer more granularity and would even be faster since it is just a change of suffix. Also, as an aside, it does not look like the code after the for loop in your program is indented. Commented Mar 30, 2020 at 8:21
  • Do you need a single CSV file or 200 separate CSV files? Your code doesn't match your question title. Commented Mar 30, 2020 at 8:42
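If the goal is one CSV file per input file, as the second comment asks, the open/with_suffix idea from the first comment can be combined with the standard csv module so that the ':' separator is also rewritten as a comma. The following is only a minimal sketch under assumptions: convert_each is a hypothetical helper, the demo file names and contents are made up, and it uses the csv module rather than pandas.

```python
import csv
import tempfile
from pathlib import Path

def convert_each(in_dir, out_dir, sep=":"):
    """Write one comma-separated .csv per file in in_dir; return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).iterdir()):
        if not src.is_file():
            continue  # skip subdirectories
        # Same base name, new extension (e.g. 105.txt -> 105.csv).
        dst = out_dir / src.with_suffix(".csv").name
        with src.open(newline="") as fin, dst.open("w", newline="") as fout:
            # Re-emit every sep-delimited row as a comma-delimited row.
            csv.writer(fout).writerows(csv.reader(fin, delimiter=sep))
        written.append(dst)
    return written

# Demo on throwaway files with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    in_dir = Path(tmp, "in")
    in_dir.mkdir()
    (in_dir / "105.txt").write_text("2001:1.5\n2002:2.5\n")
    (in_dir / "120.txt").write_text("2001:3.0\n")
    outputs = convert_each(in_dir, Path(tmp, "out"))
    names = [p.name for p in outputs]
    first = outputs[0].read_text()

print(names)
print(first)
```

Writing to a separate output directory keeps the converted files from being picked up as inputs on a second run.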

1 Answer

The problem is that os.listdir returns only the file names inside dirpath, not the full paths to those files, so pd.read_csv cannot find them unless the script runs from that directory. You can build the full paths by joining dirpath with each file name using os.path.join.

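import os
import pandas as pd

# Raw strings keep the backslashes in Windows paths from being read as escapes.
dirpath = r'C:\Files\Code\Analysis\Input\qobs_RR1'
output = r'C:\Files\Code\Analysis\output\qobs_CSV.csv'
csvout_lst = []
# os.listdir returns bare file names, so join each one with dirpath.
files = [os.path.join(dirpath, fname) for fname in os.listdir(dirpath)]

for filename in sorted(files):
    data = pd.read_csv(filename, sep=':', index_col=0, header=None)
    csvout_lst.append(data)

# Concatenate once at the end; DataFrame.append was removed in pandas 2.0.
pd.concat(csvout_lst).to_csv(output)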

Edit: this can be done with a one-liner:

pd.concat(
    pd.read_csv(os.path.join(dirpath, fname), sep=':', index_col=0, header=None)
    for fname in sorted(os.listdir(dirpath))
).to_csv(output)

Edit 2: updated the answer, so the list of files is sorted alphabetically.
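One caveat with sorted: it compares file names as strings, so a name such as 99 would sort after 105. If the files are named with numbers, as in the comments below, a numeric sort key keeps them in numeric order. This is only a sketch: numeric_sort_key is a hypothetical helper, and the .txt extension and the example names are assumptions.

```python
import os

def numeric_sort_key(fname):
    """Order names with an all-digit stem by their integer value; other names go last."""
    stem = os.path.splitext(os.path.basename(fname))[0]
    if stem.isdecimal():
        return (0, int(stem), "")
    return (1, 0, stem)

names = ["120.txt", "99.txt", "105.txt", "144.txt"]

lexicographic = sorted(names)                    # plain string comparison
numeric = sorted(names, key=numeric_sort_key)    # compares 99 < 105 < 120 < 144

print(lexicographic)
print(numeric)
```

The same key can be passed as sorted(os.listdir(dirpath), key=numeric_sort_key) in the loop or the one-liner above.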

6 Comments

I'd just add that glob is a great way to navigate your directories and retrieve files, especially here, where you are looking only at CSVs: import glob; filenames = glob.glob(os.path.join(dirpath, "*.csv"))
@Arnaud, agreed. I didn't include it because, from what I have seen, people on Windows tend to avoid it in their code.
Right, I was telling myself the same thing.
@taras Thanks, it does work, but the DataFrames in csvout_lst come out in random order. I want the list of DataFrames in the same sequence as the files in the Input folder so that I can tell which file each one came from. For example, I have text files named 105, 120, 144, etc., and I want the same sequence in csvout_lst. Is there any way to do that?
@SumraMushtaq, sure, you just need to sort the list of files. See the updated answer.
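For reference, the glob suggestion from the comments above could look roughly like this. It is a sketch under assumptions: list_files is a hypothetical helper, and the "*.txt" pattern assumes the input files carry a .txt extension, which the thread does not confirm. glob.glob already returns paths that include the directory, so no extra os.path.join per file is needed, but it does not guarantee any order, hence the sorted call.

```python
import glob
import os
import tempfile

def list_files(dirpath, pattern="*.txt"):
    """Return the sorted full paths in dirpath whose names match pattern."""
    return sorted(glob.glob(os.path.join(dirpath, pattern)))

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("105.txt", "120.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_files(tmp)]

print(found)
```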