3

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document) So far I have this code

import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/'+ i), 'r')
allLines = file.readlines()
print(allLines)

at the end I dont get any errors but it only prints the contents of the last file in my folder in a series of strings and I want to make sure its reading every file so I can then access the data I want from each file. I've looked online and I cant find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?

also i get the same result when I use

 file = open(os.path.join('results/',i), 'r')

in the 5th line

Please help I'm so lost Thanks!!

1
  • when opening files, especially when opening multiple files, you should use a with statement Commented Nov 9, 2017 at 15:00

4 Answers 4

6
  • Separate the different functions of the thing you want to do.
  • Use generators wherever possible. Especially if there are a lot of files or large files

Imports

from pathlib import Path
import sys

Deciding which files to process:

source_dir = Path('results/')

files = source_dir.iterdir()

[Optional] Filter files

For example, if you only need files with extension .ext

files = source_dir.glob('*.ext')

Process files

def process_files(files):
    for file in files:
        with file.open('r') as file_handle :
            for line in file_handle:
                # do your thing
                yield line

Save the lines you want to keep

def save_lines(lines, output_file=sys.std_out):
    for line in lines:
        output_file.write(line)
Sign up to request clarification or add additional context in comments.

Comments

1

you forgot indentation at this line allLines = file.readlines() and maybe you can try that :

import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
   file = open(os.path.join('results/'+ i), 'r')
   allLines.append(file.read())
print(allLines)

3 Comments

when I do that I get an error message UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 27: invalid start byte
What is the type of file that you want to open ? and maybe you can try to decode it with 'latin-1' sometimes it works. keep me in touch
I get name 'i' is not defined error and do not see i is initialized, looks like this code needs an update?
1

You forgot to indent this line allLines.append(file.read()). Because it was outside the loop, it only appended the file variable to the list after the for loop was finished. So it only appended the last value of the file variable that remained after the loop. Also, you should not use readlines() in this way. Just use read() instead;

import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
   file = open(os.path.join('results/'+ i), 'r')
   allLines.append(file.read())
print(allLines)

1 Comment

hey thanks for your reply! when I indent it like that I get an error message for line 9 and line 321 (which i dont understand) and it says: nicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 27: invalid start byte
0

This also creates a file containing all the files you wanted to print.

rootdir= your folder, like 'C:\\Users\\you\\folder\\'
import os
f = open('final_file.txt', 'a')
for root, dirs, files in os.walk(rootdir):  
    for filename in files:
        data = open(full_name).read()
            f.write(data + "\n")                 
f.close()

This is a similar case, with more features: Copying selected lines from files in different directories to another file

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.