opening and reading all the files in a directory in python - python beginner

Question

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document). So far I have this code:

import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/'+ i), 'r')
allLines = file.readlines()
print(allLines)

At the end I don't get any errors, but it only prints the contents of the last file in my folder as a series of strings. I want to make sure it's reading every file so I can then access the data I want from each file. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?

Also, I get the same result when I use

 file = open(os.path.join('results/',i), 'r')

in the 5th line

Please help I'm so lost Thanks!!

When opening files, especially when opening multiple files, you should use a with statement.
– Maarten Fabré, Commented Nov 9, 2017 at 15:00
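
A minimal sketch of what that comment suggests, assuming a folder of plain-text files like the results/ folder in the question. The helper name read_all is hypothetical and not part of the original post:

```python
import os

def read_all(path):
    """Return a dict mapping each file name in path to that file's text."""
    contents = {}
    for name in sorted(os.listdir(path)):
        full_name = os.path.join(path, name)
        if not os.path.isfile(full_name):
            continue  # skip subdirectories
        # The with block closes the file as soon as the block ends,
        # even if reading raises an exception.
        with open(full_name, 'r') as handle:
            contents[name] = handle.read()
    return contents
```

Calling read_all('results/') would then give one entry per file, so it is easy to check that every file was read and to look up a single file by name.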

diviquery · Accepted Answer · 2021-05-17 04:54:20Z

6

Separate the different parts of the task into their own functions.
Use generators wherever possible, especially if there are a lot of files or large files.

Imports

from pathlib import Path
import sys

Deciding which files to process:

source_dir = Path('results/')

files = source_dir.iterdir()

[Optional] Filter files

For example, if you only need files with extension .ext

files = source_dir.glob('*.ext')
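
One thing to keep in mind is that both iterdir() and glob() can also return subdirectories, and opening a directory like a file fails. A small sketch of one way to skip them, using Path.is_file(); the function name regular_files and its pattern parameter are assumptions made for illustration:

```python
from pathlib import Path

def regular_files(source_dir, pattern='*'):
    """Yield only regular files in source_dir whose names match pattern."""
    for entry in sorted(Path(source_dir).glob(pattern)):
        if entry.is_file():
            yield entry
```

For example, regular_files('results/', '*.ext') would yield the matching files in sorted order and leave out any directory that happens to match the pattern.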

Process files

def process_files(files):
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                # do your thing
                yield line

Save the lines you want to keep

def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
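
The pieces above can be wired together roughly as follows. This is only a sketch: the keep parameter and the collect function are hypothetical additions for illustration, and the two functions are repeated here so the example runs on its own.

```python
from pathlib import Path
import sys

def process_files(files, keep=lambda line: True):
    """Lazily yield the lines from each file for which keep(line) is true."""
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                if keep(line):
                    yield line

def save_lines(lines, output_file=sys.stdout):
    """Write each line to output_file (standard output by default)."""
    for line in lines:
        output_file.write(line)

def collect(source_dir, output_path, keep=lambda line: True):
    """Copy the kept lines of every file in source_dir into output_path."""
    # The file list is built before the output file is opened.
    files = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    with open(output_path, 'w') as output_file:
        save_lines(process_files(files, keep), output_file)
```

A call such as collect('results/', 'summary.txt', keep=lambda line: line.startswith('#')) would copy only the lines starting with '#'. Because process_files is a generator, only one line is held in memory at a time. It is probably safest to write the output file outside the source folder so it is not picked up as input on a later run.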

edited May 17, 2021 at 4:54

diviquery

answered Nov 9, 2017 at 14:34

Maarten Fabré

Rayane Bouslimi · Accepted Answer · 2017-11-09 14:15:52Z

1

You forgot to indent the line allLines = file.readlines(), so it runs only once, after the loop has finished. You can try this instead:

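import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # Build the full path from the folder and the file name
    with open(os.path.join(path, name), 'r') as file:
        # Append the whole text of this file inside the loop
        allLines.append(file.read())
print(allLines)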

answered Nov 9, 2017 at 14:15

Rayane Bouslimi

3 Comments

Hannah Baker Over a year ago

when I do that I get an error message UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 27: invalid start byte

Rayane Bouslimi Over a year ago

What type of file are you trying to open? You could also try decoding it with 'latin-1'; sometimes that works. Keep me posted.

neelmeg Over a year ago

I get a "name 'i' is not defined" error and don't see where i is initialized. It looks like this code needs an update?
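
For the UnicodeDecodeError mentioned in the comments above, one possible approach is to try a short list of encodings in order. This is a sketch only: read_text and its encodings parameter are hypothetical names, and the right encoding depends on how the files were produced.

```python
def read_text(path, encodings=('utf-8', 'latin-1')):
    """Return the text of path, trying each encoding in order."""
    for encoding in encodings:
        try:
            with open(path, 'r', encoding=encoding) as handle:
                return handle.read()
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError(f'could not decode {path!r} with any of {encodings!r}')
```

Note that 'latin-1' can decode any byte sequence, so it never raises this error, but the resulting characters may be wrong if the file was really written in another encoding (or is not a text file at all).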

Smich · Accepted Answer · 2017-11-09 16:54:38Z

1

You forgot to indent the line allLines = file.readlines(). Because it was outside the loop, it ran only once, after the for loop had finished, so it read only the last file that the file variable still referred to. Also, you should not use readlines() in this way; just use read() instead and append each file's contents to a list:

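import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # Build the full path from the folder and the file name
    with open(os.path.join(path, name), 'r') as file:
        # Indented, so this runs once per file rather than once at the end
        allLines.append(file.read())
print(allLines)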
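
To make the difference between read() and readlines() concrete, here is a small self-contained sketch. The function name and the temporary example file are made up for illustration:

```python
import os
import tempfile

def demo_read_vs_readlines():
    """Write a two-line temporary file and read it back both ways."""
    with tempfile.TemporaryDirectory() as folder:
        full_name = os.path.join(folder, 'example.txt')
        with open(full_name, 'w') as handle:
            handle.write('first\nsecond\n')
        with open(full_name, 'r') as handle:
            whole_text = handle.read()        # one string for the whole file
        with open(full_name, 'r') as handle:
            line_list = handle.readlines()    # a list with one string per line
    return whole_text, line_list
```

Here read() gives the single string 'first\nsecond\n', while readlines() gives the list ['first\n', 'second\n'], which is why printing the result of readlines() shows "a series of strings".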

answered Nov 9, 2017 at 16:54

Smich

1 Comment

Hannah Baker Over a year ago

Hey, thanks for your reply! When I indent it like that I get an error message for line 9 and line 321 (which I don't understand), and it says: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 27: invalid start byte

Nicola Amadio · Accepted Answer · 2017-11-09 14:47:40Z

0

This also creates a file containing the contents of all the files you wanted to print.

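import os

rootdir = 'C:\\Users\\you\\folder\\'  # replace with your own folder
with open('final_file.txt', 'a') as f:
    # os.walk also descends into subfolders of rootdir
    for root, dirs, files in os.walk(rootdir):
        for filename in files:
            # Build the full path of the current file
            full_name = os.path.join(root, filename)
            with open(full_name) as source:
                f.write(source.read() + "\n")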

This is a similar case, with more features: Copying selected lines from files in different directories to another file

answered Nov 9, 2017 at 14:47

Nicola Amadio
