I'm trying to find a list of files in a directory tree. I provide a text file with all the terms I want to search for (~500), and the code should look for them in the filenames of a directory and its subdirectories. However, I believe something is wrong with the way the code steps through its loops: it ends prematurely without searching all the folders.

The code I'm using is (pattern is the name of a text file):

import os

def locateA(pattern, root):
    file  = open(pattern, 'r')
    for path, dirs, files in os.walk(root):
        for word in files:
            for line in file:
                if line.strip() in word:
                    print os.path.join(path, word), line.strip()

Any ideas on where I'm mistaken?

  • I suggest using the construct with open(pattern, 'rU') as f: and not naming your variable file, because file is the name of a built-in type in Python 2. Commented Mar 14, 2012 at 17:04
  • Changed the name of file to something else. I'll investigate the construct you mentioned. Commented Mar 14, 2012 at 17:13
  • So what exactly are the symptoms of the problem? Commented Mar 14, 2012 at 20:47

2 Answers


All or part of the problem may be that you can only iterate through a file once unless you use file.seek() to reset the current position in the file.

Make sure you seek back to the beginning of the file before attempting to loop through it again:

import os

def locateA(pattern, root):
    file  = open(pattern, 'r')
    for path, dirs, files in os.walk(root):
        for word in files:
            file.seek(0)             # this line is new
            for line in file:
                if line.strip() in word:
                    print os.path.join(path, word), line.strip()
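
To see the exhaustion behaviour in isolation, here is a minimal sketch. The helper count_lines_twice and the three-line temporary file are made up for this demonstration and are not part of the code above; it avoids print so that it should run on both Python 2 and Python 3.

```python
import os
import tempfile

def count_lines_twice(path, rewind):
    """Iterate over the same open file object twice and return both line counts."""
    with open(path, 'r') as f:
        first = sum(1 for _ in f)    # first pass reads every line
        if rewind:
            f.seek(0)                # move back to the start of the file
        second = sum(1 for _ in f)   # second pass only sees lines after the current position
    return first, second

# Hypothetical three-line terms file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('alpha\nbeta\ngamma\n')
    demo_path = tmp.name

without_seek = count_lines_twice(demo_path, rewind=False)  # second pass finds nothing
with_seek = count_lines_twice(demo_path, rewind=True)      # second pass reads all lines again
os.remove(demo_path)
```

Without the seek, the second loop has nothing left to read, which is why only the first filename was ever compared against the terms.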

1 Comment

Aha! This seems to be working. I didn't know that you can only iterate over a file once.

The loop for line in file consumes every line of the file the first time it runs. On every later pass the file object is already at end-of-file, so the loop body never executes again.

Try this instead, which fixes that and some other problems:

import os

def locateA(pattern, root):
    # Read the search terms once into a list, so there is no need to reread the file.
    # Blank lines are skipped because an empty string is "in" every filename.
    with open(pattern, 'r') as f:
        terms = [line.strip() for line in f if line.strip()]
    for path, dirs, files in os.walk(root):
        for filename in files:
            for term in terms:
                if term in filename:
                    print os.path.join(path, filename), term
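
If you would rather collect the matches than print them, the same idea can be sketched as a generator. This is only one possible variant: the name locate_matches and its parameter names are my own, and it is written to run on Python 3 as well as Python 2.

```python
import os

def locate_matches(terms_path, root):
    """Yield (full_path, term) for every filename under root that contains a term."""
    with open(terms_path, 'r') as f:
        # Read the terms once; blank lines are dropped because '' is in every string.
        terms = [line.strip() for line in f if line.strip()]
    for path, dirs, files in os.walk(root):
        for filename in files:
            for term in terms:
                if term in filename:
                    yield os.path.join(path, filename), term
```

Something like list(locate_matches('terms.txt', '.')) would then give you all the pairs at once (terms.txt here is just a placeholder name for your terms file).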

2 Comments

Quick question: why do I need filecontent = open(file,'r').read() in the code? Does this open every file in the directory?
Sorry about that, I misread your question and thought you wanted to execute the equivalent of grep in each file. I now see you're actually matching the filenames. I corrected the example.
