
I have a Python script which takes a filename as a command-line argument and processes that file. However, I have thousands of files to process, and I would like to run the script on every file without having to pass each filename as the argument by hand.

The script works well when run on an individual file like this:

myscript.py /my/folder/of/stuff/text1.txt

I have this code to process them all at once, but it doesn't work:

for fname in glob.iglob(os.path.join('folder/location')):
    proc = subprocess.Popen([sys.executable, 'script/location.py', fname])
    proc.wait()

Whenever I run the above code, it doesn't throw an error, but doesn't give me the intended output. I think the problem lies with the fact that the script is expecting the path to a .txt file as an argument, and the code is only giving it the folder that the file is sitting in (or at least not a working absolute reference).

How can I correct this problem?

3 Comments

  • Why not edit myscript.py and split it up into functions? You then can do from myscript import my_function and call my_function on every file you need. Commented Jun 17, 2015 at 17:11
  • The os.path.join('folder/location') does nothing. Try os.path.join('folder/location', '*.txt'); one usually passes a file name pattern argument with wildcard characters in it to glob.iglob(). Commented Jun 17, 2015 at 17:21
  • related: Call python script with input with in a python script using subprocess and Python threading multiple bash subprocesses? Commented Jun 21, 2015 at 12:11
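
The wildcard fix suggested in the comments can be sketched as follows. This is a minimal sketch, not the asker's final code: the function name run_on_all and its pattern parameter are hypothetical, and 'folder/location' and 'script/location.py' are the placeholders from the question.

```python
import glob
import os
import subprocess
import sys


def run_on_all(folder, script, pattern='*.txt'):
    """Run `script` once per file in `folder` whose name matches `pattern`.

    Returns a list of (filename, exit_code) pairs.
    """
    results = []
    # The wildcard is what makes glob return the files instead of the folder.
    for fname in sorted(glob.glob(os.path.join(folder, pattern))):
        # Use an absolute path so the child script does not depend on the
        # current working directory.
        fname = os.path.abspath(fname)
        exit_code = subprocess.call([sys.executable, script, fname])
        results.append((fname, exit_code))
    return results


if __name__ == '__main__':
    # Placeholder locations taken from the question; replace with real paths.
    for fname, exit_code in run_on_all('folder/location', 'script/location.py'):
        if exit_code != 0:
            print('failed: {}'.format(fname))
```

Collecting the exit codes is a design choice for illustration: with thousands of files it makes it easier to see which ones the script could not process.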

2 Answers


If the files are in the same folder and the script accepts several filenames as arguments, you can use this syntax:

myscript.py /my/folder/of/stuff/*.txt

On Unix-like shells, the shell replaces the wildcard with the list of matching files before the script starts.

If the script doesn't support this, move the per-file processing into its own function and loop over the arguments, as in this quick example:

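import sys

def printFileName(filename):
    # Stand-in for the real per-file processing.
    print(filename)

def main():
    # Every argument after the script name is treated as a file to process.
    args = sys.argv[1:]
    for filename in args:
        printFileName(filename)

if __name__ == '__main__':
    main()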

Then, from the console, you can run it like this:

python MyScript.py /home/andy/tmp/1/*.txt /home/andy/tmp/2/*.html

This prints the paths of all the matching files in both folders.

Hope this can be of some help.


1 Comment

On Windows you need to call glob() inside the script, because cmd.exe does not expand wildcards.
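
A minimal sketch of what that comment suggests: expand the patterns inside the script so it behaves the same whether or not the shell has already expanded them. The helper name expand_args is hypothetical, and keeping unmatched arguments unchanged is an assumption about the desired behavior.

```python
import glob
import sys


def expand_args(args):
    """Expand wildcard patterns in `args` into the matching file names."""
    filenames = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # Keep the argument unchanged if nothing matches, so a plain
        # (possibly missing) file name still reaches the processing code.
        filenames.extend(matches if matches else [arg])
    return filenames


if __name__ == '__main__':
    for filename in expand_args(sys.argv[1:]):
        print(filename)
```

If the shell has already expanded the wildcard, each argument is an existing file name and glob simply returns it as is.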
0

You can write another script to do this. This is just a workaround; try using os.walk:

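import os, subprocess, sys

for root, dirs, files in os.walk(PATH):
    for file in files:
        # os.path.join is portable; the argument list avoids shell quoting problems.
        path = os.path.join(root, file)
        subprocess.call([sys.executable, 'myscript.py', path])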

Set PATH to the top-level folder. os.walk then visits every file in that folder and in all of its subfolders.

If you want to process only specific files, for example only .cpp files, you can filter the file names by adding this check right after the for file in files line:

if file.endswith('.cpp'):
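
Putting the walk and the filter together might look like the sketch below. The generator name find_files and its suffix parameter are hypothetical, and '.' is only a placeholder for the PATH mentioned above.

```python
import os


def find_files(top, suffix='.cpp'):
    """Yield the path of every file under `top` whose name ends with `suffix`."""
    for root, dirs, files in os.walk(top):
        for file in files:
            if file.endswith(suffix):
                yield os.path.join(root, file)


if __name__ == '__main__':
    # '.' is a placeholder; pass the real top-level folder instead.
    for path in find_files('.'):
        print(path)
```

Each yielded path can then be handed to the script in the same way as in the loop above.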
