4

With the below code I receive IOError: [Errno 13] Permission denied, and I know this is due to the output directory being a sub-folder of the input directory:

import datetime
import os

inputdir = "C:\\temp2\\CSV\\"
outputdir = "C:\\temp2\\CSV\\output\\"
keyword = "KEYWORD"

for path, dirs, files in os.walk(os.path.abspath(inputdir)):
    for f in os.listdir(inputdir):
        file_path = os.path.join(inputdir, f)
        out_file = os.path.join(outputdir, f)
        with open(file_path, "r") as fh, open(out_file, "w") as fo:
            for line in fh:
                if keyword not in line:
                    fo.write(line)

However, when I change the output folder to: outputdir = "C:\\temp2\\output\\" the code runs successfully. I want to be able to write the modified files to a sub-folder of the input directory. How would I do this without getting the 'Permission denied' error? Would the tempfile module be useful in this scenario?

5
  • Do you have permissions to write there? I would use tempfile, personally, as it's cleaner. Commented Mar 28, 2012 at 4:06
  • could you make fh as writable as well? Commented Mar 28, 2012 at 4:06
  • As a sidenote, if you use raw strings then the double-backslashes won't be necessary. Example: r"C:\Temp1\CSV\Output\" vs. "C:\\Temp1\CSV\\Output\\". Commented Mar 28, 2012 at 5:30
  • Thanks for the tip, Li-aung. =) Commented Mar 28, 2012 at 5:44
  • First you should understand how walk works. Commented Jun 20, 2013 at 4:08

2 Answers 2

1

os.listdir will return directory as well as file names. output is within inputdir so the with is trying to open a directory for reading/writing.

What exactly are you trying to do? path, dirs, files aren't even being used in the recursive os.walk.

Edit: I think you're looking for something like this:

import os

INPUTDIR= "c:\\temp2\\CSV"
OUTPUTDIR = "c:\\temp2\\CSV\\output"
keyword = "KEYWORD"

def make_path(p):
    '''Makes sure directory components of p exist.'''
    try:
        os.makedirs(p)
    except OSError:
        pass

def dest_path(p):
    '''Determines relative path of p to INPUTDIR,
       and generates a matching path on OUTPUTDIR.
    '''
    path = os.path.relpath(p,INPUTDIR)
    return os.path.join(OUTPUTDIR,path)

make_path(OUTPUTDIR)

for path, dirs, files in os.walk(INPUTDIR):
    for d in dirs:
        dir_path = os.path.join(path,d)
        # Handle case of OUTPUTDIR inside INPUTDIR
        if dir_path == OUTPUTDIR:
            dirs.remove(d)
            continue
        make_path(dest_path(dir_path))    
    for f in files:
        file_path = os.path.join(path, f)
        out_path = dest_path(file_path)
        with open(file_path, "r") as fh, open(out_path, "w") as fo:
            for line in fh:
                if keyword not in line:
                    fo.write(line)
Sign up to request clarification or add additional context in comments.

6 Comments

Sorry, I'm still fairly new to Python. Currently I just have a flat directory of CSV files that I want to process -- delete lines and save the modified CSV files to an output folder within the input folder. However, there might be a time where I have folders/sub-folders of CSV files that I want to process, so I want to be sure I can handle that scenario too.
It complicates matters to recurse over a directory while modifying it at the same time. It can be done by modifying dirs and removing output directory when it is seen, but simpler to keep the output directory outside the input directory. You don't actually need os.listdir. for f in files along with file_path = os.path.join(path,f) will give you the full path of every file, recursively, under the input dir.
Thanks for your feedback, Mark. I will look into your suggestions. I agree that it's simpler to keep the output dir outside the input dir, however, other people on my team who will be using this tool that I'm creating would prefer to have the output dir within the input dir. I've already accomplished this using PowerShell (which doesn't seem to have these complications), but I want to take advantage of Python's text processing power. In some cases I will have millions of lines in CSV files to process and want a more powerful tool to handle the job.
I updated my answer with something that would work the way you want. Re: PowerShell, did that version recurse? Comment out the # Handle case... portion above and you'll see how it complicates things.
Hi Mark, I tested your above code and it works great! Thank you. And RE: PowerShell - yes, it does recurse. =)
|
1

If you are successful in writing to a output directory outside of the input traversing directory, then write it there first using the same code as above and then move it to a sub-directory within the input directory. You could use os.move for that.

2 Comments

Hi Senthil, I actually thought about using os.move but wasn't sure if that approach would be less efficient. I could give it a try...
Hi Keith, that's not less efficiently, it's a good approach. In fact, writing to a location outside of the your reading location would be better. And then you could atomically move to the final destination.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.