How to resume from previous failed iteration while traversing through a big nested directory in Python?

I am currently using os.walk to navigate through all subfolders and files in a massive network drive directory. However, whenever my VPN disconnects, the for loop fails. When I re-run my code the next day, I would like to resume from the last file that was processed. What modifications should I make to my code below?

import os

directory = '//DirectoryName/FolderName'

# s3_client, bucket and Target_File are defined elsewhere (not shown).
for root, dirs, files in os.walk(os.path.normpath(directory), topdown=False):
    for name in files:
        Source_File = os.path.join(root, name)
        # This loads the file to the S3 bucket
        s3_client.upload_file(Source_File, bucket, Target_File)

The directory is really massive: it has hundreds of subfolders and thousands of files in total.

edited Sep 16, 2022 at 17:56

asked Sep 16, 2022 at 17:54

Devansh Popat


Keep track of the files you already processed in a separate file

– rdas
Commented Sep 16, 2022 at 17:56
Are you sure that what you are doing is legal?

– treuss
Commented Sep 16, 2022 at 17:57
@treuss, what do you mean? I am doing this work as part of my job.

– Devansh Popat
Commented Sep 16, 2022 at 18:40
@rdas, that is a good point. But how do I resume from where I left off the previous day?

– Devansh Popat
Commented Sep 16, 2022 at 18:40
You read the file at the start of the script, loading all the file names into a set or something similar. Then, when walking the directory tree, you can skip any files that are already in the set.

– rdas
Commented Sep 16, 2022 at 18:44
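
The approach suggested in the comments above could be sketched roughly as follows. This is only a minimal illustration, not a tested solution for the asker's environment: the names `load_processed`, `process_tree`, `log_path` and `process_file` are hypothetical, and `process_file` stands in for whatever work is done per file (in the question, the `s3_client.upload_file(...)` call). It assumes one path per line in the log file, so it would not cope with file names that contain newlines.

```python
import os


def load_processed(log_path):
    """Return the set of paths recorded by earlier runs (empty if no log yet)."""
    if not os.path.exists(log_path):
        return set()
    with open(log_path, encoding="utf-8") as log:
        return {line.rstrip("\n") for line in log if line.strip()}


def process_tree(directory, log_path, process_file):
    """Walk `directory`, skipping every file already listed in `log_path`.

    `process_file` is a hypothetical callable taking one path; it is expected
    to raise an exception if the work fails (e.g. the connection drops).
    """
    done = load_processed(log_path)
    # Open in append mode so entries from previous runs are preserved.
    with open(log_path, "a", encoding="utf-8") as log:
        for root, dirs, files in os.walk(os.path.normpath(directory), topdown=False):
            for name in files:
                source_file = os.path.join(root, name)
                if source_file in done:
                    continue  # already handled in a previous run
                process_file(source_file)
                # Record the path only after the work succeeded, and flush
                # right away so the entry survives a crash on a later file.
                log.write(source_file + "\n")
                log.flush()
                done.add(source_file)
```

With the question's code, the call might look like `process_tree(directory, 'processed.log', upload)`, where `upload` is a small function wrapping the existing `upload_file` call and `'processed.log'` is an arbitrary local file name. One thing to keep in mind: by default `os.walk` ignores errors raised while listing a directory, so if the network drive drops mid-walk, some folders may be skipped silently rather than raising. Passing an `onerror` callback that re-raises would make such failures visible, and the skipped folders would then be picked up on the next run.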
