
Can anyone suggest how to process files in parallel, please?

Right now I can hash/checksum files, but files that have already been hashed have to wait for the whole hashing task to finish before they are copied.

Let's say you have enough I/O capacity to process more.

How do I write the following algorithm:

hash/checksum files + copy what is already checksummed (in parallel) - basically, two processes running: hash + copy

I don't know how to explain it better; hopefully you understand.

I have already written the program in Python, but I wonder how I can write a parallel version of it.

Regards

David

  • The simplest way is to run multiple copies of the Python script, with wildcards selecting which subset of files each copy reads! Commented Jul 26, 2018 at 23:21
  • Your operations seem sequential from what you said: hash followed by copy. However, to help you, please provide your attempts. That might give those who intend to help some idea of what you want. Commented Jul 26, 2018 at 23:24

1 Answer

This sounds like a job for joblib.

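import os
from joblib import Parallel, delayed

the_dir = "."  # placeholder: the directory that holds your files
files = [os.path.join(the_dir, name) for name in os.listdir(the_dir)]

def hash_checksum_copy(file):
    # [your logic here]: hash/checksum the file, then copy it
    pass

# n_jobs is the number of parallel workers, e.g. your number of cores; -1 means all of them
Parallel(n_jobs=-1)(delayed(hash_checksum_copy)(file) for file in files)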
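
As a rough idea of what the body of hash_checksum_copy might look like, here is a minimal sketch. SHA-256, shutil.copy2, the extra dest_dir parameter and the process_all helper are all assumptions for illustration, not something the question specifies. It also swaps joblib for the standard library's ThreadPoolExecutor so that it runs without installing anything; the same function could be wrapped in delayed(...) instead.

```python
import hashlib
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def hash_checksum_copy(src, dest_dir):
    """Hash one file, then copy it; returns (file name, hex digest)."""
    src = Path(src)
    digest = hashlib.sha256()  # assumed algorithm; use whichever you need
    with src.open("rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    shutil.copy2(src, Path(dest_dir) / src.name)
    return src.name, digest.hexdigest()


def process_all(src_dir, dest_dir, workers=4):
    """Hypothetical driver: run hash_checksum_copy over every regular file in src_dir."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    files = [p for p in Path(src_dir).iterdir() if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each file is independent, so the pool can work on several at once.
        return dict(pool.map(lambda p: hash_checksum_copy(p, dest_dir), files))
```

Calling process_all("in", "out") would return a dict mapping each file name to its digest. Threads are used on the assumption that the work is mostly I/O-bound; for CPU-heavy hashing of many small files, a process-based pool may suit better.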

Good luck. :)


3 Comments

Thanks, I'll try. Parallel processing is simple if you have one task per file, but there are two tasks per file and I have no idea how to do it. I don't know how to explain it in the first place :-D
Don't you need to import joblib?
@DavidHajes the idea is that all of your per-file tasks (hashing, checksum, etc.) should be wrapped into one function. When given that function and a list of files, you should be able to parallelize across n sublists of your list of files. As long as each file's results from that function do not depend on the results for other files, that part should be parallelizable.
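
If the two tasks really should run side by side, as the question describes (copying whatever is already hashed while hashing continues), one possible shape is a two-stage pipeline connected by a queue. This is only a sketch under assumptions: SHA-256 as the hash, one thread per stage, and the names hash_stage, copy_stage and hash_then_copy_pipeline are all made up for illustration.

```python
import hashlib
import queue
import shutil
import threading
from pathlib import Path

_DONE = object()  # sentinel telling the copier that hashing has finished


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_stage(files, hashed, checksums):
    """Hash each file and hand it to the copy stage as soon as it is done."""
    try:
        for path in files:
            checksums[path.name] = sha256_of(path)
            hashed.put(path)
    finally:
        hashed.put(_DONE)  # always signal the end, even if hashing failed


def copy_stage(hashed, dest_dir):
    """Copy files as they arrive, while hash_stage keeps hashing the rest."""
    while True:
        path = hashed.get()
        if path is _DONE:
            break
        shutil.copy2(path, dest_dir / path.name)


def hash_then_copy_pipeline(src_dir, dest_dir):
    """Run both stages concurrently and return {file name: digest}."""
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src_dir.iterdir() if p.is_file())
    hashed = queue.Queue()
    checksums = {}
    hasher = threading.Thread(target=hash_stage, args=(files, hashed, checksums))
    copier = threading.Thread(target=copy_stage, args=(hashed, dest_dir))
    hasher.start()
    copier.start()
    hasher.join()
    copier.join()
    return checksums
```

With this layout the first file can be copied while the second is still being hashed, so neither stage waits for the other to finish the whole list. Whether it is actually faster depends on the disks involved.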
