
I have defined a function that takes a single integer input and returns an integer output.

def get_output(n):
    output = ...  # process the integer (placeholder)
    return output

Now I have defined an input list that has to be processed using the function defined above.

input_list = [1,2,3,5,6,8,5,5,8,6,5,2,5,2,5,4,5,2]

Now I have defined an empty output list that will store the output from the function.

output_list = []

Now I want to go through every item of input_list, apply the function to it, and append the result to output_list. I know how to do this sequentially, but I want to know how to parallelize the task. Thanks in advance.
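For reference, the sequential baseline might look like the following (the squaring step is an assumption standing in for the real "process the integer" logic, which isn't shown):

```python
def get_output(n):
    # placeholder processing step: square the integer (assumption)
    return n ** 2

input_list = [1,2,3,5,6,8,5,5,8,6,5,2,5,2,5,4,5,2]

output_list = []
for n in input_list:
    output_list.append(get_output(n))

print(output_list)
```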

  • 1
    Be warned, you will probably lose most if not all performance gains from synchronization overhead and interprocess communication if that is what you care about. This will only be worthwhile if the "process the integer" step is CPU or IO intensive. The actual appending process will not benefit from parallelizing. Commented Oct 11, 2021 at 7:06
  • 1
    @Andrew-Harelson yes, I'm trying to find the most efficient way but could not figure it out. Commented Oct 11, 2021 at 7:07
  • 1
    Is your "process the integer" step CPU or IO intensive? If not, doing this sequentially will be faster than in parallel. If they are, is it more CPU or more IO intensive? I can provide an answer but the best way to do this depends on CPU vs. IO bound. Commented Oct 11, 2021 at 7:10

1 Answer


If your "process the integer" step is more IO-bound, threads are usually the better fit, so you could try:

from concurrent.futures import ThreadPoolExecutor
def get_output(n):
    output = n ** 2
    return output

input_list = [1,2,3,5,6,8,5,5,8,6,5,2,5,2,5,4,5,2]
output_list = []

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=6) as pool:
        output_list.extend(pool.map(get_output, input_list))
    print(output_list)

This processes the list and squares all the elements. Since I specified max_workers=6, up to 6 threads run at a time, so up to 6 elements are processed in parallel.

If your integer process is more CPU-bound, go with multiprocessing instead.

The code is virtually the same:

from concurrent.futures import ProcessPoolExecutor
def get_output(n):
    output = n ** 2
    return output

input_list = [1,2,3,5,6,8,5,5,8,6,5,2,5,2,5,4,5,2]
output_list = []

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=6) as pool:
        output_list.extend(pool.map(get_output, input_list))
    print(output_list)

This does the same: it squares all the elements, processing up to 6 at a time in parallel, here across processes instead of threads. Note that the print is inside the `if __name__ == '__main__':` guard; with ProcessPoolExecutor this guard is required, and keeping the print inside it prevents child processes from re-executing it on platforms that spawn new interpreters.

Both codes output:

[1, 4, 9, 25, 36, 64, 25, 25, 64, 36, 25, 4, 25, 4, 25, 16, 25, 4]

3 Comments

Thanks for the answer. If the function takes only a small amount of time, then running it with multiprocessing will cost more time than running it sequentially. Is there any way to get better performance with multiprocessing?
@Darkknight It depends; I guess this is the best way. For your integer processing, you should also look into numpy functions, because numpy's backend is in C, so it's faster.
Yes, I know that. I posted this question just to get a clear picture of the idea; the actual function is different, so I'll try it and let you know how it performs with multiprocessing. Thanks again.
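As suggested in the comments above, when the per-item operation can be vectorized, numpy avoids both the Python-level loop and the pool overhead entirely. A minimal sketch, assuming the squaring placeholder stands in for the real processing:

```python
import numpy as np

input_list = [1,2,3,5,6,8,5,5,8,6,5,2,5,2,5,4,5,2]

# vectorized squaring runs in numpy's C backend, no loop or pool needed
output_list = (np.array(input_list) ** 2).tolist()

print(output_list)
```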
