
I have been banging my head against multiprocessing in Python for the better part of the day now, and I've made very little progress. I apologize if my question is a duplicate or my ignorance is apparent; I couldn't find it asked anywhere else in this way.

I'm looking for a way to run functions in parallel, and return some arbitrary thing they've produced back to the main script.

The question is: can a Process() started with multiprocessing return a list or some other arbitrary type?

For example, I would like to:

import multiprocessing

def thirty_second_function():
    # pretend this takes 30 seconds to run
    return ["mango", "habanero", "salsa"]

def five_second_function():
    # pretend this takes 5 seconds to run
    return {"beans": "8 oz", "tomato paste": "16 oz"}

p1 = multiprocessing.Process(target=thirty_second_function)
p1.start()
p2 = multiprocessing.Process(target=five_second_function)
p2.start()

# Somehow retrieve the list and the dictionary here.  p1.returned??

And then somehow access the list from thirty_second_function and the dictionary from five_second_function. Is this possible? Am I going about this the wrong way?

1 Answer
Process itself does not provide a way to get the return value. To exchange data between processes, you need to use a queue, a pipe, shared memory, or similar:

import multiprocessing

def thirty_second_function(q):
    q.put(["mango", "habanero", "salsa"])

def five_second_function(q):
    q.put({"beans": "8 oz", "tomato paste": "16 oz"})

if __name__ == '__main__':
    q1 = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=thirty_second_function, args=(q1,))
    p1.start()

    q2 = multiprocessing.Queue()
    p2 = multiprocessing.Process(target=five_second_function, args=(q2,))
    p2.start()

    print(q1.get())
    print(q2.get())
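A pipe, mentioned above, works similarly when there is exactly one producer and one consumer. A minimal sketch using a Pipe instead of a Queue (the child sends its result through its end of the connection):

```python
import multiprocessing

def thirty_second_function(conn):
    # send the result through the child's end of the pipe
    conn.send(["mango", "habanero", "salsa"])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=thirty_second_function,
                                args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # blocks until the child sends
    p.join()
```

A Queue is generally safer when several processes write results, since a Pipe's endpoints can be corrupted if two processes use the same end at once.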

An alternative using multiprocessing.pool.Pool:

import multiprocessing.pool

def thirty_second_function():
    return ["mango", "habanero", "salsa"]

def five_second_function():
    return {"beans": "8 oz", "tomato paste": "16 oz"}

if __name__ == '__main__':
    p = multiprocessing.pool.Pool()
    p1 = p.apply_async(thirty_second_function)
    p2 = p.apply_async(five_second_function)

    print(p1.get())
    print(p2.get())
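If you instead have one function and many inputs, Pool.map collects all the return values into a list in input order. A small sketch (format_ounces is a made-up helper, not from the question):

```python
import multiprocessing.pool

def format_ounces(n):
    # hypothetical helper: turn a number into an "N oz" string
    return "%d oz" % n

if __name__ == '__main__':
    with multiprocessing.pool.Pool() as p:
        print(p.map(format_ounces, [8, 16]))  # ['8 oz', '16 oz']
```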

Or using the concurrent.futures module (in the standard library since Python 3.2):

from concurrent.futures import ProcessPoolExecutor

def thirty_second_function():
    return ["mango", "habanero", "salsa"]

def five_second_function():
    return {"beans": "8 oz", "tomato paste": "16 oz"}

if __name__ == '__main__':
    with ProcessPoolExecutor() as e:
        p1 = e.submit(thirty_second_function)
        p2 = e.submit(five_second_function)
    print(p1.result())
    print(p2.result())
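If you would rather handle each result as soon as its worker finishes (here the 5-second function would come back first) instead of in submission order, concurrent.futures.as_completed yields the futures as they complete. A sketch along the same lines:

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def thirty_second_function():
    return ["mango", "habanero", "salsa"]

def five_second_function():
    return {"beans": "8 oz", "tomato paste": "16 oz"}

if __name__ == '__main__':
    with ProcessPoolExecutor() as e:
        futures = [e.submit(thirty_second_function),
                   e.submit(five_second_function)]
        for f in as_completed(futures):
            # results arrive in completion order, not submission order
            print(f.result())
```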

Comments

This is exactly what I needed - I'm not sure why I didn't understand this from the documentation, but your example here has really helped. I can't wait to try it. Off topic - do you know why the args=(q1,) needs that seemingly rogue comma to function properly?
@Locane, without a trailing comma, it becomes (q1), which is equal to q1. You can use a list instead, [q1], if you don't like it.
@Locane, you're right. The trailing comma is there to denote that it's a tuple literal. BTW, it also accepts a list.
@senderle, I forgot that. Thank you for the reminder. I updated the answer to add it.
Falsetru, in your Pool example (which is what I ended up using), can you please mention that instance methods don't work properly? I.e., self.gather_data called in the pool will not work, but gather_data will. This was a gotcha that was not obvious to me until I experimented with it.
