
As a Python beginner, I'm trying to parallelize some sections of a function that serves as an input to an optimization routine. This function f returns the log-likelihood, the gradient, and the Hessian for a given vector b. Inside this function there are three independent loop functions: loop_1, loop_2, and loop_3.

What is the most efficient implementation? Running the three loop functions in three concurrent processes, or parallelizing one loop at a time? And how can this be implemented? When I use the multiprocessing package I get a 'pickle' error, because my nested loop functions are not in the global namespace.

def f(b):

  # Do something computationally intensive on b

  def calc(i, j):
    return u, v, w

  def loop_1():
    for i in range(1, 1000):
      c, d, e = calc(i, 0)

      for j in range(1, 200):
        f, g, h = calc(i, j)

    return x, y, z

  def loop_2():
    ...  # similar to loop_1

  def loop_3():
    ...  # similar to loop_1

  # Aggregate results from the three loops

  return u, v, w

2 Answers


There are several ways to avoid the pickling error you receive.
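The most direct fix, if you can restructure your code, is to move the worker functions to module level so they can be pickled. A minimal sketch, where the three loops are placeholder stand-ins for the question's loop_1/loop_2/loop_3:

```python
import multiprocessing

# Module-level functions can be pickled, unlike closures defined inside f().
# These bodies are placeholders for the question's actual loops.
def loop_1(b):
    return sum(i * b for i in range(1000))

def loop_2(b):
    return sum(i + b for i in range(1000))

def loop_3(b):
    return max(i - b for i in range(1000))

def f(b):
    with multiprocessing.Pool() as pool:
        # Submit the three independent loops concurrently, then wait.
        results = [pool.apply_async(loop, (b,))
                   for loop in (loop_1, loop_2, loop_3)]
        x, y, z = (r.get() for r in results)
    return x, y, z

if __name__ == '__main__':
    print(f(2))
```

Since the loops are independent, running all three concurrently is usually better than parallelizing one loop at a time, provided each loop does enough work to amortize the process startup cost.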

An option could be asyncio, if it makes sense for your workload. Sometimes it makes things faster, sometimes slower.

In that case it would look something like the code below; I use it as a template when I forget things:

import asyncio


def f():
    async def factorial(n):
        f.p = 2  # an attribute on f acts as shared state across tasks
        await asyncio.sleep(0.2)
        return 1 if n < 2 else n * await factorial(n - 1)

    async def multiply(n, k):
        await asyncio.sleep(0.2)
        return sum(n for _ in range(k))  # n * k by repeated addition

    async def power(n, k):
        await asyncio.sleep(0.2)
        return await multiply(n, await power(n, k - 1)) if k != 0 else 1

    loop = asyncio.get_event_loop()
    # Schedule both coroutines so they run concurrently on the event loop.
    tasks = [asyncio.ensure_future(power(2, 5)),
             asyncio.ensure_future(factorial(5))]
    f.p = 0
    ans = tuple(loop.run_until_complete(asyncio.gather(*tasks)))
    print(f.p)
    return ans

if __name__ == '__main__':
    print(f())

async and await are built-in keywords in Python 3.5+, just like def, for, and in.

Another workaround for functions defined inside functions is to use threads instead:

from concurrent.futures import ThreadPoolExecutor
import time

def f():
    def factorial(n):
        f.p = 2
        time.sleep(0.2)
        return 1 if n < 2 else n*factorial(n-1)

    def multiply(n, k):
        time.sleep(0.2)
        return sum(n for _ in range(k))

    def power(n, k):
        time.sleep(0.2)
        return multiply(n, power(n, k-1)) if k != 0 else 1

    def calculate(func, args):
        return func(*args)

    def calculate_star(args):
        # pool.map passes a single argument, so unpack the (func, args) pair here
        return calculate(*args)

    pool = ThreadPoolExecutor()
    tasks = [(power, (2, 5)), (factorial, (5, ))]
    f.p = 0
    result = list(pool.map(calculate_star, tasks))
    print(f.p)
    return result

if __name__ == '__main__':
    print(f())

2 Comments

Thank you for your answer! Do you know how I can adapt your code for functions that return multiple objects?
In the functions you're going to use, return objects as you would normally. In the functions I used for testing, you would have to return a tuple, list, or dict, or carry extra variables along in the recursion.
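To illustrate the comment above with the thread approach: each task simply returns a tuple, and you unpack the tuples from the pool.map results. A minimal sketch, with made-up stats/minmax workers standing in for real ones:

```python
from concurrent.futures import ThreadPoolExecutor

def f():
    # Each nested function returns several values as a tuple.
    def stats(data):
        return min(data), max(data), sum(data)

    def minmax(data):
        return min(data), max(data)

    with ThreadPoolExecutor() as pool:
        tasks = [(stats, ([1, 2, 3],)), (minmax, ([4, 5, 6],))]
        # Unpack each (func, args) pair and call it in a worker thread.
        results = list(pool.map(lambda t: t[0](*t[1]), tasks))

    (lo, hi, total), (mn, mx) = results  # unpack the returned tuples
    return lo, hi, total, mn, mx

if __name__ == '__main__':
    print(f())
```

Threads sidestep pickling entirely, since nothing crosses a process boundary; the trade-off is that CPU-bound work stays serialized by the GIL.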

You should start your functions in a pool of processes.

import multiprocessing

pool = multiprocessing.Pool()
pool.apply_async(loop_1)
pool.apply_async(loop_2)
pool.apply_async(loop_3)
pool.close()
pool.join()  # wait for the three loops to finish

If loop_1, loop_2, and loop_3 perform the same operations, you can simply submit loop_3 three times.
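Note that apply_async returns an AsyncResult, so to actually use the loops' outputs you call .get() on each one; this also requires the functions to live at module level so they can be pickled. A sketch with placeholder loop bodies:

```python
import multiprocessing

# Placeholder module-level loops; apply_async can only pickle
# functions defined at module level, not closures.
def loop_1():
    return sum(range(10))

def loop_2():
    return sum(range(20))

def loop_3():
    return sum(range(30))

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        async_results = [pool.apply_async(loop)
                         for loop in (loop_1, loop_2, loop_3)]
        # .get() blocks until each worker finishes and returns its value.
        print([r.get() for r in async_results])
```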

1 Comment

This method will produce a pickling error for functions defined inside other functions.
