
Question: Does scipy.optimize have minimizing functions that can divide their workload among multiple processes to save time? If so, where can I find the documentation?

I've looked a fair amount online, including here, for answers:

I may be misunderstanding, but none of the above posts clearly show how to inform the scipy library that it has multiple processes available to use simultaneously, while still passing the minimization functions all of the arguments they need to find the minimum.

I also don't see multiprocessing discussed in detail in the scipy docs I've read, and I haven't had any luck finding real-world examples of optimization gains that would justify optimization over a parallel brute-force effort. Here's a fictional example of what I'd like the scipy.optimize library to do (I know that the differential_evolution function doesn't have a multiprocessing argument):

import multiprocessing as mp
from scipy.optimize import differential_evolution

def objective_function(x):
    return x[0] * 2

pool = mp.Pool(processes=16)

# Perform differential evolution optimization
result = differential_evolution(objective_function, multiprocessing = pool)

3 Answers


With respect to scipy.optimize.differential_evolution: it does offer multiprocessing through multiprocessing.Pool via the optional workers parameter, according to the official documentation at https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution

Other optimization methods may offer this as well, but each function's API documentation would need to be examined. The docs also note that the objective function must be pickleable.

The official docs also have some general remarks on parallel execution with SciPy at https://docs.scipy.org/doc/scipy/tutorial/parallel_execution.html

The call would look like this for differential_evolution:

from scipy.optimize import differential_evolution

def objective_function(x):
    return x[0] * 2

my_workers = 16

# Perform differential evolution optimization; bounds is a required argument
result = differential_evolution(objective_function, bounds=[(-100, 100)],
                                workers=my_workers)
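Since the question sketches an explicit multiprocessing.Pool, it's also worth noting that the same docs page says workers accepts a map-like callable such as Pool.map, so a pool you manage yourself can be reused. A minimal sketch (my own, not from the docs verbatim) assuming the objective is defined at module level so it can be pickled, with updating='deferred' set explicitly since parallel evaluation requires it:

```python
import multiprocessing as mp
from scipy.optimize import differential_evolution

def objective_function(x):
    return x[0] * 2

if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        result = differential_evolution(
            objective_function,
            bounds=[(-100, 100)],  # bounds is a required argument
            workers=pool.map,      # any map-like callable works here
            updating="deferred",   # parallel evaluation requires deferred updating
        )
    print(result.x, result.fun)
```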


If so, where can I find the documentation?

This is documented in the SciPy User Guide in the section Optimization / Parallel Execution Support. You can tell which optimization functions support parallel objective evaluation by looking for a parameter called workers: all functions that have this parameter do at least some part of the optimization in parallel, and all functions that lack it do not.
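One way to apply that rule of thumb programmatically (my own sketch, not from the guide) is to inspect a function's signature for a workers parameter:

```python
import inspect
from scipy import optimize

def supports_workers(func):
    """Return True if `func` exposes a `workers` parameter."""
    return "workers" in inspect.signature(func).parameters

# differential_evolution has had workers since SciPy 1.2
print(supports_workers(optimize.differential_evolution))  # True
print(supports_workers(optimize.brute))  # True on recent SciPy versions
```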

I also don't see multiprocessing discussed in detail in the scipy docs that I read and I haven't had any luck finding real world examples of optimization gains to justify optimizing versus a parallel brute force effort.

It's very common in optimization to find that a fast serial algorithm that is well-suited to your problem will outperform a slow parallel algorithm.

Take your example function, f(x) = 2*x. You could optimize it with differential_evolution:

%timeit differential_evolution(objective_function, bounds=[(-100, 100)])
9.75 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

You could even parallelize this:

%timeit result = differential_evolution(objective_function, updating='immediate', bounds=[(-100, 100)], workers=4)
58.7 ms ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Unfortunately, this is slower, because it takes longer to tell another process to run the function than it takes to run the function in the same process. (This might be more worthwhile if you had a more expensive objective function.)

A better way to improve the speed, instead of parallelism, is to use a local minimizer. You can do this because this function is differentiable and has no local minima.

%timeit minimize(objective_function, x0=[0], bounds=[(-100, 100)])
1.07 ms ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
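For reference, a standalone version of that call (minimize with bounds defaults to the L-BFGS-B method, and the linear objective drives x to its lower bound):

```python
from scipy.optimize import minimize

def objective_function(x):
    return x[0] * 2

# The gradient is 2 everywhere, so the minimizer projects x onto the lower bound
result = minimize(objective_function, x0=[0], bounds=[(-100, 100)])
print(result.x, result.fun)  # x converges to -100, fun to -200
```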

However, this doesn't parallelize very well, because the points that the minimizer will query in iteration 2 depend on the value it gets from your function in iteration 1.

My advice would be that before you think about parallelism, think about whether you are using the right optimizer for the job, and think about whether you can reduce the number of variables. Both of these can significantly improve speed.



The yet-to-be-released SciPy 1.16.0 will offer parallelisation for many of the minimize methods, e.g. https://scipy.github.io/devdocs/reference/optimize.minimize-lbfgsb.html, where the workers keyword is used to parallelise numerical differentiation.
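A sketch of what that would look like, based on my reading of the linked dev docs (the option name and spelling are assumptions until 1.16.0 ships; the version guard keeps this harmless on older SciPy):

```python
import scipy
from scipy.optimize import minimize

def objective_function(x):
    # Quadratic bowl; the gradient is estimated by finite differences,
    # one perturbed evaluation per variable, which is what gets parallelised.
    return sum(xi ** 2 for xi in x)

major, minor = (int(p) for p in scipy.__version__.split(".")[:2])
if (major, minor) >= (1, 16):
    result = minimize(
        objective_function,
        x0=[1.0] * 8,
        method="L-BFGS-B",
        options={"workers": 2},  # assumed spelling, per the dev docs
    )
    print(result.x)
```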
