How can I improve CPU utilization when using the multiprocessing module?

Question

I am working in Python 3.4, performing a naive search against partitioned data in memory, and am attempting to fork processes to take advantage of all available processing power. I say naive because I am certain other things could be done to improve performance, but those are out of scope for this question.

The system I am testing on is a Windows 7 x64 environment.

What I would like to achieve is a relatively even, simultaneous distribution across cpu_count() - 1 cores (my reading suggests that using all cores rather than n-1 cores shows no additional improvement, because of baseline OS processes). On a 4-core machine, that would mean CPU usage pegged at about 75%.

What I am seeing (using the Windows Task Manager 'Performance' and 'Processes' tabs) is that I never exceed 25% CPU utilization, and that the process view shows computation occurring on one core at a time, switching every few seconds between the forked processes.

I haven't instrumented the code for timing, but I am pretty sure that my subjective observations are correct in that I am not gaining the performance increase I expected (3x on an i5 3320m).

I haven't tested on Linux.

Based on the code presented, how can I achieve 75% CPU utilization?

#pseudo code
def search_method(search_term, partition):
    <perform fuzzy search>
    return results

partitions = [<list of lists>]
search_terms = [<list of search terms>]

#real code
import multiprocessing as mp

pool = mp.Pool(processes=mp.cpu_count() - 1)

for search_term in search_terms:
    results = []
    results = [pool.apply(search_method, args=(search_term, partitions[x])) for x in range(len(partitions))]

You may want to know that some scikit-learn functions have built-in options for running on multiple local cores. The point is whether the solver's computational strategy allows independent, parallelised processing or not. The multiprocessing module has no way of knowing whether the problem can be split into several independent parallel execution streams (not to mention the data-access mechanics).
– user3666197, Commented Oct 10, 2014 at 0:48
If your problem allows, there may be a more powerful approach: use a cloud of workers (all based on Python, distributed across a multi-host, multi-CPU/multi-core infrastructure) that can perform your search_method for a search_term on a given list of lists. That way you may harness 10x, 100x, or 1000x more CPUs/cores through such a private cloud or grid engine.
– user3666197, Commented Oct 10, 2014 at 0:54

dano · Accepted Answer · 2014-10-10 00:34:56Z

4

You're not actually doing anything concurrently here, because you're using pool.apply, which blocks until the task you pass to it is complete. So, for every item in partitions, you're running search_method in some process inside the pool, waiting for it to complete, and then moving on to the next item. That matches exactly what you're seeing in Windows Task Manager. You want pool.apply_async instead:

for search_term in search_terms:
    # Submit every partition without waiting, so the workers can run in parallel.
    async_results = [pool.apply_async(search_method, args=(search_term, partition))
                     for partition in partitions]

    # Get the actual results from the AsyncResult objects returned.
    results = [r.get() for r in async_results]
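To see the difference in isolation, here is a minimal sketch that times both submission styles on the same pool. The names slow_task, run_blocking and run_async are hypothetical, and time.sleep is only an assumed stand-in for the real search work:

```python
import multiprocessing as mp
import time


def slow_task(x):
    """Hypothetical stand-in for search_method: pause briefly, then return."""
    time.sleep(0.1)
    return x * x


def run_blocking(pool, items):
    # pool.apply waits for each task to finish before the next one is submitted.
    return [pool.apply(slow_task, args=(x,)) for x in items]


def run_async(pool, items):
    # Submit everything first, then collect, so tasks can overlap across workers.
    pending = [pool.apply_async(slow_task, args=(x,)) for x in items]
    return [r.get() for r in pending]


if __name__ == "__main__":
    items = list(range(4))
    with mp.Pool(processes=4) as pool:
        for runner in (run_blocking, run_async):
            start = time.perf_counter()
            out = runner(pool, items)
            elapsed = time.perf_counter() - start
            print(runner.__name__, out, "%.2fs" % elapsed)
```

With four workers and four tasks, the blocking version should take roughly the sum of the task times, while the async version should take closer to the time of a single task, though the exact numbers depend on the machine.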

Or better yet, use pool.map (along with functools.partial to enable passing multiple arguments to our worker function):

from functools import partial
...

for search_term in search_terms:
    func = partial(search_method, search_term)
    results = pool.map(func, partitions)
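Put together as a complete script, the pool.map approach might look like the sketch below. The substring-matching search_method, the search_all helper and the sample data are all assumptions made for illustration, since the question only gives pseudocode for the search itself:

```python
import multiprocessing as mp
from functools import partial


def search_method(search_term, partition):
    """Hypothetical stand-in for the fuzzy search: case-insensitive substring match."""
    term = search_term.lower()
    return [item for item in partition if term in item.lower()]


def search_all(pool, search_terms, partitions):
    """Return {search_term: flat list of matches across all partitions}."""
    results = {}
    for search_term in search_terms:
        func = partial(search_method, search_term)
        # pool.map blocks until every partition has been searched for this term.
        per_partition = pool.map(func, partitions)
        results[search_term] = [hit for hits in per_partition for hit in hits]
    return results


if __name__ == "__main__":
    # Sample data, made up for this sketch.
    partitions = [["apple", "apricot"], ["banana", "grape"], ["pineapple"]]
    search_terms = ["ap", "an"]
    with mp.Pool(processes=max(1, mp.cpu_count() - 1)) as pool:
        print(search_all(pool, search_terms, partitions))
```

The `if __name__ == "__main__":` guard matters on Windows, where worker processes are spawned rather than forked and re-import the main module on startup. The worker function also has to be defined at module level so that it can be pickled and sent to the workers.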


1 Comment

David Matthews Over a year ago

This answers the specific question at hand. Thank you! I've been looking for an excuse to jump into cloud/grid computation and this may be the jumping off place...
