I'm using IPython parallel in an optimisation algorithm that loops a large number of times. Each loop iteration invokes parallelism through the map method of a LoadBalancedView (twice), a DirectView's dictionary interface, and a %px magic. I'm running the algorithm in an IPython notebook.

I find that the memory consumed by both the kernel running the algorithm and one of the controllers increases steadily over time, limiting the number of loops I can execute (since available memory is limited).

Using heapy, I profiled memory use after a run of about 38 thousand loops:

Partition of a set of 98385344 objects. Total size = 18016840352 bytes.  
 Index  Count     %       Size   %  Cumulative   % Kind (class / dict of class)
     0  5059553   5 9269101096  51  9269101096  51 IPython.parallel.client.client.Metadata
     1 19795077  20 2915510312  16 12184611408  68 list
     2 24030949  24 1641114880   9 13825726288  77 str
     3  5062764   5 1424092704   8 15249818992  85 dict (no owner)
     4 20238219  21  971434512   5 16221253504  90 datetime.datetime
     5   401177   0  426782056   2 16648035560  92 scipy.optimize.optimize.OptimizeResult
     6        3   0  402654816   2 17050690376  95 collections.defaultdict
     7  4359721   4  323814160   2 17374504536  96 tuple
     8  8166865   8  196004760   1 17570509296  98 numpy.float64
     9  5488027   6  131712648   1 17702221944  98 int 
<1582 more rows. Type e.g. '_.more' to view.>

You can see that about half the memory is used by IPython.parallel.client.client.Metadata instances. The 401177 OptimizeResult instances are a good indicator that results from the map invocations are being cached: that count is the same as the number of optimize invocations made via lbview.map, and I am not caching them in my code.

Is there a way I can control this memory usage on both the kernel and the IPython parallel controller (whose memory consumption is comparable to the kernel's)?

1 Answer

IPython parallel clients and controllers store results and other metadata from past transactions.

The IPython.parallel.Client class provides a method for clearing this data:

Client.purge_everything()

This method is covered in the Client API documentation. There are also purge_results() and purge_local_results() methods, which give you finer control over what gets purged.
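As a rough illustration of how this might fit into a long-running loop, the sketch below calls purge_everything() every few iterations so the cached results and metadata do not accumulate. The helper name run_with_periodic_purge, the step callable and the purge_every parameter are hypothetical and not part of IPython; the only IPython call it relies on is Client.purge_everything(). It assumes each iteration has already collected its results before the purge happens, since purging while tasks are still outstanding may fail or discard results you still need.

```python
def run_with_periodic_purge(client, step, n_loops, purge_every=100):
    """Run step(i) for i in range(n_loops), purging cached results periodically.

    client      -- an object exposing purge_everything(), such as an
                   IPython.parallel Client (assumed to have no outstanding
                   tasks when the purge is called)
    step        -- hypothetical callable holding one iteration of the algorithm
    purge_every -- hypothetical tuning knob: iterations between purges
    """
    for i in range(n_loops):
        step(i)  # one iteration: lbview.map calls, DirectView updates, etc.
        if (i + 1) % purge_every == 0:
            # Drop cached results and metadata on the client and the hub.
            client.purge_everything()


# Possible usage against a running cluster (not executed here):
#
#   from IPython.parallel import Client
#   rc = Client()
#   lbview = rc.load_balanced_view()
#   run_with_periodic_purge(rc, my_iteration, n_loops=38000, purge_every=500)
#
# where my_iteration is your own function that performs one loop of the
# optimisation and fully consumes the map results before returning.
```

A smaller purge_every keeps peak memory lower at the cost of more round trips to the controller; if you need to keep some results around, purge_results() or purge_local_results() may be a better fit than purging everything.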
