
Is there a faster way to retrieve only a single element from a large published array in Dask, without retrieving the entire array?

In the example below, client.get_dataset('array1')[0] takes roughly the same time as client.get_dataset('array1').

import distributed
client = distributed.Client()
data = [1] * 10000000
payload = {'array1': data}
client.publish_dataset(**payload)  # publish the list under the name 'array1'

one_element = client.get_dataset('array1')[0]  # pulls the entire list before indexing

1 Answer


Note that anything you publish goes to the scheduler, not to the workers, so this is somewhat inefficient. Publish was intended to be used with Dask collections like dask.array.

Client 1

import dask.array as da
x = da.ones(10000000, chunks=(100000,))  # 1e7 size array cut into 1e5 size chunks
x = x.persist()  # persist array on the workers of the cluster

client.publish_dataset(x=x)  # store the metadata of x on the scheduler

Client 2

x = client.get_dataset('x')  # get the lazy collection x
x[0].compute()  # this selection happens on the worker, only the result comes down
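
If the data really must stay a plain Python list rather than a dask array, a similar effect can be had by scattering it to the workers and indexing it remotely with client.submit, so only the selected element travels back. A minimal sketch, assuming a running cluster as above (the lambda getter is illustrative, not part of the original answer):

import distributed

client = distributed.Client()

data = [1] * 10000000
data_future = client.scatter(data)  # ship the list to a worker once

# index remotely; only the single element comes back to the client
one_element = client.submit(lambda lst, i: lst[i], data_future, 0).result()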

2 Comments

I would like to mark this question as complete, but when I run this code snippet using the suggested approach, it hangs at a[0].compute(). For the actual code snippet I use, please see the link below: stackoverflow.com/questions/45468673/…
I was able to get the code snippet you posted working. In order to get it working, I had to make sure that workers were also created for the scheduler in my version. For anyone else, see my link above for more info.
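
For anyone testing locally, one way to guarantee that workers exist is to start them explicitly with a LocalCluster; a minimal sketch (the worker counts are illustrative, not taken from the comments above):

import distributed

cluster = distributed.LocalCluster(n_workers=2, threads_per_worker=1)
client = distributed.Client(cluster)  # x[0].compute() needs at least one worker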

