2

I would like to use dask to do the following operation; let say I have a numpy array:
In: x = np.arange(5)
Out: [0,1,2,3,4]
Then I want a function to map np.arange to all the elements of my array. I have already defined a function for that purpose:

def list_range(array, no_cell):
    return np.add.outer(array, np.arange(no_cell)).T

# e.g
In: list_range(x,3)
Out: array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6]])

Now I want to reproduce this in parallel using map_blocks on a dask array but I always get an error. Here is my attempt based on the dask documentation of map_blocks:

constant = 4
d = da.arange(5, chunks=(2,))
f = da.core.map_blocks(list_range, d, constant, chunks=(2,))
f.compute()

I get ValueError: could not broadcast input array from shape (4,2) into shape (4)

1 Answer 1

1

Have you checked out Dask's ufunc methods? For your problem, you can try,

da.add.outer(d, np.arange(constant)).T.compute()

While using map_blocks, you have to make sure that you specify the new dimensions when your operation results in a change in chunk dimensions. In your problem, the chunk dimension is no more (2,), and instead is (2,4). This new dimension should be specified using the new_axis parameter. Also, I found that map_blocks is not vstacking the blocks after map_blocks, and I couldn't get the transpose to work within the mapped function. Try this to make map_blocks work,

def list_range(array, no_cell):
    return np.add.outer(array, np.arange(no_cell))

constant = 4
d = da.arange(5, chunks=(2,))


f=da.core.map_blocks(list_range, d, constant, chunks=(2,constant), new_axis=[1])
f.T.compute()
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.