Dynamic advanced indexing of numpy array

Question

I want to implement a function that can compute basic math operations on large arrays (ones that won't fit entirely in RAM). Therefore I want to create a function that processes a given operation block by block along a selected axis. The main idea of the function is this:

def process_operation(inputs, output, operation):
    shape = inputs[0].shape
    # Process one block at a time along the first axis (hard-coded here)
    for index in range(shape[0]):
        output[index, :] = inputs[0][index, :] + inputs[1][index, :]

but I want to be able to change the axis along which the blocks are sliced/indexed.

Is it possible to do the indexing in some dynamic way, without using the ':' syntactic sugar?

I found some help here, but so far it hasn't been much help:

thanks

Have you had a look at numpy.memmap for on-disk/out-of-RAM arrays?
– Nils Werner, Commented Sep 10, 2019 at 11:54
Yes, but after some googling I found h5py more useful. However, when I perform basic math operations such as np.add() on h5py datasets, it loads the whole datasets into memory. When I try to pass another dataset as the output parameter, I get a TypeError saying that I'm not passing an array-like object. I have a bunch of arrays and need to apply some basic math operations and some convolutions to them, passing results between the compute blocks that perform the operations.
– Miroslav Karpíšek, Commented Sep 10, 2019 at 12:02
@NilsWerner for now I'll stick with your suggestion to use numpy.memmap. Thanks
– Miroslav Karpíšek, Commented Sep 10, 2019 at 12:16

myrtlecat · Accepted Answer · 2019-09-11 16:09:15Z

I think you could achieve what you want using Python's built-in slice type.

Under the hood, :-expressions used inside square brackets are transformed into instances of slice, but you can also use a slice to begin with. To iterate over different axes of your input you can use a tuple of slices of the correct length.

This might look something like:

def process_operation(inputs, output, axis=0):
    shape = inputs[0].shape
    for index in range(shape[axis]):
        my_slice = (slice(None),) * axis + (index,)
        output[my_slice] = inputs[0][my_slice] + inputs[1][my_slice]

I believe this should work with h5py datasets or memory-mapped arrays without any modifications.
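
The function above handles one index at a time and always adds. As a rough sketch of how the same slice-tuple idea might extend to multi-element blocks and a pluggable operation, something like the following could work. The name process_in_blocks and the operation and block_size parameters are my own assumptions, not part of the original answer. It assumes a non-negative axis and has only been checked against in-memory NumPy arrays:

```python
import numpy as np

def process_in_blocks(inputs, output, operation=np.add, axis=0, block_size=2):
    """Apply `operation` to two inputs, block by block along `axis`.

    Hypothetical helper for illustration; assumes axis >= 0.
    """
    length = inputs[0].shape[axis]
    for start in range(0, length, block_size):
        stop = min(start + block_size, length)
        # Full slices for the axes before `axis`, then a range on `axis`.
        # Trailing axes are left out, which selects them completely.
        key = (slice(None),) * axis + (slice(start, stop),)
        output[key] = operation(inputs[0][key], inputs[1][key])

a = np.arange(12.0).reshape(3, 4)
b = np.ones((3, 4))
out = np.empty_like(a)

# Walk along axis 1 in blocks of up to 3 columns.
process_in_blocks([a, b], out, axis=1, block_size=3)
assert np.array_equal(out, a + b)
```

Only the blocks selected by key are read and written in each iteration, which is the point when the inputs are backed by disk rather than RAM.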

Background on `slice` and `__getitem__`

slice works in conjunction with the __getitem__ method to evaluate the x[key] syntax. x[key] is evaluated in two steps:

1. If key contains any expressions such as :, i:j or i:j:k, then these are de-sugared into slice instances.
2. key is passed to the __getitem__ method of the object x. This method is responsible for returning the correct value of x[key].

For example, the expressions:

x[2]
y[:, ::2]

are equivalent to:

x.__getitem__(2)
y.__getitem__((slice(None), slice(None, None, 2)))
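
As a quick check of this equivalence on a real array, here is a small sketch. The array y is just example data, and np.s_ is NumPy's helper for building index keys from the usual colon syntax:

```python
import numpy as np

y = np.arange(12).reshape(3, 4)

# The sugared form and the explicit tuple of slices select the same data.
sugared = y[:, ::2]
explicit = y[(slice(None), slice(None, None, 2))]
assert np.array_equal(sugared, explicit)

# np.s_ turns colon syntax into the corresponding key object.
key = np.s_[:, ::2]
assert key == (slice(None), slice(None, None, 2))
assert np.array_equal(y[key], sugared)
```

Because the key is an ordinary tuple, it can be built at run time, stored in a variable, or passed to a function.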

You can explore how values are converted to slices using a class like the following:

class Sliceable:
    def __getitem__(self, key):
        print(key)

x = Sliceable()
x[::2] # prints "slice(None, None, 2)"
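
A variation on the same idea is to return the key instead of printing it, which makes multi-axis keys easier to inspect and compare. The class name KeyEcho is made up for this sketch:

```python
class KeyEcho:
    """Hypothetical helper: hands back whatever key it is indexed with."""
    def __getitem__(self, key):
        return key

k = KeyEcho()

# A single slice expression arrives as one slice object.
assert k[::2] == slice(None, None, 2)

# Comma-separated entries arrive as a tuple, with each colon expression
# turned into a slice and plain values passed through unchanged.
assert k[:, 3] == (slice(None), 3)
assert k[1:5, ..., ::-1] == (slice(1, 5), Ellipsis, slice(None, None, -1))
```

The second assertion has the same shape as the my_slice tuple built in process_operation for axis=1.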
