I want to implement a function that can compute basic math operations on large arrays that won't fit entirely in RAM. To do that, I wanted to write a function that processes a given operation block by block along a selected axis. The basic idea is this:

def process_operation(inputs, output, operation):
    shape = inputs[0].shape
    for index in range(shape[0]):  # axis 0 is hard-coded here
        output[index, :] = inputs[0][index, :] + inputs[1][index, :]

But I want to be able to change the axis along which the blocks are sliced/indexed.

Is it possible to do the indexing in some dynamic way, without using the ':' syntactic sugar?

I found some help here, but so far it hasn't been much use:

Thanks

  • Have you had a look at numpy.memmap for on-disk/out-of-RAM arrays? Commented Sep 10, 2019 at 11:54
  • Yes, but after some googling I found h5py more useful (link). However, when I perform basic math operations like np.add() on its datasets, it loads the whole datasets into memory. When I try to pass another dataset as the output parameter, I get a TypeError saying that I'm not passing an array-like object. I have a bunch of arrays and need to run some basic math operations and some convolutions on them, passing the results between the compute blocks that perform the operations. Commented Sep 10, 2019 at 12:02
  • @NilsWerner for now I'll stick with your suggestion to use numpy.memmap. Thanks. Commented Sep 10, 2019 at 12:16

1 Answer

I think you could achieve what you want using Python's built-in slice type.

Under the hood, ':' expressions used inside square brackets are converted into instances of slice, but you can also construct a slice directly. To iterate over different axes of your input, you can use a tuple of slices of the correct length.

This might look something like:

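def process_operation(inputs, output, axis=0):
    shape = inputs[0].shape
    for index in range(shape[axis]):
        # ':' on every axis before `axis`, then a single index on `axis`;
        # trailing axes are taken in full implicitly
        my_slice = (slice(None),) * axis + (index,)
        output[my_slice] = inputs[0][my_slice] + inputs[1][my_slice]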

I believe this should work with h5py datasets or memory-mapped arrays without any modifications.
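The loop above handles one index at a time and hard-codes the addition. As a rough sketch of how the same idea could be extended to multi-element blocks and an arbitrary operation, the following uses two hypothetical helpers, axis_index and process_blocks, plus an assumed block_size parameter; none of these names come from the question. It is demonstrated on in-memory NumPy arrays only and has not been tested against h5py datasets or memory-mapped files.

```python
import numpy as np

def axis_index(ndim, axis, item):
    """Build an index tuple that applies `item` on `axis` and ':' elsewhere."""
    key = [slice(None)] * ndim
    key[axis] = item  # list indexing also accepts negative axes
    return tuple(key)

def process_blocks(inputs, output, operation, axis=0, block_size=2):
    """Apply `operation` block by block along `axis`, writing into `output`."""
    shape = inputs[0].shape
    for start in range(0, shape[axis], block_size):
        stop = min(start + block_size, shape[axis])
        key = axis_index(len(shape), axis, slice(start, stop))
        # Only the current block of each input is read here
        output[key] = operation(*(arr[key] for arr in inputs))

a = np.arange(12).reshape(3, 4)
b = np.ones((3, 4), dtype=a.dtype)
out = np.empty_like(a)
process_blocks([a, b], out, np.add, axis=1, block_size=3)
```

Building the key as a full-length list rather than as (slice(None),) * axis means a negative axis such as -1 works too, which the shorter tuple expression does not support.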

Background on slice and __getitem__

slice works in conjunction with the __getitem__ method to evaluate the x[key] syntax. x[key] is evaluated in two steps:

  1. If key contains any expressions such as :, i:j or i:j:k then these are de-sugared into slice instances.
  2. key is passed to the __getitem__ method of the object x. This method is responsible for returning the correct value of x[key].

For example, the expressions:

x[2]
y[:, ::2]

are equivalent to:

x.__getitem__(2)
y.__getitem__((slice(None), slice(None, None, 2)))
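This equivalence can be checked on a small NumPy array. The snippet below is only an illustration with made-up variable names; it also shows that assignment accepts the same kind of key, since y[key] = value goes through __setitem__ in the same way.

```python
import numpy as np

y = np.arange(12).reshape(3, 4)
key = (slice(None), slice(None, None, 2))  # same as [:, ::2]

# Reading: sugar, explicit tuple, and direct __getitem__ call all agree
same = np.array_equal(y[:, ::2], y[key])
explicit = np.array_equal(y[:, ::2], y.__getitem__(key))

# Writing: the same key selects the target region of z
z = np.zeros_like(y)
z[key] = y[key]
```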

You can explore how values are converted to slices using a class like the following:

class Sliceable:
    def __getitem__(self, key):
        print(key)

x = Sliceable()
x[::2] # prints "slice(None, None, 2)"
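A variation that returns the key instead of printing it makes it easier to inspect multi-axis keys programmatically. KeyEcho below is a hypothetical helper written for this illustration; NumPy provides a comparable ready-made object, np.s_, for building index keys with the usual bracket syntax.

```python
import numpy as np

class KeyEcho:
    """Hypothetical helper: returns the key it receives instead of printing it."""
    def __getitem__(self, key):
        return key

echo = KeyEcho()
single = echo[::2]            # a single slice object
multi = echo[:, 1:5, 3]       # a tuple mixing slices and an integer
ellipsis_key = echo[..., 0]   # Ellipsis is passed through unchanged

# np.s_ builds the same kind of key without a custom class
numpy_key = np.s_[:, 1:5, 3]
```

Keys built this way can be stored in a variable and reused for both reading and writing, which is what makes indexing along a runtime-selected axis possible.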