
Some numpy functions (logically) return scalars:

>>> my_arr = np.ndarray(shape=(1,))
>>> type(np.max(my_arr))
<type 'numpy.float64'>

but only when called with an ndarray, rather than a subclass:

>>> class CustomArray(np.ndarray):
...     pass
>>> my_arr = CustomArray(shape=(1,))
>>> type(np.max(my_arr))
<class '__main__.CustomArray'>

Why is this? I'd expect either both to return a scalar (of type <type 'numpy.float64'>), or the former to return an np.ndarray instance and the latter a CustomArray instance. But instead, I get a combination of these two behaviours. Can I change this behaviour by changing my own class?

I don't see anything that would explain this on the doc page discussing subclassing ndarray (http://docs.scipy.org/doc/numpy-1.9.2/user/basics.subclassing.html).

(Running Python 2.7.10, numpy 1.9.2, in case it matters.)

2 Answers

1

This is because max() is not overridden in CustomArray. If you try it, my_array.max() returns a CustomArray object instead of a scalar.

my_array = CustomArray(shape=(1,))
print my_array.max()
>> CustomArray(9.223372036854776e+18)

np.max goes through np.amax, which ends up calling np.maximum.reduce. This is the standard reduce of map-reduce, and its result is handed back as whatever type the array's own max() produces. Hence, the type returned by np.max is in fact the type returned by the max() method called on your object. You can override it as:

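class CustomArray(np.ndarray):
    def max(self, axis=None, out=None):
        # View self as a plain ndarray (no copy) so the reduction returns a scalar.
        return self.view(np.ndarray).max(axis, out)

my_arr = CustomArray(shape=(1,))
type(np.max(my_arr))
>> numpy.float64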

The trick is to upcast self to a plain np.ndarray and compute the max on that.
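To make the delegation visible, here is a minimal sketch. LoggingArray is a hypothetical subclass invented for this illustration, and the behaviour shown is what I would expect from the NumPy versions I am familiar with: np.max hands anything that is not exactly an ndarray over to the object's own max() method, while a plain ndarray view of the same data reduces to a scalar.

```python
import numpy as np

class LoggingArray(np.ndarray):
    """Hypothetical subclass that records when its own max() is used."""
    calls = []

    def max(self, *args, **kwargs):
        # Record the call, then fall back to the inherited reduction.
        LoggingArray.calls.append("max")
        return super().max(*args, **kwargs)

arr = np.arange(3.0).view(LoggingArray)

result = np.max(arr)                      # routed through LoggingArray.max
delegated = LoggingArray.calls == ["max"]

# The same data viewed as a plain ndarray reduces to a NumPy scalar.
plain_type = type(np.max(arr.view(np.ndarray)))

print(delegated, plain_type)
```

This is why overriding max() changes what np.max returns, and also why every other reduction (sum, min, mean, ...) would need the same treatment with this approach.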


6 Comments

I'm guessing then that I have to change pretty much every method that ndarray instances get? (I was just using max as an example.)
Also, this doesn't answer the question of why the instance of a subclass behaves differently from an instance of ndarray, if the subclass doesn't actually alter any behaviour.
Yes, you probably have to update every relevant method, if you want the right return type. The class body does not update anything, but the default overrides of max() (and additional methods like sum()) are different from base np.ndarray.
Why are they different though? Is there an explicit check in ndarray.max to check that self is actually an instance of ndarray and not a subclass or something?
Look at the code for np.matrix or masked to see how they handle this.
0

I also ran into this and solved it for all aggregation/reduction operations in NumPy by implementing a custom __array_wrap__:

import numpy as np

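class CustomArray(np.ndarray):
    def __array_wrap__(self, obj, *args, **kwargs):
        # NumPy may pass extra positional arguments (e.g. context), so accept and forward them.
        if obj.shape == ():
            # 0-d result of a reduction: unwrap it into a NumPy scalar.
            return obj[()]
        else:
            return super().__array_wrap__(obj, *args, **kwargs)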

Example return types for various operations:

>>> a = CustomArray(shape=(3,))
>>> type(np.max(a))
<class 'numpy.float64'>
>>> type(np.median(a))
<class 'numpy.float64'>
>>> type(np.exp(a))
<class '__main__.CustomArray'>
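As a further check, here is a small sketch with initialised data rather than an uninitialised buffer. It assumes the same __array_wrap__ idea, written to accept any extra positional arguments NumPy passes. It suggests that only 0-d results are unwrapped, while reductions along an axis keep the subclass:

```python
import numpy as np

class CustomArray(np.ndarray):
    def __array_wrap__(self, obj, *args, **kwargs):
        if obj.shape == ():
            # 0-d result: return the NumPy scalar it contains.
            return obj[()]
        # Anything with at least one dimension stays a CustomArray.
        return super().__array_wrap__(obj, *args, **kwargs)

a = np.arange(6.0).reshape(2, 3).view(CustomArray)

full = np.sum(a)             # reduces over everything -> scalar
per_col = np.sum(a, axis=0)  # reduces over one axis -> 1-d CustomArray

print(type(full), full)
print(type(per_col), per_col.tolist())
```

The shape check is what keeps partial reductions and elementwise functions such as np.exp returning CustomArray.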
