I am trying to delete rows from arrays that are stored inside a NumPy object array. However, as shown below, NumPy complains that it cannot broadcast the smaller array into the larger one. The same deletion works fine when applied directly to a plain array. What is the issue here? Is there a clean way around this error, other than creating a new object array and copying the elements over one by one?

In [1]: import numpy as np

In [2]: x = np.zeros((3, 2))

In [3]: x = np.delete(x, 1, axis=0)

In [4]: x
Out[4]: 
array([[ 0.,  0.],
       [ 0.,  0.]])

In [5]: x = np.array([np.zeros((3, 2)), np.zeros((3, 2))], dtype=object)

In [6]: x[0] = np.delete(x[0], 1, axis=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-1687d284d03c> in <module>()
----> 1 x[0] = np.delete(x[0], 1, axis=0)

ValueError: could not broadcast input array from shape (2,2) into shape (3,2)

Edit: Apparently it works when the arrays have different shapes. This is quite annoying. Is there any way to disable the automatic stacking done by np.array?

In [12]: x = np.array([np.zeros((3, 2)), np.zeros((5, 8))], dtype=object)

In [13]: x[0] = np.delete(x[0], 1, axis=0)

In [14]: x = np.array([np.zeros((3, 2)), np.zeros((3, 2))], dtype=object)

In [15]: x.shape
Out[15]: (2, 3, 2)

In [16]: x = np.array([np.zeros((3, 2)), np.zeros((5, 8))], dtype=object)

In [17]: x.shape
Out[17]: (2,)

This behaviour is quite inconsistent.

4 Comments
  • You have created a 3D array there. Are you aware of that? As for the error itself, the slot you are assigning into has a different shape than the array being assigned. Commented Jun 7, 2017 at 13:58
  • I would not call it a 3D array, since each element of x can have a different size. I have changed the example to point that out. I'd rather consider it an array of arrays (kind of like a list of arrays). I understand the error. I just don't see how I can delete rows in those arrays. Commented Jun 7, 2017 at 14:03
  • Oh wait, you are right. It works when the arrays have different shapes. Dammit. So numpy checks whether they all have the same shape and, if so, converts them to a 3D array. Commented Jun 7, 2017 at 14:04
  • So x = np.array([np.zeros((3, 2)), np.zeros((8, 5))], dtype=object) works but not if both are (3, 2). That's annoying. Commented Jun 7, 2017 at 14:05

2 Answers

The fact that np.array creates as high-dimensional an array as it can has been discussed many times on SO. If the elements differ in shape, it keeps them separate or, in some cases, raises an error.

In your example, the two (3, 2) arrays are stacked into a single object array of shape (2, 3, 2):

In [201]: x = np.array([np.zeros((3, 2)), np.zeros((3, 2))], dtype=object)
In [202]: x
Out[202]: 
array([[[0.0, 0.0],
        [0.0, 0.0],
        [0.0, 0.0]],

       [[0.0, 0.0],
        [0.0, 0.0],
        [0.0, 0.0]]], dtype=object)
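
As a quick check of why the assignment in the question fails, the sketch below inspects the shapes involved. It assumes nothing beyond the calls already used in the question; the `failed` flag is only there for illustration.

```python
import numpy as np

# Two equally shaped inputs are stacked into one 3-D object array.
x = np.array([np.zeros((3, 2)), np.zeros((3, 2))], dtype=object)
print(x.shape)     # (2, 3, 2)
print(x[0].shape)  # (3, 2): x[0] is a fixed-size slice, not a separate array

try:
    # The right-hand side has shape (2, 2) and cannot fill a (3, 2) slice.
    x[0] = np.delete(x[0], 1, axis=0)
    failed = False
except ValueError as exc:
    failed = True
    print(exc)
```

So the error comes from assigning into a slice of a 3-D array rather than replacing an element of a 1-D container.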

The safe way to make an object array of a determined size is to initialize it and then fill it:

In [203]: x=np.empty(2, dtype=object)
In [204]: x
Out[204]: array([None, None], dtype=object)
In [205]: x[...] = [np.zeros((3, 2)), np.zeros((3, 2))]
In [206]: x
Out[206]: 
array([array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]]),
       array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])], dtype=object)
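
With the array built this way, the deletion from the question should go through. The following is a minimal sketch rather than a definitive recipe; `make_object_array` is a hypothetical helper name, and it fills the slots one at a time instead of using `x[...] = ...`.

```python
import numpy as np

def make_object_array(arrays):
    """Hypothetical helper: 1-D object array with one slot per input array."""
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a  # integer indexing stores the array itself in slot i
    return out

x = make_object_array([np.zeros((3, 2)), np.zeros((3, 2))])

# Each slot holds an independent array, so it can be replaced by one of any shape.
x[0] = np.delete(x[0], 1, axis=0)

print(x.shape)               # (2,)
print([a.shape for a in x])  # [(2, 2), (3, 2)]
```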

A 1-D object array like this is, for most practical purposes, a list. Operations on the elements are performed with Python-level iteration, implicit or explicit (as with your list comprehension). Most of the computational power of a multidimensional numeric array is gone.

In [207]: x.shape         
Out[207]: (2,)
In [208]: [xx.shape for xx in x]   # shape of the elements
Out[208]: [(3, 2), (3, 2)]
In [209]: [xx[:2,:] for xx in x]    # slice the elements
Out[209]: 
[array([[ 0.,  0.],
        [ 0.,  0.]]), array([[ 0.,  0.],
        [ 0.,  0.]])]

You can reshape such an array, but you can't append to it as you would to a list. Some math operations cross the 'object' boundary, but it is hit-and-miss. In sum, don't use object arrays when a list would work just as well.
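
To make these trade-offs concrete, here is a small sketch of what a 1-D object array does offer over a list: array-style indexing and reshaping. The variable names are made up for illustration, and multiplication is just one operation that happens to be forwarded to the elements; others may not be.

```python
import numpy as np

# Four (3, 2) arrays, each filled with its own index so they are easy to tell apart.
arrays = [np.full((3, 2), i, dtype=float) for i in range(4)]
x = np.empty(len(arrays), dtype=object)
for i, a in enumerate(arrays):
    x[i] = a

picked = x[[3, 0, 2]]   # fancy indexing returns a 1-D object array of 3 elements
grid = x.reshape(2, 2)  # reshaping rearranges the slots, not the inner arrays
doubled = x * 2         # multiplication is applied to each element in turn

print(picked.shape)      # (3,)
print(grid.shape)        # (2, 2)
print(doubled[1][0, 0])  # 2.0
```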

See also: Understanding non-homogeneous numpy arrays

2 Comments

My problem with lists is that I can't index them with an array. If, for example, I want to get the arrays at indices [3, 16, 7], I need to loop over the list or use itemgetter.
Yes, object arrays do have all the array indexing options, whereas list indexing is 'cruder'.
0

This is very hacky and ugly in my opinion, but it's the only solution I could think of: use a list comprehension to convert the object array to a list (.tolist() does not work because it also breaks the subarrays into lists), modify the element, and convert the list back to an object array.

In [37]: x = np.array([np.zeros((3, 2)), np.zeros((3, 2))], dtype=object)

In [38]: xx = [z for z in x]

In [39]: xx[0] = np.delete(xx[0], 1, axis=0)

In [40]: x = np.array(xx, dtype=object)

In [41]: x
Out[41]: 
array([array([[0.0, 0.0],
       [0.0, 0.0]], dtype=object),
       array([[0.0, 0.0],
       [0.0, 0.0],
       [0.0, 0.0]], dtype=object)], dtype=object)

I think I'll post an issue on the NumPy GitHub asking for consistent behavior of object arrays.
