How to fix array boolean error using filter function

Question

I am trying to solve the Boolean error using filter.

I used a filter array to solve the Boolean problem of iterating arrays. It worked for a simple list, but it shows an error again when I use it to take only the numbers greater than zero from an array. The array is populated by drawing samples from a standard normal distribution.

   arr2 = np.array(list(filter(lambda x:x>0,rand_num)))
   arr2

<ipython-input-80-af65f7c09d82> in <module>
      1 rand_num = np.random.randn(5,5)
----> 2 arr2 = np.array(list(filter(lambda x:x>0,rand_num)))
      3 arr2
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Please provide an array sample and show what the desired output would look like. – CristiFati, Commented Sep 1, 2019 at 19:04
What is a filter array? What is the Boolean problem of iterating arrays? Even more important, what is rand_num? – Stop harming Monica, Commented Sep 1, 2019 at 19:13

willeM_ Van Onsem · Accepted Answer · 2019-09-01 19:05:05Z


Likely rand_num is a multidimensional array. In that case each element (x in the lambda) is an array as well, so x > 0 is an array of bools, and you cannot say that an array of booleans is True or False. Imagine, for example, an array that contains two Trues and three Falses. Would you consider that True or not?
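A minimal sketch of that ambiguity, assuming NumPy is imported as np. The array row is a made-up stand-in for one row of rand_num, not data from the question:

```python
import numpy as np

# Hypothetical stand-in for a single row of rand_num.
row = np.array([0.5, -1.2, 0.3, -0.7, 2.0])
mask = row > 0  # element-wise comparison gives an array of bools

# Asking a multi-element array for one truth value raises ValueError.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# any() and all() say explicitly how to reduce the mask to a single bool.
has_positive = bool(mask.any())
all_positive = bool(mask.all())
print(mask, raised, has_positive, all_positive)
```

Here any() answers "is at least one value positive?" and all() answers "is every value positive?", which are the two readings the error message points to.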

Using filter(..) is likely not necessary here. You can easily filter your array by subscripting it with an array of booleans:

arr2 = rand_num[rand_num > 0]

For example:

>>> rand_num[rand_num > 0]
array([1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1])
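As a self-contained sketch of the same subscripting, here is a small hand-made integer array (sample is a hypothetical stand-in, since the values of rand_num are not known):

```python
import numpy as np

# Hypothetical data; any numeric array works the same way.
sample = np.array([[-1, 1, 2],
                   [0, -3, 4]])

mask = sample > 0      # boolean array with the same shape as sample
arr2 = sample[mask]    # 1-D array of the values where mask is True
print(arr2)            # [1 2 4]
```

The values come out in row-major order, so the result reads left to right, top to bottom.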

Alternatively, we can construct a masked array if we want to retain the shape:

arr2 = np.ma.masked_array(rand_num, mask=rand_num <= 0)

This will yield:

>>> np.ma.masked_array(rand_num, mask=rand_num <= 0)
masked_array(
  data=[[--, 1, 1, --, --],
        [--, --, 1, --, --],
        [--, 1, 2, --, --],
        [--, --, --, 1, --],
        [1, 1, 1, 1, 1]],
  mask=[[ True, False, False,  True,  True],
        [ True,  True, False,  True,  True],
        [ True, False, False,  True,  True],
        [ True,  True,  True, False,  True],
        [False, False, False, False, False]],
  fill_value=999999)
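A short sketch of what can be done with such a masked array afterwards, again on a hypothetical array called sample. compressed(), filled() and sum() are standard numpy.ma methods; which of them is useful depends on what the result is needed for:

```python
import numpy as np

# Hypothetical data standing in for rand_num.
sample = np.array([[-1, 1, 2],
                   [0, -3, 4]])
masked = np.ma.masked_array(sample, mask=sample <= 0)

positives = masked.compressed()  # 1-D array of unmasked values: [1, 2, 4]
zero_filled = masked.filled(0)   # original shape, masked slots replaced by 0
row_sums = masked.sum(axis=1)    # reductions skip the masked entries
print(positives, zero_filled, row_sums, sep="\n")
```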

edited Sep 1, 2019 at 19:05

answered Sep 1, 2019 at 18:56


2 Comments

CristiFati Over a year ago

rand_num = np.random.randn(5,5). This will flatten it.

willeM_ Van Onsem Over a year ago

@CristiFati: correct, but even a two-level filter will need to flatten it: how would you, for example, construct a new array with one row having two elements and another having four elements?

hpaulj · Accepted Answer · 2019-09-02 02:50:56Z

You have created a 2d array of floats:

In [60]: rand_num = np.random.randn(5,5)                                                                     
In [61]: rand_num                                                                                            
Out[61]: 
array([[ 1.89811694,  0.44414858, -2.52994217, -0.17974148, -0.91167712],
       [ 0.06534556,  0.04677172, -0.81580021,  0.08053772, -0.55459303],
       [ 0.41316473, -0.35859064,  1.28860476, -0.22666389,  0.97562048],
       [ 0.29465373,  0.71143579, -0.55552921,  0.37660919,  0.31482962],
       [ 0.2768353 , -1.32999438,  0.0594767 ,  1.50255302,  0.08658897]])

We can select the ones that are >0 with a boolean mask:

In [62]: rand_num>0                                                                                          
Out[62]: 
array([[ True,  True, False, False, False],
       [ True,  True, False,  True, False],
       [ True, False,  True, False,  True],
       [ True,  True, False,  True,  True],
       [ True, False,  True,  True,  True]])
In [63]: rand_num[rand_num>0]                                                                                
Out[63]: 
array([1.89811694, 0.44414858, 0.06534556, 0.04677172, 0.08053772,
       0.41316473, 1.28860476, 0.97562048, 0.29465373, 0.71143579,
       0.37660919, 0.31482962, 0.2768353 , 0.0594767 , 1.50255302,
       0.08658897])

Boolean indexing of an array produces a 1d array, because each row can vary in the number of True values.
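To make the "varying number of True values" point concrete, a small sketch on a hypothetical 2x3 array (sample is made up for illustration):

```python
import numpy as np

# Hypothetical data: row 0 has two positives, row 1 has one.
sample = np.array([[1.5, -0.2, 0.7],
                   [-1.0, -0.3, 2.1]])
mask = sample > 0

per_row_counts = mask.sum(axis=1)  # number of True values in each row
flat = sample[mask]                # 1-D result of boolean indexing
print(per_row_counts, flat.shape)
```

With two values kept from the first row and one from the second, there is no rectangular 2d shape that could hold the result, so numpy returns it flat.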

filter, like map, iterates on the first dimension of the array:

In [64]: list(map(lambda x:x>0, rand_num))                                                                   
Out[64]: 
[array([ True,  True, False, False, False]),
 array([ True,  True, False,  True, False]),
 array([ True, False,  True, False,  True]),
 array([ True,  True, False,  True,  True]),
 array([ True, False,  True,  True,  True])]

The same thing in list comprehension form:

In [65]: [x>0 for x in rand_num]                                                                             
Out[65]: 
[array([ True,  True, False, False, False]),
 array([ True,  True, False,  True, False]),
 array([ True, False,  True, False,  True]),
 array([ True,  True, False,  True,  True]),
 array([ True, False,  True,  True,  True])]

Notice how each element of the iteration is a numpy array of shape (5,). That's what the filter is choking on. It expects a simple True/False boolean, not an array. Python's if and or have the same problem. (Actually I think it's numpy that's refusing to pass the multi-item array to the Python function that expects the scalar, and instead raises this ambiguity error.)
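A sketch of the difference, on a hypothetical array called sample: iterating the 2d array hands filter whole rows, while iterating its flat iterator hands it one scalar at a time, so the test yields a single boolean. This is only an illustration of the mechanism; boolean indexing remains the simpler route.

```python
import numpy as np

# Hypothetical data standing in for rand_num.
sample = np.array([[1.5, -0.2, 0.7],
                   [-1.0, -0.3, 2.1]])

# Iterating the 2-D array yields rows, so the lambda returns an array
# and the truth test inside filter raises ValueError.
try:
    list(filter(lambda x: x > 0, sample))
    failed = False
except ValueError:
    failed = True

# sample.flat yields scalars, so each test is a single True/False.
positives = np.array(list(filter(lambda x: x > 0, sample.flat)))
print(failed, positives)
```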

You could apply the filter to each row of rand_num:

In [66]: [list(filter(lambda x: x>0, row)) for row in rand_num]                                              
Out[66]: 
[[1.898116938827415, 0.4441485849428062],
 [0.06534556093009064, 0.04677172433407727, 0.08053772013844711],
 [0.41316473050686314, 1.2886047644946972, 0.9756204798856322],
 [0.2946537313273924,
  0.711435791237748,
  0.3766091899348284,
  0.31482961532956577],
 [0.27683530300005493,
  0.05947670354791034,
  1.502553021817318,
  0.0865889738396504]]

These are the same numbers as in Out[63], but split up by row - with a different number of items in each.
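If the per-row grouping is what you want, a boolean mask can be applied row by row instead of filter. A sketch on a hypothetical array called sample, with the filter version alongside for comparison:

```python
import numpy as np

# Hypothetical data standing in for rand_num.
sample = np.array([[1.5, -0.2, 0.7],
                   [-1.0, -0.3, 2.1]])

# One 1-D array per row; the rows may differ in length.
per_row = [row[row > 0] for row in sample]

# Equivalent selection with filter, giving lists of scalars.
via_filter = [list(filter(lambda x: x > 0, row)) for row in sample]
print(per_row, via_filter)
```

The result is a plain Python list of arrays (or lists) rather than a single 2d array, since the rows are ragged.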

The same thing in @Willem Van Onsem's masked array format:

In [69]: np.ma.masked_array(rand_num, mask=rand_num <= 0)                                                    
Out[69]: 
masked_array(
  data=[[1.898116938827415, 0.4441485849428062, --, --, --],
        [0.06534556093009064, 0.04677172433407727, --,
         0.08053772013844711, --],
        [0.41316473050686314, --, 1.2886047644946972, --,
         0.9756204798856322],
        [0.2946537313273924, 0.711435791237748, --, 0.3766091899348284,
         0.31482961532956577],
        [0.27683530300005493, --, 0.05947670354791034, 1.502553021817318,
         0.0865889738396504]],
  mask=[[False, False,  True,  True,  True],
        [False, False,  True, False,  True],
        [False,  True, False,  True, False],
        [False, False,  True, False, False],
        [False,  True, False, False, False]],
  fill_value=1e+20)
