How do I select elements of an array given a condition?

Question

Suppose I have two NumPy arrays, x = [5, 2, 3, 1, 4, 5] and y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements in y corresponding to the elements in x that are greater than 1 and less than 5.

I tried

x = array([5, 2, 3, 1, 4, 5])
y = array(['f','o','o','b','a','r'])
output = y[x > 1 & x < 5] # desired output is ['o','o','a']

but this doesn't work. How would I do this?

jfs · Accepted Answer · 2010-06-13 00:50:32Z

273

Your expression works if you add parentheses:

>>> y[(1 < x) & (x < 5)]
array(['o', 'o', 'a'], 
      dtype='|S1')

answered Jun 13, 2010 at 0:50

jfs


6 Comments

MasterControlProgram Over a year ago

That is nice.. vecMask=1<x generates a vector mask like vecMask=(False, True, ...), which can be just combined with other vector masks. Each element is the condition for taking the elements of a source vector (True) or not (False). This can be used also with the full version numpy.extract(vecMask, vecSrc), or numpy.where(vecMask, vecSrc, vecSrc2).
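A minimal sketch of the mask idea from this comment, using the arrays from the question. The '-' fallback passed to np.where is only an illustrative placeholder, not something from the original post.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# Each comparison yields a boolean mask; masks combine element-wise with &.
mask = (1 < x) & (x < 5)

# Two equivalent ways to pull the matching elements out of y.
by_index = y[mask]
by_extract = np.extract(mask, y)

# np.where with three arguments does not filter: it keeps the length and
# takes from y where the mask is True and from the fallback where it is False.
filled = np.where(mask, y, '-')

print(mask.tolist())      # [False, True, True, False, True, False]
print(by_index.tolist())  # ['o', 'o', 'a']
print(filled.tolist())    # ['-', 'o', 'o', '-', 'a', '-']
```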

calavicci Over a year ago

@JennyYueJin: It happens because of precedence. (Bitwise) & has higher precedence than < and >, which in turn have higher precedence than (logical) and. x > 1 and x < 5 evaluates the inequalities first and then the logical conjunction; x > 1 & x < 5 evaluates the bitwise conjunction of 1 and (the values in) x, then the inequalities. (x > 1) & (x < 5) forces the inequalities to evaluate first, so all of the operations occur in the intended order and the results are all well-defined. See the Python documentation on operator precedence.
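To make the precedence point concrete, here is a small sketch with the question's x. It only shows what the inner term 1 & x evaluates to and what the parenthesized form produces; the variable names are illustrative.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])

# & binds tighter than the comparisons, so `x > 1 & x < 5` is read as
# `x > (1 & x) < 5`. The inner term is a bitwise AND of 1 with each element.
inner = 1 & x
print(inner.tolist())  # [1, 0, 1, 1, 0, 1]

# With parentheses the comparisons run first and & combines two boolean masks.
mask = (x > 1) & (x < 5)
print(mask.tolist())   # [False, True, True, False, True, False]
```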

jfs Over a year ago

@ru111 It works on Python 3.6 too (there is no reason for it to stop working).

ru111 Over a year ago

I get "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"

jfs Over a year ago

@ru111 you should write (0 < x) & (x < 10) (as shown in the answer) instead of 0 < x < 10 which doesn't work for numpy arrays on any Python version.
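A short sketch of why the chained form fails, assuming the question's x. Python expands a chained comparison into an `and` of the two comparisons, and `and` needs a single truth value from the first mask.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])

# `1 < x < 5` is expanded to `(1 < x) and (x < 5)`; `and` calls bool() on
# the first mask, which NumPy refuses for arrays with more than one element.
try:
    chained = 1 < x < 5
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# The element-wise form from the answer never calls bool() on an array.
mask = (1 < x) & (x < 5)
print(raised, mask.tolist())
```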


Mark Mikofski · Accepted Answer · 2017-08-09 05:52:27Z

45

IMO the OP does not actually want np.bitwise_and() (aka &) but np.logical_and(), because they are comparing logical values such as True and False; see this SO post on logical vs. bitwise for the difference.

>>> import numpy as np
>>> x = np.array([5, 2, 3, 1, 4, 5])
>>> y = np.array(['f','o','o','b','a','r'])
>>> output = y[np.logical_and(x > 1, x < 5)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')

An equivalent way to do this is with np.all(), setting the axis argument appropriately.

>>> output = y[np.all([x > 1, x < 5], axis=0)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')
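The np.all() form extends naturally to more than two conditions. The sketch below adds a third condition (x is even) purely as an assumption for illustration, and also shows np.logical_and.reduce, which combines a list of masks in the same way.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# The third condition is hypothetical, added only to show a longer list.
conditions = [x > 1, x < 5, x % 2 == 0]

# Stack the masks and require every one to hold for each element.
mask_all = np.all(conditions, axis=0)

# The ufunc reduction ANDs the masks together along the first axis.
mask_reduce = np.logical_and.reduce(conditions)

print(y[mask_all].tolist())  # ['o', 'a']
```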

by the numbers:

>>> %timeit (a < b) & (b < c)
The slowest run took 32.97 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.15 µs per loop

>>> %timeit np.logical_and(a < b, b < c)
The slowest run took 32.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.17 µs per loop

>>> %timeit np.all([a < b, b < c], 0)
The slowest run took 67.47 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.06 µs per loop

so using np.all() is slower, but & and logical_and are about the same.

edited Aug 9, 2017 at 5:52

answered Sep 5, 2013 at 19:23

Mark Mikofski


4 Comments

DSM Over a year ago

You need to be a little careful about how you speak about what's evaluated. For example, in output = y[np.logical_and(x > 1, x < 5)], x < 5 is evaluated (possibly creating an enormous array), even though it's the second argument, because that evaluation happens outside of the function. IOW, logical_and gets passed two already-evaluated arguments. This is different from the usual case of a and b, in which b isn't evaluated if a is falselike.
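The difference in evaluation can be observed with a small tracing helper. The `traced` function and the `calls` list are hypothetical names introduced only for this sketch.

```python
import numpy as np

calls = []

def traced(label, value):
    """Record that an operand was evaluated, then return it unchanged."""
    calls.append(label)
    return value

x = np.array([5, 2, 3, 1, 4, 5])

# Python's `and` short-circuits: the right operand is skipped when the
# left one is falsy.
calls.clear()
result = traced("left", False) and traced("right", True)
short_circuit_calls = list(calls)

# Function arguments are evaluated before the call, so both masks are
# built even though the first one is False everywhere.
calls.clear()
mask = np.logical_and(traced("left", x > 10), traced("right", x < 5))
eager_calls = list(calls)

print(short_circuit_calls)  # ['left']
print(eager_calls)          # ['left', 'right']
```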

jfs Over a year ago

there is no difference between bitwise_and() and logical_and() for boolean arrays
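A quick check of this claim, with small arrays made up for the purpose. On boolean input the two functions agree; they only diverge on integers, where bitwise_and works bit by bit and logical_and treats any nonzero value as True.

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# Boolean input: both functions give the same mask.
same_for_bools = np.array_equal(np.bitwise_and(a, b), np.logical_and(a, b))

# Integer input: the results differ.
m = np.array([1, 2, 4])
n = np.array([2, 2, 3])
bitwise = np.bitwise_and(m, n)
logical = np.logical_and(m, n)

print(same_for_bools)    # True
print(bitwise.tolist())  # [0, 2, 0]
print(logical.tolist())  # [True, True, True]
```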

J.Massey Over a year ago

I've been searching ages for the 'or' alternative and this reply gave me some much needed relief! Thank you so much. (np.logical_or), OBVIOUSLY...

Mark Mikofski Over a year ago

@J.Massey a pipe | (aka np.bitwise_or) might also work, eg: (a < b) | (a > c)

Good Fit · Accepted Answer · 2018-12-26 20:24:32Z

25

One detail to add to @J.F. Sebastian's and @Mark Mikofski's answers:
If you want the corresponding indices (rather than the actual values of the array), the following code will do:

For satisfying multiple (all) conditions:

select_indices = np.where( np.logical_and( x > 1, x < 5) )[0] #   1 < x <5

For satisfying multiple (or) conditions:

select_indices = np.where( np.logical_or( x < 1, x > 5 ) )[0] # x <1 or x >5

edited Dec 26, 2018 at 20:24

answered Nov 18, 2014 at 16:03

Good Fit


1 Comment

calavicci Over a year ago

Note that numpy.where will not just return an array of the indices, but will instead return a tuple (the output of condition.nonzero()) containing arrays - in this case, (the array of indices you want,), so you'll need select_indices = np.where(...)[0] to get the result you want and expect.
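The tuple return value is easy to see on the question's x. The sketch also uses np.flatnonzero, which returns the index array directly. The bounds in the "or" example (2 and 4) are chosen here only so that some elements match.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])

mask = np.logical_and(x > 1, x < 5)

# One-argument np.where returns a tuple with one index array per dimension.
result = np.where(mask)
indices = result[0]

# np.flatnonzero gives the same indices without the tuple wrapper.
flat = np.flatnonzero(mask)

# "Or" variant with illustrative bounds.
outside = np.flatnonzero((x < 2) | (x > 4))

print(indices.tolist())  # [1, 2, 4]
print(outside.tolist())  # [0, 3, 5]
```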

user4340135 · Accepted Answer · 2017-11-16 21:02:37Z

6

I like to use np.vectorize for such tasks. Consider the following:

>>> # Arrays
>>> x = np.array([5, 2, 3, 1, 4, 5])
>>> y = np.array(['f','o','o','b','a','r'])

>>> # Function containing the constraints
>>> func = np.vectorize(lambda t: t>1 and t<5)

>>> # Call function on x
>>> y[func(x)]
array(['o', 'o', 'a'], dtype='<U1')

The advantage is you can add many more types of constraints in the vectorized function.

Hope it helps.

edited Nov 16, 2017 at 21:02

answered Nov 9, 2017 at 6:45

user4340135

1 Comment

Alex Riley Over a year ago

This is not a good way to do indexing in NumPy (it will be very slow).
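For context, np.vectorize calls the Python function once per element (the NumPy documentation describes it as essentially a for loop), whereas the comparison operators run in compiled code. A minimal sketch showing that both routes give the same selection on the question's arrays:

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# np.vectorize invokes the lambda once for every element of x.
func = np.vectorize(lambda t: t > 1 and t < 5)
via_vectorize = y[func(x)]

# The boolean-mask form does the same work with array operations.
via_mask = y[(x > 1) & (x < 5)]

print(via_vectorize.tolist())  # ['o', 'o', 'a']
print(via_mask.tolist())       # ['o', 'o', 'a']
```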

Shuo Yang · Accepted Answer · 2017-06-29 23:56:36Z

1

Actually I would do it this way:

L1 is the array of indices of the elements satisfying condition 1 (you can get it with np.where(condition1)[0]).

Similarly, L2 is the array of indices of the elements satisfying condition 2.

Then find their intersection with np.intersect1d(L1, L2).

If there are more conditions to satisfy, intersect the additional index arrays in the same way.

You can then use the resulting indices to index any other array of the same length, for example x.
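The index-intersection approach might look like the sketch below, assuming np.intersect1d as the intersection step and the arrays from the question. The names idx1, idx2 and common are illustrative.

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# Index arrays for each condition taken separately.
idx1 = np.where(x > 1)[0]
idx2 = np.where(x < 5)[0]

# Indices that satisfy both conditions (returned sorted and unique).
common = np.intersect1d(idx1, idx2)

print(idx1.tolist())       # [0, 1, 2, 4, 5]
print(idx2.tolist())       # [1, 2, 3, 4]
print(common.tolist())     # [1, 2, 4]
print(y[common].tolist())  # ['o', 'o', 'a']
```

This builds two intermediate index arrays and sorts them, so for plain filtering a combined boolean mask is usually the simpler route.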

edited Jun 29, 2017 at 23:56

Sᴀᴍ Onᴇᴌᴀ


answered Jun 29, 2017 at 23:49

Shuo Yang


Gautam Sreekumar · Accepted Answer · 2019-02-08 21:56:22Z

0

For 2D arrays, you can do this. Create a 2D mask using the condition. Typecast the condition mask to int or float, depending on the array, and multiply it with the original array.

In [8]: arr
Out[8]: 
array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.]])

In [9]: arr*(arr % 2 == 0).astype(int)
Out[9]: 
array([[ 0.,  2.,  0.,  4.,  0.],
       [ 6.,  0.,  8.,  0., 10.]])
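Note that multiplying by the mask keeps the 2D shape and zeroes out the non-matching entries, while indexing with the mask returns the matching values as a flat 1D array. A minimal sketch of both, rebuilding the same arr with np.arange (an assumption about how it was created):

```python
import numpy as np

arr = np.arange(1.0, 11.0).reshape(2, 5)
mask = arr % 2 == 0

# Same shape as arr; entries where the mask is False become 0.
zeroed = np.where(mask, arr, 0.0)

# Boolean indexing on a 2D array flattens the result.
selected = arr[mask]

print(zeroed.tolist())    # [[0.0, 2.0, 0.0, 4.0, 0.0], [6.0, 0.0, 8.0, 0.0, 10.0]]
print(selected.tolist())  # [2.0, 4.0, 6.0, 8.0, 10.0]
```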

answered Feb 8, 2019 at 21:56

Gautam Sreekumar
