3

I have run into the following problem. Suppose I have arrays like these:

a = np.array([[1,2,3,4,5,4,3,2,1],])
label = np.array([[1,0,1,0,0,1,1,0,1],])

I need the index into a of the largest value of a among the positions where label is 1.

This may be confusing, so here is the example worked through: the indices where label is 1 are 0, 2, 5, 6, 8, and the corresponding values of a are 1, 3, 4, 3, 1. The largest of these is 4, so I need the result 5, which is the index of that 4 in a. How can I do this with NumPy?

3 Answers

3

Get the indices where label is 1, say as idx, index into a with them, take the argmax of that selection, and finally map it back to a position in the original array by indexing into idx -

idx = np.flatnonzero(label==1)
out = idx[a[idx].argmax()]

Sample run -

# Assuming inputs to be 1D
In [18]: a
Out[18]: array([1, 2, 3, 4, 5, 4, 3, 2, 1])

In [19]: label
Out[19]: array([1, 0, 1, 0, 0, 1, 1, 0, 1])

In [20]: idx = np.flatnonzero(label==1)

In [21]: idx[a[idx].argmax()]
Out[21]: 5
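
The two steps above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the name argmax_where_labeled is hypothetical, the inputs are flattened with ravel() because the question defines them with shape (1, 9), and returning None when no label is 1 is an assumed convention.

```python
import numpy as np

def argmax_where_labeled(a, label):
    """Index (into the flattened a) of the largest a value whose label is 1."""
    a = np.asarray(a).ravel()          # the question's arrays have shape (1, 9)
    label = np.asarray(label).ravel()
    idx = np.flatnonzero(label == 1)   # positions where label is 1
    if idx.size == 0:                  # assumed convention: nothing is labeled
        return None
    # argmax over the selected values, mapped back to the original positions
    return int(idx[a[idx].argmax()])

a = np.array([[1, 2, 3, 4, 5, 4, 3, 2, 1]])
label = np.array([[1, 0, 1, 0, 0, 1, 1, 0, 1]])
result = argmax_where_labeled(a, label)
print(result)  # 5
```

Because the selection is done by indexing rather than arithmetic, this variant should not depend on the sign of the values in a.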

For a as ints and label as an array of 0s and 1s, we could optimize further by adding an offset to the labeled entries, scaled by the range of values in a, so that every labeled entry exceeds every unlabeled one, like so -

(label*(a.max()-a.min()+1) + a).argmax()

Furthermore, if a has no negative numbers, it would simplify to -

(label*(a.max()+1) + a).argmax()
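
To see why the two expressions above differ, here is a small sketch on made-up data containing negative values (the array contents are illustrative, not from the question). The full offset pushes every labeled entry above a.max(), while the simplified offset only does so when a has no negatives.

```python
import numpy as np

a = np.array([-5, -1, -3, 7])
label = np.array([1, 0, 1, 0])

# Full offset: the range of a plus one, so every labeled entry ends up
# strictly above every unlabeled one.
full = (label * (a.max() - a.min() + 1) + a).argmax()

# Simplified offset: only safe when a has no negative values. Here the
# labeled entries become 3 and 5, which still lose to the unlabeled 7.
simplified = (label * (a.max() + 1) + a).argmax()

print(full, simplified)  # 2 3
```

One caveat worth keeping in mind: both expressions add an offset to a, so with fixed-width integer dtypes and values near the dtype limits the sum could overflow.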

Timings on a largish a of non-negative ints -

In [115]: np.random.seed(0)
     ...: a = np.random.randint(0,10,(100000))
     ...: label = np.random.randint(0,2,(100000))

In [117]: %%timeit
     ...: idx = np.flatnonzero(label==1)
     ...: out = idx[a[idx].argmax()]
1000 loops, best of 3: 592 µs per loop

In [116]: %timeit (label*(a.max()-a.min()+1) + a).argmax()
1000 loops, best of 3: 357 µs per loop

# @coldspeed's soln
In [120]: %timeit np.ma.masked_where(~label.astype(bool), a).argmax()
1000 loops, best of 3: 1.63 ms per loop

# won't work with negative numbers in a
In [119]: %timeit (label*(a.max()+1) + a).argmax()
1000 loops, best of 3: 292 µs per loop

# @klim's soln (won't work with negative numbers in a)
In [121]: %timeit np.argmax(a * (label == 1))
1000 loops, best of 3: 229 µs per loop

1 Comment

Nice answer! :-)
1

You can use masked arrays:

>>> np.ma.masked_where(~label.astype(bool), a).argmax()
5
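
As a sketch of how the masking behaves, the example below builds the same kind of masked array from an explicit boolean mask and runs it on made-up data with negative values (the data is illustrative, not from the question). In a masked array, True in the mask hides an element, so the mask is "label is not 1".

```python
import numpy as np

a = np.array([-5, -1, -3, 7])
label = np.array([1, 0, 1, 0])

# mask=True hides an element, so mask everything whose label is not 1
masked = np.ma.masked_array(a, mask=(label != 1))
result = masked.argmax()

print(result)  # 2, the index of -3, the largest labeled value
```

The hidden entries are ignored when locating the maximum, so negative values in a should not be a problem for this approach.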


1

Here is one of the simplest ways.

>>> np.argmax(a * (label == 1))
5
>>> np.argmax(a * (label == 1), axis=1)
array([5])

Coldspeed's method may take more time.

3 Comments

What if there are negative numbers in a?
@Divakar: Yes, this will not work if there are negative numbers in a or if there is no match in label.
Still pretty fast under the constraints, going by the timings just added.
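
The limitation raised in these comments can be reproduced on made-up data where every value is negative. The np.where variant shown next to it is a swapped-in alternative, not something proposed in the answers: it replaces unlabeled entries with -inf instead of 0, at the cost of converting the values to float.

```python
import numpy as np

a = np.array([-5, -1, -3, -2])
label = np.array([1, 0, 1, 0])

# Multiplying zeroes out the unlabeled entries, and 0 beats every negative
# value, so argmax lands on an unlabeled position.
by_product = np.argmax(a * (label == 1))

# Alternative (an assumption, not from the answers): fill unlabeled entries
# with -inf so they can never win the argmax.
by_where = np.argmax(np.where(label == 1, a, -np.inf))

print(by_product, by_where)  # 1 2
```

Neither form detects the case where no label is 1; both would return index 0 there, so that case still needs a separate check.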
