40

Is it possible to convert an array of indices to an array of ones and zeros, given the range? E.g. [2, 3] -> [0, 0, 1, 1, 0] for a range of 5

I'm trying to automate something like this:

>>> index_array = np.arange(200,300)
array([200, 201, ... , 299])

>>> mask_array = ???           # some function of index_array and 500
array([0, 0, 0, ..., 1, 1, 1, ... , 0, 0, 0])

>>> train(data[mask_array])    # trains with 200~299
>>> predict(data[~mask_array]) # predicts with 0~199, 300~499
5 Comments
  • NumPy has a masked array module (numpy.ma), which is related to the question. docs.scipy.org/doc/numpy/reference/maskedarray.html Commented Sep 3, 2014 at 22:57
  • [x in index_array for x in range(500)] sort of does it, but with True and False instead of 1 and 0. Commented Sep 3, 2014 at 23:03
  • @genisage Can you please make your comment as an answer? I want to choose yours. It's the exact thing I was looking for. Thank you for the answer! Commented Sep 4, 2014 at 4:47
  • numpy.array([x in indices for x in range(length)], dtype=np.int8) would work for 1D arrays Commented Dec 20, 2018 at 2:21
  • Not sure if this applies directly to the question above, but have you looked at NumPy's masked_array? docs.scipy.org/doc/numpy-1.13.0/reference/generated/… It may help with further exploration. Commented Mar 27, 2019 at 19:38

4 Answers

49

Here's one way:

In [1]: index_array = np.array([3, 4, 7, 9])

In [2]: n = 15

In [3]: mask_array = np.zeros(n, dtype=int)

In [4]: mask_array[index_array] = 1

In [5]: mask_array
Out[5]: array([0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0])

If the mask is always a range, you can eliminate index_array, and assign 1 to a slice:

In [6]: mask_array = np.zeros(n, dtype=int)

In [7]: mask_array[5:10] = 1

In [8]: mask_array
Out[8]: array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

If you want an array of boolean values instead of integers, change the dtype of mask_array when it is created:

In [11]: mask_array = np.zeros(n, dtype=bool)

In [12]: mask_array
Out[12]: 
array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False], dtype=bool)

In [13]: mask_array[5:10] = True

In [14]: mask_array
Out[14]: 
array([False, False, False, False, False,  True,  True,  True,  True,
        True, False, False, False, False, False], dtype=bool)
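To tie this back to the train/predict split in the question, here is a minimal sketch that wraps the steps above in a helper. The name indices_to_mask and the stand-in data array are my own, purely illustrative, and not part of NumPy or of the original answer:

```python
import numpy as np

def indices_to_mask(index_array, n):
    """Return a boolean array of length n that is True at the given indices."""
    mask = np.zeros(n, dtype=bool)
    mask[index_array] = True
    return mask

data = np.arange(500) * 10                         # stand-in data, illustrative only
mask = indices_to_mask(np.arange(200, 300), 500)

selected = data[mask]     # rows 200..299
rest = data[~mask]        # rows 0..199 and 300..499
```

A boolean dtype is used here because data[~mask] only selects the complement when the mask is boolean.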

3 Comments

+1 This is a very nice answer too, especially if someone wants their mask_array to be an np.array.
And it is much more efficient than the list comprehension.
Is there any advantage to using int instead of bool? I'm just wondering why the top part of the answer doesn't recommend bool when the question is asking for a mask.
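On the int vs. bool question: one practical difference, sketched here with made-up data, is that NumPy treats an integer array used as an index as a list of positions, while a boolean array is treated as a mask. The variable names below are only for illustration:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])
int_mask = np.array([0, 0, 1, 1, 0])
bool_mask = int_mask.astype(bool)

by_int = data[int_mask]      # fancy indexing: data[0], data[0], data[1], data[1], data[0]
by_bool = data[bool_mask]    # masking: keeps the elements at positions 2 and 3
inverted_int = ~int_mask     # bitwise NOT on integers, not a logical flip
```

So an integer array of 0s and 1s is handy for arithmetic (e.g. multiplying by it), but for data[mask] and data[~mask] a boolean array is what gives the expected selection.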
14

For a single dimension, try:

import numpy as np

n = (15,)
index_array = [2, 5, 7]
mask_array = np.zeros(n)  # float zeros by default; pass dtype=bool for a boolean mask
mask_array[index_array] = 1

For more than one dimension, convert your n-dimensional indices into one-dimensional ones, then use ravel:

n = (15, 15)
# one list per axis (rows first, then columns); you may need to transpose your indices!
index_array = [[1, 4, 6], [10, 11, 2]]
mask_array = np.zeros(n)
flat_index_array = np.ravel_multi_index(index_array, mask_array.shape)
# np.ravel returns a view of a contiguous array, so this writes into mask_array
np.ravel(mask_array)[flat_index_array] = 1
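As a possible alternative (not what this answer uses), NumPy's integer array indexing can set the same cells without flattening. A minimal sketch, assuming the indices are either given per axis or as an array of (row, column) points; rows, cols and points are illustrative names:

```python
import numpy as np

shape = (15, 15)
rows = [1, 4, 6]       # indices along the first axis
cols = [10, 11, 2]     # indices along the second axis

mask_array = np.zeros(shape, dtype=bool)
mask_array[rows, cols] = True            # sets (1, 10), (4, 11), (6, 2)

# If the indices come as one (row, col) pair per point, transpose them first.
points = np.array([[1, 10], [4, 11], [6, 2]])
mask_from_points = np.zeros(shape, dtype=bool)
mask_from_points[tuple(points.T)] = True
```

Both arrays end up with the same three cells set; the tuple(points.T) step is the "transpose your indices" case mentioned in the code comment above.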


2

There's a nice trick to do this as a one-liner, too - use the numpy.in1d and numpy.arange functions like this (the final line is the key part):

>>> x = np.linspace(-2, 2, 10)
>>> y = x**2 - 1
>>> idxs = np.where(y<0)

>>> np.in1d(np.arange(len(x)), idxs)
array([False, False, False,  True,  True,  True,  True, False, False, False], dtype=bool)

The downside of this approach is that it's ~10-100x slower than the approach Warren Weckesser gave... but it's a one-liner, which may or may not be what you're looking for.
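For what it's worth, recent NumPy releases recommend numpy.isin over numpy.in1d. A sketch of the same one-liner with isin, using np.flatnonzero (my substitution for np.where) so that idxs is a plain 1-D index array:

```python
import numpy as np

x = np.linspace(-2, 2, 10)
y = x**2 - 1
idxs = np.flatnonzero(y < 0)     # 1-D array of the indices where y is negative

mask = np.isin(np.arange(len(x)), idxs)
```

In this toy example the result is the same as y < 0 itself; the isin form is only needed when you start from the indices rather than from the condition.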

1 Comment

Isn't the in1d() method far more expensive than the other proposed solutions?
1

As requested, here it is in an answer. The code:

[x in index_array for x in range(500)]

will give you a mask like you asked for, but it will use Bools instead of 0's and 1's.
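If you stick with the list comprehension, a small tweak (my suggestion, not part of the original comment) is to convert the indices to a set first, so each membership test is a hash lookup instead of a scan over index_array:

```python
import numpy as np

index_array = np.arange(200, 300)
index_set = set(index_array.tolist())          # plain Python ints, hashed once

mask = [x in index_set for x in range(500)]    # list of True/False values
```

This is still a Python-level loop, so the np.zeros approach in the other answer is likely to be faster for large ranges.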

2 Comments

This was the answer that the OP originally marked. But marking it made other people downvote it to like -3, so I had to change my mark...
This one is really slow: not only is it not vectorized, but it's also O(n²).
