
I am trying to modify the values of an array based on a subset of a subset, but I can't find a way to do this. I think this exposes my lack of understanding about exactly how array indexing and subsetting work and what views are, but I can't find a solution anywhere, so I am hoping that someone can help me.

Example problem:

import numpy as np

#generate some simple data
MyArray=np.arange(20).reshape(4,5)

>>>MyArray
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

#subset 1
i=np.where(MyArray > 5)

#subset of that subset
j=np.where(MyArray[i] < 15)

>>>MyArray[i][j]
array([ 6,  7,  8,  9, 10, 11, 12, 13, 14])

Great, that's what I expected to see! But if I now want to change those values to something else, I can't:

>>>MyArray[i][j]=999
>>>MyArray[i][j]
array([ 6,  7,  8,  9, 10, 11, 12, 13, 14])

#hmmmm :(

An ugly solution that works

I CAN get the values to change by looping over the elements of j individually, but this seems extraordinarily clumsy & hard to read:

#get number of elements in j
nj=np.size(j)

#loop over each element of j and modify the corresponding ith element of MyArray
for j_it in range(0,nj):
    MyArray[i[0][j[0][j_it]]][i[1][j[0][j_it]]]=999

>>>MyArray
array([[  0,   1,   2,   3,   4],
       [  5, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [ 15,  16,  17,  18,  19]])

Similarly, I can modify MyArray using just one level of subsetting:

ii=np.where((MyArray > 5) & (MyArray < 15))
MyArray[ii]=999
>>>MyArray
array([[  0,   1,   2,   3,   4],
       [  5, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [ 15,  16,  17,  18,  19]])

So, where am I going wrong with the first example?

NOTE:- I know that my last solution works fine for this problem, but my actual problem necessarily involves far more data and the second level of subsetting (and possibly a third...)

Thanks in advance, and apologies if this is something simple that I really should be able to work out for myself from the documentation: I just haven't been able to :(

4 Answers

3

As you said, your last example solves your problem. What you're missing is that MyArray[i] indexes with arrays of indices, which creates a new array (a copy, not a view), and MyArray[i][j] then indexes that copy again. When you assign to the result, you are assigning into the temporary copy, not into MyArray.
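
A minimal sketch of the copy-versus-view distinction, reusing MyArray and i from the question. The names rows, picked, view_shares and copy_shares are only illustrative and do not come from the original post.

```python
import numpy as np

MyArray = np.arange(20).reshape(4, 5)
i = np.where(MyArray > 5)

rows = MyArray[1:3]      # basic slicing returns a view onto MyArray's data
picked = MyArray[i]      # indexing with index arrays returns a copy

view_shares = np.shares_memory(rows, MyArray)    # True
copy_shares = np.shares_memory(picked, MyArray)  # False

rows[0, 0] = -1          # writes through the view: MyArray[1, 0] becomes -1
picked[0] = -2           # changes the copy only: MyArray[1, 1] stays 6
```

This is why MyArray[i][j]=999 runs without an error yet leaves MyArray as it was.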

To do this in a way similar to your first example, you would do the indexing and the assignment in a single operation, like this:

import numpy as np

#generate some simple data
MyArray=np.arange(20).reshape(4,5)

>>>MyArray
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

#subset 1
i=np.where(MyArray > 5)

#subset of that subset
j=np.where(MyArray[i] < 15)

# Use j to pick the matching row and column indices out of i, then assign in one step
MyArray[i[0][j[0]], i[1][j[0]]] = 999

>>>MyArray
array([[  0,   1,   2,   3,   4],
       [  5, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [ 15,  16,  17,  18,  19]])

Basically, i and j are both tuples of arrays indicating which indices matched in the where call. i contains two arrays, one for each dimension of MyArray. j contains one array, because MyArray[i] is one-dimensional. You want to take the j[0] elements of both of the arrays in i.
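
The same index composition can be repeated for a third level of subsetting. The sketch below wraps it in a small helper; compose is a hypothetical name, not a NumPy function, and the third condition (even values only) is made up for illustration.

```python
import numpy as np

def compose(outer, inner):
    """Map positions within the 1-D subset back to positions in the full array.

    Hypothetical helper: `outer` is the tuple of index arrays from np.where on
    the full array, `inner` is the tuple from np.where on the 1-D subset.
    """
    return tuple(axis[inner[0]] for axis in outer)

MyArray = np.arange(20).reshape(4, 5)

i = np.where(MyArray > 5)             # first subset
j = np.where(MyArray[i] < 15)         # subset of that subset
ij = compose(i, j)                    # same as (i[0][j[0]], i[1][j[0]])

k = np.where(MyArray[ij] % 2 == 0)    # assumed third level: even values only
ijk = compose(ij, k)

MyArray[ijk] = 999                    # one indexing step, so MyArray is modified
```

Because each composed result is again a tuple with one index array per dimension, it can be fed back into compose for further levels.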

Hopefully that makes sense. Let me know if you have questions.


1 Comment

That makes lots of sense and thanks very much for the explanation. Just to note, for anybody else who comes across this, this solution also works if subset j is calculated from a different array (MyArray2) so long as its dimensions are the same as MyArray (so that MyArray2[i] provides the same subset as MyArray[i]). Thanks!
1

Try creating boolean arrays:

condition_1 = MyArray > 5
condition_2 = MyArray < 15

Then you can use bitwise &:

bools = condition_1 & condition_2

which will have values like:

[[ False,  False,  False,  False,  False],
   [ True,  True,  True,  ...]...]

If such an array has the same shape as the array you want to change, you can use it as an index. Instead of integer indices, it holds True or False depending on whether each cell meets your conditions.

MyArray[bools] = 999
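
If the second condition really has to be evaluated on the first subset only, boolean masks can be nested as well. This is a sketch under that assumption; inner and refined are illustrative names.

```python
import numpy as np

MyArray = np.arange(20).reshape(4, 5)

mask = MyArray > 5              # first subset, same shape as MyArray
inner = MyArray[mask] < 15      # second condition, evaluated on the subset only

refined = mask.copy()           # start from the first subset
refined[mask] = inner           # keep only the selected cells that also pass

MyArray[refined] = 999          # a single indexing step, so MyArray is updated
```

refined keeps the full shape of MyArray, so a third condition can be applied to it in the same way.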


1

numpy.ma (masked arrays) looks like it has the right set of tools:

import numpy as np
import numpy.ma as ma

MyArray=np.arange(20).reshape(4,5)

# masked_inside masks v1 <= x <= v2 (both ends inclusive), so use 6 and 14
# to match the strict conditions > 5 and < 15
subma = ma.masked_inside(MyArray, 6, 14)  # lots of ma. logic, arithmetic ops

subma.filled(999)

Out[44]: 
array([[  0,   1,   2,   3,   4],
       [  5, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [ 15,  16,  17,  18,  19]])

1 Comment

Thanks for the reply. I can see that this solution works for my example, but what if I had another array, MyArray2, that I am creating the second subset (j) from? i.e.: MyArray=np.arange(20).reshape(4,5) \ MyArray2=(np.arange(20)*10).reshape(4,5) \ i=np.where(MyArray > 5) \ j=np.where(MyArray2[i] < 100) Perhaps I should have been clearer in my question, but I want to be able to modify MyArray based on subset j, which I don't think (although I could be wrong) the maskedarray approach lets me do?
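
For the two-array case raised in this comment, one possible sketch uses ma.masked_where with a condition built from both arrays. It assumes MyArray and MyArray2 have the same shape, as in the comment; condition and result are illustrative names.

```python
import numpy as np
import numpy.ma as ma

MyArray = np.arange(20).reshape(4, 5)
MyArray2 = (np.arange(20) * 10).reshape(4, 5)

# Both conditions are evaluated elementwise on arrays of the same shape,
# so they can be combined before any masking happens.
condition = (MyArray > 5) & (MyArray2 < 100)

subma = ma.masked_where(condition, MyArray)  # mask cells where condition is True
result = subma.filled(999)                   # plain ndarray with 999 in masked cells
```

Note that filled returns a new array; MyArray itself is left as it was unless the result is assigned back to it.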
1

While I think the other answer from @Vorticity is clearly correct and a well-thought-out explanation, I feel that a different piece of code is significantly more readable and understandable.

According to https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html, np.where accepts two more arguments that tell it which value to use where the boolean condition in the first argument is True and which to use where it is False.

>>>np.where((MyArray > 5) & (MyArray < 15), 999, MyArray)

array([[  0,   1,   2,   3,   4],
       [  5, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [ 15,  16,  17,  18,  19]])

That being said, a good example of the "not assigning to MyArray" issue would be the following:

>>> MyArray
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
>>> MyArray = MyArray[i]
>>> MyArray
array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
>>> MyArray = MyArray[j]
>>> MyArray
array([ 6,  7,  8,  9, 10, 11, 12, 13, 14])

I think the main thing that trips up a lot of people (myself included) is that MyArray[i] on its own produces a new array, and unless you assign that result to a name with =, nothing holds onto it; the original MyArray is left as it was.
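
To make that concrete, here is a sketch of what Python does with MyArray[i][j] = 999, followed by a version that keeps the intermediate result and writes it back. The names tmp, untouched and subset are illustrative only.

```python
import numpy as np

MyArray = np.arange(20).reshape(4, 5)
i = np.where(MyArray > 5)
j = np.where(MyArray[i] < 15)

# MyArray[i][j] = 999 is evaluated as two separate steps:
tmp = MyArray.__getitem__(i)      # builds a new, unnamed array (a copy)
tmp.__setitem__(j, 999)           # modifies that copy, which is then discarded
untouched = MyArray.max() == 19   # True: MyArray never received the 999s

# Holding onto the intermediate array and assigning it back updates MyArray.
subset = MyArray[i]
subset[j] = 999
MyArray[i] = subset
```

The final line works because MyArray[i] on the left-hand side of = is a single assignment into MyArray rather than a read followed by a write to a temporary.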

2 Comments

Thanks for taking the time to post an alternative solution and some helpful clarification on the problem. Regarding your solution, I am a little unsure what you mean by test on the first line? Something that I have found that works, following your solution, is: MyArray[i]=np.where(MyArray[i] < 15, 999, MyArray[i]), which sets those elements of MyArray[i] that are less than 15 to 999, and leaves the rest unchanged.
oh lol sorry when I was recreating your situation in IDLE i made my array as "test" not "MyArray." It is just a typo. I will edit appropriately.
