
I have a large (90k x 90k) NumPy ndarray, and I need to zero out a submatrix of it. I have a list of about 30k indices that indicate which rows and columns need to be zero. The indices aren't necessarily contiguous, so a[min:max, min:max] style slicing isn't possible.

As a toy example, I can start with a 2D array of non-zero values, but I can't seem to write zeros the way I expect.

import numpy as np

a = np.ones((6, 8))
indices = [2, 3, 5]
# I thought this would work, but it does not.
# It correctly writes to (2,2), (3,3), and (5,5), but not all
# combinations of (2, 3), (2, 5), (3, 2), (3, 5), (5, 2), or (5, 3)
a[indices, indices] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 1. 1. 1. 1. 1.]
 [1. 1. 1. 0. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 0. 1. 1.]]
# I thought this would fix that problem, but it doesn't change the array.
a[indices, :][:, indices] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]]

In this toy example, I'm hoping for this result.

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]]

I could probably write a cumbersome loop or build some combinatorially huge list of indices to do this, but it seems intuitive that this must be supported in a cleaner way; I just can't find the syntax to make it happen. Any ideas?
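For what it's worth, here is my rough understanding of why both attempts behave the way they do (a sketch, assuming NumPy's standard fancy-indexing rules):

```python
import numpy as np

a = np.ones((6, 8))
indices = [2, 3, 5]

# Two equal-length index lists are paired element-wise, so only the
# "diagonal" entries (2, 2), (3, 3), and (5, 5) are selected.
a[indices, indices] = 0.0
assert a[2, 2] == 0.0 and a[2, 3] == 1.0

# Chained indexing fails differently: a[indices, :] uses advanced
# indexing, which returns a copy, so the second assignment writes
# into that temporary copy and the original array never changes.
b = np.ones((6, 8))
b[indices, :][:, indices] = 0.0
assert b.sum() == 48.0  # b is unchanged
```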

2 Comments
  • Explore using np.ix_. – hpaulj, Commented Jan 16 at 23:09
  • @hpaulj, I had never seen this function, and wouldn't have known to search for it. And it's exactly what I need. Thanks!! I wrote an answer, but if you'd rather take the credit, I'll delete mine. Commented Jan 16 at 23:19

1 Answer


Based on hpaulj's comment, I came up with this, which works perfectly on the toy example.

a[np.ix_(indices, indices)] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]]

It also worked beautifully on the real data. It was faster than I expected and didn't noticeably increase memory consumption. Exhausting memory has been a constant concern with these giant arrays.
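As a rough sanity check of the memory behavior, here is a sketch at a smaller, hypothetical scale than the real 90k x 90k array:

```python
import numpy as np

# Hypothetical smaller-scale stand-in for the 90k x 90k case.
n = 4000
a = np.ones((n, n))
idx = np.arange(0, n, 3)  # scattered, non-contiguous indices

# np.ix_ builds two small "open mesh" index arrays; the assignment
# then writes in place, so no (len(idx), len(idx)) temporary copy of
# the data itself is created.
mesh = np.ix_(idx, idx)
assert mesh[0].shape == (len(idx), 1)
assert mesh[1].shape == (1, len(idx))
a[mesh] = 0.0
assert a.sum() == n * n - len(idx) ** 2
```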

1 Comment

Look at the result of np.ix_(indices, indices). It is easy to create the same two arrays without ix_; it's a convenience function, not a necessity. Those two arrays are broadcastable.
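A sketch of what this comment describes, assuming standard NumPy broadcasting:

```python
import numpy as np

a = np.ones((6, 8))
indices = np.array([2, 3, 5])

rows, cols = np.ix_(indices, indices)
# np.ix_ essentially just reshapes: rows is (3, 1), cols is (1, 3),
# and broadcasting them together yields every (row, col) combination.
assert rows.shape == (3, 1) and cols.shape == (1, 3)

# The same result without np.ix_: make one index array a column.
a[indices[:, None], indices] = 0.0
assert a.sum() == 48.0 - 9.0  # the 3 x 3 submatrix is zeroed
```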
