Numpy Conditionally Replace Column Elements

Question

So I already took a look at this question.

I know you can conditionally replace values in a single column, but what about multiple columns? When I tried it, it didn't seem to work.

import numpy as np

the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])

the_data[:,0][the_data[:,0] == 0] = -1 # this works

columns_to_replace = [0, 1, 3]
the_data[:,columns_to_replace][the_data[:,columns_to_replace] == 0] = -1 # this does not work

I initially thought the second case fails because the_data[:,columns_to_replace] creates a copy instead of directly referencing the elements. However, if that were the case, the first case shouldn't work either, since it uses the same pattern on a single column.

Community · Accepted Answer · 2020-06-20 09:12:55Z

You're indeed getting a copy because you're using advanced indexing:

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

(Taken from the docs)

The first part works because the_data[:,0] uses basic slicing, which returns a view, so the assignment writes through to the_data.
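
The view/copy difference can be checked directly. The following is a minimal sketch, assuming the array from the question; the names single and multi are only for illustration. It uses np.shares_memory to test whether each selection still points at the_data's buffer.

```python
import numpy as np

the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])

single = the_data[:, 0]          # slice + integer: basic indexing, a view
multi = the_data[:, [0, 1, 3]]   # list of integers: advanced indexing, a copy

print(np.shares_memory(the_data, single))  # True
print(np.shares_memory(the_data, multi))   # False

# Writing into the copy leaves the original untouched.
multi[multi == 0] = -1
print(int((the_data == 0).sum()))  # 6, all zeros are still in the_data
```

Note that the_data[:, columns_to_replace] = value on its own does modify the_data, because that is a direct assignment. The chained form in the question first reads out a temporary copy and then assigns into that temporary.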

I think you can do this without copying the_data, but still with some memory overhead for a Boolean mask:

columns_to_replace = [0, 1, 3]

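mask = np.zeros(the_data.shape, bool) # Boolean dtype keeps the mask at one byte per element
mask[:, columns_to_replace] = 1 # mark the columns of interest

np.place(the_data, (the_data == 0) * mask, [-1]) # modifies the_data in place; only temporary Boolean masks are allocated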
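
For comparison, if one extra copy of the selected columns is acceptable, a simpler route is to edit the copy and assign it back. This is not the np.place approach above, just a sketch of the alternative, assuming the array from the question; sub is an illustrative name.

```python
import numpy as np

the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])
columns_to_replace = [0, 1, 3]

sub = the_data[:, columns_to_replace]   # advanced indexing: a copy of three columns
sub[sub == 0] = -1                      # edit the copy
the_data[:, columns_to_replace] = sub   # direct assignment writes into the_data

print(the_data)
```

The zero in column 2 (last row) is left alone because that column is not in columns_to_replace.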

edited Jun 20, 2020 at 9:12

CommunityBot

answered Aug 9, 2018 at 21:05

ForceBru

8 Comments

user3426943 Over a year ago

Is there a workaround to this that doesn't invoke a copy? Or is looping the only option?

ForceBru Over a year ago

@user3426943, looks like there is a workaround. Please see my edit.

ForceBru Over a year ago

@user3483203, it still allocates memory, it just doesn't initialize it (and you'll have to fill it in anyway, with ones and zeros).

hpaulj Over a year ago

the_data[(the_data==0)*mask]=-1 should also work. The key is to find a mask that combines the columns criteria and the 0's test, as you do with the boolean *.

ForceBru Over a year ago

@hpaulj, and it does work, indeed, and it also looks simpler. I was searching for a solution that would very clearly say: "I'm not creating a copy instead of a view", so my solution's a bit too verbose about that.
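
The Boolean-mask assignment from hpaulj's comment can be sketched as below. One assumption beyond the comments: the column mask here is 1-D (col_mask is an illustrative name) and relies on broadcasting across rows, so it needs one flag per column rather than a full-size mask.

```python
import numpy as np

the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])
columns_to_replace = [0, 1, 3]

# One flag per column; broadcasting stretches it over every row.
col_mask = np.zeros(the_data.shape[1], dtype=bool)
col_mask[columns_to_replace] = True

# Combine the value test with the column test, then assign in place.
the_data[(the_data == 0) & col_mask] = -1

print(the_data)
```

For Boolean arrays, & and * give the same result here; & simply reads more clearly as a logical "and".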
