Incoherence in complementary indices extracted from a np.array

Question

The problem is very simple: I have a vector of indices from which I want to extract a randomly chosen subset and its complement. So I wrote the following code:

import numpy as np    
vec = np.arange(0,25000)
idx = np.random.choice(vec,5000)
idx_r = np.delete(vec,idx)

However, when I print the lengths of vec, idx, and idx_r, they do not add up: len(idx) + len(idx_r) is greater than len(vec). For example, the following code:

print(len(idx))
print(len(idx_r))
print(len(idx_r)+len(idx))
print(len(vec))

returns:

5000
20462
25462
25000

Python version is 3.8.1 and GCC is 9.2.0.

user2357112 · Accepted Answer · 2020-05-02 21:12:31Z


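np.random.choice has a keyword argument replace, whose default value is True, so the same index can be drawn more than once. np.delete removes each duplicated index only once, which is why idx_r ends up with more than 20000 elements. If you set replace=False, you will get the desired result.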

import numpy as np

vec = np.arange(0, 25000)

idx = np.random.choice(vec, 5000, replace=False)

idx_r = np.delete(vec, idx)

print([len(item) for item in (vec, idx, idx_r)])

Out:

[25000, 5000, 20000]
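To see where the extra elements in the original output came from, it can help to count the distinct indices in a sample drawn with replacement. The following is a minimal sketch, not part of the original answer; the seeded generator and the name n_unique are assumptions added for illustration.

```python
import numpy as np

# Seed chosen arbitrarily so the run is reproducible (an assumption, not from the question).
rng = np.random.default_rng(0)

vec = np.arange(25000)

# Sampling WITH replacement (the default), as in the question: duplicates are possible.
idx = rng.choice(vec, 5000)

# np.delete drops each distinct position once, no matter how often it appears in idx.
idx_r = np.delete(vec, idx)

# Number of distinct indices actually drawn.
n_unique = len(np.unique(idx))

# The lengths balance once duplicates are discounted.
print(n_unique + len(idx_r) == len(vec))  # prints True
```

The gap between len(idx) and n_unique is the number of repeated draws, which is the amount by which len(idx) + len(idx_r) overshoots len(vec).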

Note, however, that numpy.random.choice with replace=False is extremely inefficient due to poor implementation choices that it is stuck with for backward compatibility: it generates a permutation of the whole input just to take a small sample. You should use the new Generator API instead, which doesn't have this issue:

rng = np.random.default_rng()

idx = rng.choice(vec, 5000, replace=False)
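Putting the pieces together, here is a minimal end-to-end sketch using the Generator API. The seed and the sanity checks at the end are additions for illustration and are not part of the original answer.

```python
import numpy as np

# Seed chosen arbitrarily for reproducibility (an assumption).
rng = np.random.default_rng(42)

vec = np.arange(25000)

# Draw 5000 distinct values without replacement.
idx = rng.choice(vec, 5000, replace=False)

# np.delete interprets idx as POSITIONS in vec; that is fine here because vec[i] == i.
idx_r = np.delete(vec, idx)

# Sanity checks: the two sets partition vec.
assert len(idx) + len(idx_r) == len(vec)
assert np.intersect1d(idx, idx_r).size == 0

print([len(item) for item in (vec, idx, idx_r)])  # prints [25000, 5000, 20000]
```

One caveat: this relies on the values in vec coinciding with their positions. If vec held arbitrary values, one option would be to sample positions with rng.choice(len(vec), 5000, replace=False) and index vec with them, or to compute the complement by value with np.setdiff1d(vec, idx), assuming the values in vec are unique.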

edited May 2, 2020 at 21:12

user2357112


answered May 2, 2020 at 21:04

dmmfll



1 Comment

dmmfll Over a year ago

You're welcome. I'm just learning Numpy myself. Thanks for posting this. I didn't know about the methods you are using until now. Please mark it as the correct answer if it solved your issue.
