I have an array of random numbers. I want to change only some of the elements based on a probability of say 0.07. Currently I am doing this using a for loop to iterate over every element. Is there a better way of doing this?
2 Answers
If the array in question is called a, you can select each of its values independently with probability 0.07 (so about 7% of them on average) by
a[numpy.random.rand(*a.shape) < 0.07]
I don't know how you want to change these values. To multiply them by two, just do
a[numpy.random.rand(*a.shape) < 0.07] *= 2.0
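For reference, here is a minimal self-contained sketch of the mask approach. The array size, the seed, and the use of NumPy's newer `default_rng` generator (instead of `numpy.random.rand`) are my own assumptions for illustration, not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)       # seed chosen arbitrarily, for reproducibility
a = rng.random(100_000)              # stand-in for the asker's array of random numbers
original = a.copy()

# Boolean mask: each element is selected independently with probability 0.07
mask = rng.random(a.shape) < 0.07

# Modify only the selected elements, in place
a[mask] *= 2.0

print("fraction selected:", mask.mean())
```

Keeping the mask in a variable makes it easy to check afterwards which elements were touched, and it works for arrays of any shape.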
7 Comments
random.sample(xrange(len(a)),int(len(a)*0.07)) as indices, or so, modulo rounding.
Sven's answer is elegant. However, it is much faster to pick the elements you want to change with
n = numpy.random.binomial(len(a), 0.07)
a[numpy.random.randint(0, len(a), size=n)] *= 2.0
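The first expression determines how many elements you want to sample (n is an integer between 0 and len(a), and on average 0.07 * len(a)); the second generates exactly that many indices. (Note, however, that you might get the same index several times. A repeated index is still only modified once, so slightly fewer than n distinct elements may end up changed.) As written, this assumes a is one-dimensional.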
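If the repeated indices matter, one possible variation is to draw the indices without replacement. The sketch below is my own assumption of how that could look, using `Generator.choice` with `replace=False`; the array of ones, the seed, and the names `idx_with_repeats`, `idx_unique`, and `b` are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # seed chosen arbitrarily
a = np.ones(1000)                    # ones make the changed elements easy to count
p = 0.07

# Number of elements to change: binomial, mean p * len(a)
n = rng.binomial(len(a), p)

# Indices drawn with replacement (as in the answer): duplicates are possible
idx_with_repeats = rng.integers(0, len(a), size=n)

# Indices drawn without replacement: exactly n distinct positions
idx_unique = rng.choice(len(a), size=n, replace=False)
a[idx_unique] *= 2.0

# In-place fancy indexing applies the operation once per distinct index
b = np.ones(3)
b[np.array([0, 0, 2])] *= 2.0        # index 0 is listed twice but doubled once
```

I have not timed the without-replacement draw against the `randint` version, so treat it as a correctness fix rather than a speed claim.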
The speed difference compared with the mask-based version, written here for a general selection probability p,
a[numpy.random.rand(len(a)) < p]
becomes small as p approaches 1, but for small p, it might be a factor of 10 or more.
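To check the speed claim on your own machine, a rough timing sketch could look like the following. The array size, the value of p, the repeat count, and the function names `mask_method` and `index_method` are my assumptions; actual ratios will depend on hardware and NumPy version.

```python
import timeit
import numpy as np

a = np.random.rand(1_000_000)
p = 0.01                              # small p, where the gap should be largest

def mask_method():
    # One uniform draw per element, then boolean selection
    return a[np.random.rand(len(a)) < p]

def index_method():
    # Draw the count first, then only that many indices (may contain repeats)
    n = np.random.binomial(len(a), p)
    return a[np.random.randint(0, len(a), size=n)]

t_mask = timeit.timeit(mask_method, number=20)
t_index = timeit.timeit(index_method, number=20)
print(f"mask: {t_mask:.4f} s, index: {t_index:.4f} s")
```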