I am trying to randomly generate numbers from a normal distribution. I want to replace any number that does not fall within two standard deviations of the mean, so that in the end every number in the array falls within this range. This is the code that I have so far:

mean = 150
COV = 0.4
sd = COV*mean
upper_limit = mean + 2*sd
lower_limit = mean - 2*sd
capacity = np.random.normal(mean, sd, size = (1,96))
for x in capacity:
    while x > upper_limit:
        x = np.random.normal(mean, sd, size = 1)
    while x < lower_limit:
        x = np.random.normal(mean, sd, size = 1)

However, I get this error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Can anyone help with how to fix this?

  • Is it that capacity is a matrix, so each x in capacity is a row, but you're treating it like an individual element? Commented Jan 7, 2021 at 17:38
  • stackoverflow.com/questions/18441779/… Commented Jan 7, 2021 at 17:41
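
To make the first comment concrete, here is a minimal sketch (the seed and the threshold 270 are assumptions chosen only for the example). It shows that iterating over a (1, 96) array yields a single row of 96 values, and that using a comparison on that row as a condition raises the error from the question.

```python
import numpy as np

np.random.seed(0)  # seed chosen only so the sketch is repeatable
capacity = np.random.normal(150, 60, size=(1, 96))

# Iterating over a 2-D array walks its first axis, so there is exactly
# one "x" here, and it is the whole row rather than a single number.
rows = [x for x in capacity]
row_shapes = [x.shape for x in rows]

# A comparison on a row gives 96 booleans; using that as a condition
# (as `while x > upper_limit` does) raises ValueError.
try:
    if rows[0] > 270:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(row_shapes)
print(error_message)
```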

3 Answers

Don't iterate through a NumPy array to do something to each of its elements. Much of the point of using NumPy is to replace Python-level loops with operations on the whole array, which is considerably faster.

To find which values in capacity are greater than upper_limit, compare the whole array at once. This gives a boolean array with the same shape as capacity:

capacity > upper_limit

Then, you can get the indices of those items this way:

too_high_indices = np.where(capacity > upper_limit)
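
As a small illustration of the two steps above (the array values are made up for the example), the comparison yields a boolean array with the same shape as the input, and np.where returns a tuple with one index array per dimension. For a 2-D array that tuple has length 2 however many elements matched, so the number of matches is the size of one of its arrays.

```python
import numpy as np

upper_limit = 270
capacity = np.array([[100.0, 300.0, 150.0, 280.0, 290.0]])  # shape (1, 5)

mask = capacity > upper_limit   # boolean array, same shape as capacity
indices = np.where(mask)        # tuple: (row indices, column indices)

n_dims = len(indices)           # number of dimensions, not number of matches
n_matches = indices[0].size     # number of elements above the limit

print(mask)
print(indices)
print(n_dims, n_matches)
```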

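Then, you can generate new random values to assign to all those positions. Because np.where returns a tuple with one index array per dimension, count the matches with too_high_indices[0].size rather than len(too_high_indices), e.g.

capacity[too_high_indices] = np.random.normal(mean, sd, size=too_high_indices[0].size)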

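In the end, you repeat this until nothing is above the limit:

# np.where returns a tuple of index arrays (one per dimension);
# the size of the first one is the number of out-of-range elements.
too_high_indices = np.where(capacity > upper_limit)
while too_high_indices[0].size > 0:
    capacity[too_high_indices] = np.random.normal(
        mean, sd, size=too_high_indices[0].size)
    too_high_indices = np.where(capacity > upper_limit)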

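Then handle the lower limit. Note that values redrawn for the lower limit can land above the upper limit again, so it is safer to test both bounds in one condition, (capacity > upper_limit) | (capacity < lower_limit), than to run two separate loops.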

This way, it will be relatively fast even if the size grows.
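
One possible way to put the pieces together, as a sketch rather than a definitive implementation: a single boolean mask covers both limits, so a redraw for one limit cannot slip past the other. The function name resample_within and its rng parameter are hypothetical, introduced only for this example, and it uses np.random.default_rng in place of the np.random.normal calls used above.

```python
import numpy as np

def resample_within(mean, sd, lower, upper, size, rng=None):
    """Draw normal samples, redrawing any that fall outside [lower, upper]."""
    rng = np.random.default_rng() if rng is None else rng
    values = rng.normal(mean, sd, size=size)
    out_of_range = (values < lower) | (values > upper)
    while out_of_range.any():
        # Redraw only the offending elements, then re-check both limits.
        values[out_of_range] = rng.normal(mean, sd, size=out_of_range.sum())
        out_of_range = (values < lower) | (values > upper)
    return values

mean = 150
sd = 0.4 * mean
capacity = resample_within(mean, sd, mean - 2 * sd, mean + 2 * sd,
                           size=(1, 96), rng=np.random.default_rng(0))
print(capacity.shape, capacity.min(), capacity.max())
```

Each pass redraws only the elements that are still out of range, so the loop usually finishes after a few iterations: with limits at two standard deviations, only about 5% of draws need replacing on each pass.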

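I think you should change the size parameter from (1, 96) to 96. With size=(1, 96), capacity has shape (1, 96), so each x in your loop is a whole row of shape (96,). Comparing that row to a single float gives an array of booleans, which cannot be used as a while condition.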

2 Comments

just validate my answer then plz :) Glad it helped
I just realized that although changing the size allowed the code to run, it did not replace any values outside of two standard deviations away
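
A likely explanation for the behavior described in the last comment, sketched with made-up values: inside a loop of the form `for x in capacity`, assigning to x only rebinds the loop variable and leaves the array untouched, whereas assigning through an index writes into the array.

```python
import numpy as np

capacity = np.array([100.0, 300.0, 150.0])
upper_limit = 270

# Rebinding the loop variable: the array is left unchanged.
rebound = capacity.copy()
for x in rebound:
    if x > upper_limit:
        x = 0.0  # only changes the local name x

# Assigning through an index: the array element is actually replaced.
indexed = capacity.copy()
for i in range(indexed.size):
    if indexed[i] > upper_limit:
        indexed[i] = 0.0

print(rebound)
print(indexed)
```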

# Index into the array so each redraw is written back in place,
# and check both limits in a single condition.
# changed = set()  # optional: track which indices were redrawn
for i in range(capacity.shape[1]):
    while capacity[0, i] > upper_limit or capacity[0, i] < lower_limit:
        capacity[0, i] = np.random.normal(mean, sd)
        # changed.add(i)
# print(capacity)
# print(changed)
