
Is there a way to get a specific range of values from the results of numpy.random.normal() without computing all the random numbers first, so that only the values within the requested range are computed?

Normal application

random_numbers = numpy.random.normal(0, 1, 1000)

What I want is to get ranges of these random_numbers without computing them all first:

first_100_random_numbers = the first 100 values
300th_400th_random_numbers = values 300 - 400
  • Why "without computing it all first"? Commented Jun 27, 2019 at 5:37
  • So that a memory error does not occur Commented Jun 27, 2019 at 5:42
  • In my case, the size will be billions of values Commented Jun 27, 2019 at 5:43
  • The current architecture of the application chunks large-scale data and works on the range of each chunk. But I can't find a way to chunk the numpy.random.normal results Commented Jun 27, 2019 at 5:47
  • I have already done the chunking for numpy.interpolate and numpy.linspace, but I don't have any idea for numpy.random.normal Commented Jun 27, 2019 at 5:48

2 Answers


If you generate the random numbers one at a time, you can just keep track of whether they increase the max or decrease the min value. You will still have to compute all the values, but you won't run into a memory issue since you only ever have to store three numbers (max, min, and the latest random number).

import numpy as np

max_ = 0
min_ = 0
for i in range(1000):
    # draw one value at a time so only three numbers are ever held in memory
    new_number = np.random.normal(0, 1, 1)
    if new_number > max_:
        max_ = new_number
    if new_number < min_:
        min_ = new_number
range_ = max_ - min_
print(range_)

To speed up the computation you can do larger blocks at a time. If you want a run with a billion numbers, you can calculate a million at a time and run the loop a thousand times. Modified code and timing results are below:

import numpy as np
import time

max_ = 0
min_ = 0
start = time.time()
for i in range(1000):
    # a million numbers per iteration, a thousand iterations: a billion draws in total
    new_array = np.random.normal(0, 1, 1000000)
    new_max = np.max(new_array)
    new_min = np.min(new_array)
    if new_max > max_:
        max_ = new_max
    if new_min < min_:
        min_ = new_min
range_ = max_ - min_
print('Range ', range_)
end = time.time()
Time = end - start
print('Time ', Time)


Range 12.421138327443614
Time  36.7797749042511

Comparing the results of running one random number at a time vs. ten at a time to see if results are significantly different (each one run three times)

One at a time:

new_numbers=[]
for i in range(10):
    new_numbers.append(np.random.normal(0,1,1)[0])
print(new_numbers)
[-1.0145267697638918, -1.1291506481372602, 1.3622608858856742, 0.16024562390261188, 1.062550043104352, -0.4160329548439351, -0.05464203711515494, -0.7416629430695286, 0.35066071936940363, 0.06498345663995017]
[-1.5632632129838873, -1.0314300796946991, 0.5014408178125339, -0.37806631815396563, 0.45396918178048334, -0.6630479858064194, -0.47097483551189306, 0.40734077106402056, 1.1167819302886144, -0.6594075991871857]
[0.4448783416507262, 0.20160041940565818, -0.4781753245124433, -0.7130750653981222, -0.8035305391034386, -0.41543648761183466, 0.25166027175788847, -0.7051417978559822, 0.6017351178904993, -1.3719596304190458]

Ten at a time:

np.random.normal(0,1,10)
array([-1.79498658,  0.89073416, -0.25302627, -0.17237986, -0.38988131,
       -0.93635678,  0.28824899,  0.52675642,  0.86195635, -0.89584341])
array([ 1.41602405,  1.33800937,  1.87837334,  0.2082182 , -0.25116545,
        1.37953259,  0.34445565, -0.33647043, -0.24414261, -0.14505838])
array([ 0.43848371, -0.60967936,  1.2902231 ,  0.44589728, -2.39725248,
       -1.42715386, -1.0627627 ,  1.15998483,  0.96427742, -2.01062938])

7 Comments

Sorry, can you please explain further?
Rather than calculating all the random numbers at once in one line, e.g. np.random.normal(0,1,1000), you write a for loop that calculates one random variable a thousand times over. During this loop, you keep track of the max and min values generated. At first these start at zero, but if the new number is greater than the max, or lower than the min, the max or min is redefined as the new number. This way, when the loop is finished, you have stored the highest and lowest numbers generated without saving every number in memory.
Can't do this, since the scale in np.random.normal is the shape of the results. If it is set to 1, then any number that is close to mean = 0 will be its result. So the output of this will be inaccurate vs np.random.normal(0,1,1000)
In your example above you used 1 as the scale. When I run a for loop with np.random.normal(0,1,1) versus np.random.normal(0,1,10) I get very similar results. Is there something I'm missing?
Edited my answer to address both of your concerns. If you can handle a million numbers in memory then it only takes 36 seconds to run. There are probably improvements you could make to this, but I think I've shown the point

Maybe just draw them from an np.random.RandomState:

import numpy as np

# random state
RS = np.random.RandomState(seed = 0) 

# first 10 elements
print(RS.normal(0, 1, 10))

# another 20
print(RS.normal(0, 1, 20))

It is always going to produce the same random numbers for a given seed, and consecutive calls simply continue the same stream.

first_100_random_numbers = RS.normal(0, 1, 100)
numbers_100_to_200 = RS.normal(0, 1, 100)
numbers_200_to_400 = RS.normal(0, 1, 200)
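
For example, a minimal sketch of re-creating just the 300th-400th values by drawing from a seeded RandomState in chunks, so the full array is never held in memory (the helper name, chunk size, and seed here are just illustrative assumptions, not part of the answer above):

import numpy as np

def normal_slice(start, stop, seed=0, chunk=1000):
    """Return values start..stop of the seeded normal stream, drawing `chunk` values at a time."""
    rs = np.random.RandomState(seed=seed)
    pieces = []
    position = 0
    while position < stop:
        block = rs.normal(0, 1, min(chunk, stop - position))
        # keep only the part of this block that overlaps the requested slice
        lo = max(start - position, 0)
        if lo < len(block):
            pieces.append(block[lo:stop - position])
        position += len(block)
    return np.concatenate(pieces)

# same values as np.random.RandomState(0).normal(0, 1, 400)[300:400],
# but only one chunk is in memory at a time
values_300_to_400 = normal_slice(300, 400)
print(values_300_to_400[:5])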

Otherwise you could think about using a generator.
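
For instance, a rough sketch of such a generator, yielding the stream in fixed-size blocks so that only one block is ever in memory at a time (the block size and seed are arbitrary choices here):

import numpy as np

def normal_chunks(total, chunk_size=1000000, seed=0):
    """Yield blocks of standard-normal values until `total` values have been produced."""
    rs = np.random.RandomState(seed=seed)
    produced = 0
    while produced < total:
        n = min(chunk_size, total - produced)
        yield rs.normal(0, 1, n)
        produced += n

# e.g. track the overall min and max of the whole stream, one block at a time
running_min, running_max = np.inf, -np.inf
for block in normal_chunks(10000000, chunk_size=1000000):
    running_min = min(running_min, block.min())
    running_max = max(running_max, block.max())
print(running_min, running_max)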

5 Comments

I used numpy.random.normal for the sole purpose of the central limit theorem. The question is how to generate numpy.random.normal output at a very large scale.
Already solved the memory error for numpy.linspace and numpy.interpolate, but having some problems with numpy.random.normal.
Sorry, I don't understand your problem. See the small edit above and maybe you can tell me why it's not satisfying your needs.
The way you do it computes all the randomly generated numbers first. In my case the size is over a billion; with that, a memory error will occur, since it exceeds the memory limits.
What I want to know is a workaround for the memory error.
