
Is there a way to get a specific range of values from the results of numpy.random.normal() without computing all the random numbers first, so that only the values within the requested range are computed?

Normal application

random_numbers = numpy.random.normal(0, 1, 1000)

What I want is to get ranges of these random_numbers without computing them all first:

first_100_random_numbers = the first 100 values
300th_400th_random_numbers = values 300 - 400
  • Why "without computing it all first"? Commented Jun 27, 2019 at 5:37
  • So that a memory error does not occur Commented Jun 27, 2019 at 5:42
  • In my case, the size will be billions of values Commented Jun 27, 2019 at 5:43
  • The current architecture of the application chunks large-scale data and works on the range of each chunk. But I can't find a way to chunk the numpy.random.normal results Commented Jun 27, 2019 at 5:47
  • I have already done the chunking for numpy.interpolate and numpy.linspace, but I don't have any idea for numpy.random.normal Commented Jun 27, 2019 at 5:48

2 Answers


If you generate the random numbers one at a time, you can just keep track of whether they increase the max or decrease the min value. You will still have to compute all the values, but you won't run into a memory issue since you only ever have to store three numbers (max, min, and the latest random number).

import numpy as np

max_ = 0
min_ = 0
for i in range(1000):
    # draw one value at a time so only three numbers are ever held in memory
    new_number = np.random.normal(0, 1, 1)
    if new_number > max_:
        max_ = new_number
    if new_number < min_:
        min_ = new_number
range_ = max_ - min_
print(range_)

To speed up the computation you can do larger blocks at a time. If you want a run with a billion numbers, you can calculate a million at a time and run the loop a thousand times. Modified code and timing results are below:

import numpy as np
import time

max_ = 0
min_ = 0
start = time.time()
for i in range(1000):
    # a million numbers per iteration, a thousand iterations: a billion draws in total
    new_array = np.random.normal(0, 1, 1000000)
    new_max = np.max(new_array)
    new_min = np.min(new_array)
    if new_max > max_:
        max_ = new_max
    if new_min < min_:
        min_ = new_min
range_ = max_ - min_
print('Range ', range_)
end = time.time()
Time = end - start
print('Time ', Time)


Range 12.421138327443614
Time  36.7797749042511

Comparing the results of running one random number at a time vs. ten at a time to see if results are significantly different (each one run three times)

One at a time:

new_numbers=[]
for i in range(10):
    new_numbers.append(np.random.normal(0,1,1)[0])
print(new_numbers)
[-1.0145267697638918, -1.1291506481372602, 1.3622608858856742, 0.16024562390261188, 1.062550043104352, -0.4160329548439351, -0.05464203711515494, -0.7416629430695286, 0.35066071936940363, 0.06498345663995017]
[-1.5632632129838873, -1.0314300796946991, 0.5014408178125339, -0.37806631815396563, 0.45396918178048334, -0.6630479858064194, -0.47097483551189306, 0.40734077106402056, 1.1167819302886144, -0.6594075991871857]
[0.4448783416507262, 0.20160041940565818, -0.4781753245124433, -0.7130750653981222, -0.8035305391034386, -0.41543648761183466, 0.25166027175788847, -0.7051417978559822, 0.6017351178904993, -1.3719596304190458]

Ten at a time:

np.random.normal(0,1,10)
array([-1.79498658,  0.89073416, -0.25302627, -0.17237986, -0.38988131,
       -0.93635678,  0.28824899,  0.52675642,  0.86195635, -0.89584341])
array([ 1.41602405,  1.33800937,  1.87837334,  0.2082182 , -0.25116545,
        1.37953259,  0.34445565, -0.33647043, -0.24414261, -0.14505838])
array([ 0.43848371, -0.60967936,  1.2902231 ,  0.44589728, -2.39725248,
       -1.42715386, -1.0627627 ,  1.15998483,  0.96427742, -2.01062938])

7 Comments

Sorry, can you please explain further?
Rather than calculating all the random numbers at once in one line, e.g. np.random.normal(0,1,1000), you write a for loop that calculates one random variable a thousand times over. During this loop, you keep track of the max and min values generated. At first these start at zero, but if the new number is greater than the max, or lower than the min, the max or min is redefined as the new number. This way, when the loop is finished, you have stored the highest and lowest numbers generated without saving every number in memory.
Can't do this, since the scale in np.random.normal is the shape of the results. If it is set to 1, then any number that is close to mean = 0 will be its result. So the output of this will be inaccurate vs np.random.normal(0,1,1000)
In your example above you used 1 as the scale. When I run a for loop with np.random.normal(0,1,1) versus np.random.normal(0,1,10) I get very similar results. Is there something I'm missing?
Edited my answer to address both of your concerns. If you can handle a million numbers in memory then it only takes 36 seconds to run. There are probably improvements you could make to this, but I think I've shown the point

Maybe just draw them from an np.random.RandomState:

import numpy as np

# random state
RS = np.random.RandomState(seed = 0) 

# first 10 elements
print(RS.normal(0, 1, 10))

# another 20
print(RS.normal(0, 1, 20))

It is always going to produce the same random numbers for a given seed, and consecutive calls simply continue the same stream.

first_100_random_numbers = RS.normal(0, 1, 100)
numbers_100_to_200 = RS.normal(0, 1, 100)
numbers_200_to_400 = RS.normal(0, 1, 200)
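
For example, a minimal sketch of re-creating just the 300th-400th values by drawing from a seeded RandomState in chunks, so the full array is never held in memory (the helper name, chunk size, and seed here are just illustrative assumptions, not part of the answer above):

import numpy as np

def normal_slice(start, stop, seed=0, chunk=1000):
    """Return values start..stop of the seeded normal stream, drawing `chunk` values at a time."""
    rs = np.random.RandomState(seed=seed)
    pieces = []
    position = 0
    while position < stop:
        block = rs.normal(0, 1, min(chunk, stop - position))
        # keep only the part of this block that overlaps the requested slice
        lo = max(start - position, 0)
        if lo < len(block):
            pieces.append(block[lo:stop - position])
        position += len(block)
    return np.concatenate(pieces)

# same values as np.random.RandomState(0).normal(0, 1, 400)[300:400],
# but only one chunk is in memory at a time
values_300_to_400 = normal_slice(300, 400)
print(values_300_to_400[:5])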

Otherwise you could think about using a generator.
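
For instance, a rough sketch of such a generator, yielding the stream in fixed-size blocks so that only one block is ever in memory at a time (the block size and seed are arbitrary choices here):

import numpy as np

def normal_chunks(total, chunk_size=1000000, seed=0):
    """Yield blocks of standard-normal values until `total` values have been produced."""
    rs = np.random.RandomState(seed=seed)
    produced = 0
    while produced < total:
        n = min(chunk_size, total - produced)
        yield rs.normal(0, 1, n)
        produced += n

# e.g. track the overall min and max of the whole stream, one block at a time
running_min, running_max = np.inf, -np.inf
for block in normal_chunks(10000000, chunk_size=1000000):
    running_min = min(running_min, block.min())
    running_max = max(running_max, block.max())
print(running_min, running_max)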

5 Comments

I used numpy.random.normal for the sole purpose of the central limit theorem. The question is how to generate numpy.random.normal output at a very large scale.
Already solved the memory error for numpy.linspace and numpy.interpolate, but having some problems with numpy.random.normal.
Sorry, I don't understand your problem. See the small edit above and maybe you can tell me why it's not satisfying your needs.
The way you do it computes all the randomly generated numbers first. In my case the size is over a billion; with that, a memory error will occur, since it exceeds the memory limits.
What I want to know is a workaround for the memory error.
