Python Numpy: Random number in a loop

Question

I have the following code, which I run in a Jupyter notebook:

for j in range(timesteps):    
    a_int = np.random.randint(largest_number/2) # int version

and I get random numbers. But when I move part of the code into a function, I start to receive the same number in each iteration:

def create_train_data():        
    np.random.seed(seed=int(time.time()))     
    a_int = np.random.randint(largest_number/2) # int version
    return a

for j in range(timesteps):    
    c = create_train_data()

Why does this happen and how can I fix it? I think it may be because of how Jupyter Notebook handles processes.

This is because you're using Jupyter which caches results

– ninesalt, Commented Sep 18, 2018 at 16:14

Hans Musgrave · Accepted Answer · 2018-09-18 16:18:28Z

5

The offending line of code is

np.random.seed(seed=int(time.time()))

Since the loop completes fairly quickly, int(time.time()) truncates the time to whole seconds, so every iteration re-seeds the generator with the same number and the first draw after seeding is identical each time. If you really want to manually set the seed, the following is a more robust approach.

import time
import numpy as np

def create_train_data():
    a_int = np.random.randint(largest_number // 2) # int version
    return a_int

# Seed once, before the loop, not inside the function
np.random.seed(seed=int(time.time()))
for j in range(timesteps):
    c = create_train_data()

Note how the seed is set once and then used for the entire loop: each call to np.random.randint advances the generator's internal state, so successive draws differ without the seed being reset.
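To make the difference concrete, here is a minimal sketch comparing the two patterns. The helper names draw_reseeding and draw_seed_once, and the fixed seed value standing in for int(time.time()), are made up for illustration and are not from the question.

```python
import numpy as np

def draw_reseeding(seed, n, high):
    """Re-seed before every draw: each draw restarts the same sequence."""
    values = []
    for _ in range(n):
        np.random.seed(seed)
        values.append(int(np.random.randint(high)))
    return values

def draw_seed_once(seed, n, high):
    """Seed once, then let the generator state advance between draws."""
    np.random.seed(seed)
    return [int(np.random.randint(high)) for _ in range(n)]

# Hypothetical stand-in for int(time.time()) staying constant during a fast loop
same_seed = 1537287494

reseeded = draw_reseeding(same_seed, 5, 10**6)
seeded_once = draw_seed_once(same_seed, 5, 10**6)
print(reseeded)     # the same number repeated
print(seeded_once)  # starts with that number, then moves on
```

The first list repeats one value because every draw is the first draw after seeding; the second list begins with that same value and then continues through the sequence.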

Note that numpy already takes care of a pseudo-random seed. You're not gaining more random results by using it. A common reason for manually setting the seed is to ensure reproducibility. You set the seed at the start of your program (top of your notebook) to some fixed integer (I see 42 in a lot of tutorials), and then all the calculations follow from that seed. If somebody wants to verify your results, the stochasticity of the algorithms can't be a confounding factor.
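As a rough illustration of the reproducibility use case, the sketch below seeds with a fixed integer and checks that two runs give the same draws. The function run_experiment and its parameters are hypothetical and only there for the example.

```python
import numpy as np

def run_experiment(seed, n=5, high=100):
    """Seed once at the start, then draw n integers in [0, high)."""
    np.random.seed(seed)
    return np.random.randint(high, size=n)

first = run_experiment(42)
second = run_experiment(42)

# Same seed, same sequence of draws
print(np.array_equal(first, second))  # True
```

Anyone re-running the notebook with the same seed should see the same numbers, which is the whole point of fixing it.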

answered Sep 18, 2018 at 16:18

Hans Musgrave


3 Comments

Vladimir Shebuniayeu Over a year ago

No, that's not right. I originally ran the code without the seed, which is why I added it, and I still didn't get the expected result.

Jeremy Over a year ago

@Vladimircape, when I comment out the random seed line I get random results

Vladimir Shebuniayeu Over a year ago

@Jeremy Hmm, very strange. When I try it in a new default notebook it also works fine, but in my full example it doesn't. I will try to provide the whole code.

SRT HellKitty · Accepted Answer · 2018-09-18 16:25:39Z

0

The other answers are correct in saying that it is because of the seed. If you look at the documentation from SciPy, you will see that seeds are used to create a predictable random sequence. However, I think the answers to another question about seeds give a better overview of what a seed does and why/where to use it: "What does numpy.random.seed(0) do?"

answered Sep 18, 2018 at 16:25

SRT HellKitty


Joooeey · Accepted Answer · 2018-09-18 18:05:13Z

0

Hans Musgrave's answer is great if you are happy with pseudo-random numbers. Pseudo-random numbers are good for most applications but they are problematic if used for cryptography.

The standard approach for getting one truly random number is seeding the random number generator with the system time before pulling the number, like you tried. However, as Hans Musgrave pointed out, if you cast the time to int, you get the time in seconds which will most likely be the same throughout the loop. The correct solution to seed the RNG with a time is:

def create_train_data():
    np.random.seed()  # no argument: NumPy picks a fresh seed itself
    a_int = np.random.randint(largest_number // 2) # int version
    return a_int

This works because Numpy already uses the computer clock or another source of randomness for the seed if you pass no arguments (or None) to np.random.seed:

Parameters:
    seed : {None, int, array_like}, optional
        Random seed used to initialize the pseudo-random number generator. Can be any integer between 0 and 2**32 - 1 inclusive, an array (or other sequence) of such integers, or None (the default). If seed is None, then RandomState will try to read data from /dev/urandom (or the Windows analogue) if available or seed from the clock otherwise.
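For what it's worth, the same "let NumPy pick the seed" idea can be written without re-seeding on every call. The sketch below is an assumption on my part, not something from this thread: it uses np.random.default_rng, which exists in NumPy 1.17 and later, and passes largest_number as a parameter purely for illustration.

```python
import numpy as np

# One generator for the whole session; with no seed given,
# NumPy pulls fresh entropy from the operating system.
rng = np.random.default_rng()

def create_train_data(largest_number):
    # The upper bound is exclusive, as with np.random.randint
    return int(rng.integers(largest_number // 2))

samples = [create_train_data(1000) for _ in range(10)]
print(samples)
```

The output is still pseudo-random; only the starting point comes from the OS.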

It all depends on your application though. Do note the warning in the docs:

Warning The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.
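If the numbers really are needed for security purposes, a minimal sketch with the standard-library secrets module that the warning points to could look like this. The function name secure_random_int and the value of largest_number are placeholders for the example.

```python
import secrets

def secure_random_int(upper_bound):
    """Return an int in [0, upper_bound) from the OS's secure random source."""
    return secrets.randbelow(upper_bound)

largest_number = 1000  # hypothetical value, not from the question
values = [secure_random_int(largest_number // 2) for _ in range(5)]
print(values)
```

No seeding is involved here at all, so there is nothing to reset inside the loop.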

edited Sep 18, 2018 at 18:05

answered Sep 18, 2018 at 17:08

Joooeey


6 Comments

user2699 Over a year ago

No. Just no. The part of the docs you've included in your answer even states it: "optional Random seed used to initialize the pseudo-random number generator". No amount of cleverly chosen seeds will change the fact that you're using a pseudo random number generator.

Joooeey Over a year ago

You're only pulling a single number from the pseudo-random number generator though.

user2699 Over a year ago

Still no. The outputs depend entirely on the seed. If you have some source of numbers that are completely unpredictable to seed your pseudo random number generator with every time you need a single number then you don't actually need a pseudo random number generator.

Joooeey Over a year ago

Of course that's true. However, it's gonna be quite a huge amount of effort to implement this in a cross-platform manner, so why not use a library that does it for you. It's already implemented in the np.random library. That library has its weaknesses (/dev/urandom is not guaranteed to be random if used a lot) but it should be better than using the PRNG sequence.

user2699 Over a year ago

Well, the main reason (other than that being an ugly hack) would be the bit of documentation in np.random that states "Warning: The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module."
