I am running n instances of the same code in parallel and want each instance to use independent random numbers.
For this purpose, before I start the parallel computations I create a list of random states, like this:
import numpy.random as rand
rand_states = []
for j in range(n):
    rand.seed(rand.randint(2**32 - 1))
    rand_states.append(rand.get_state())
I then pass one element of rand_states to each parallel process, in which I basically do
rand.set_state(rand_state)
data = rand.rand(10,10)
To make things reproducible, I run np.random.seed(0) at the very beginning, before creating the state list.
Does this work like I hope it does? Is this the proper way to achieve it?
(I cannot just store the data arrays themselves beforehand, because (i) there are a lot of places where random numbers are generated in the parallel processes and (ii) that would introduce unnecessary logic coupling between the parallel code and the managing nonparallel code and (iii) in reality I run M slices across N<M processors and the data for all M slices is too big to store)
Or should I be using RandomState objects instead?
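For comparison, the RandomState-based variant I am wondering about would look something like this (the seeding scheme just mirrors the one above; the names are mine):

```python
import numpy as np

n = 4  # stand-in for the real number of instances
master = np.random.RandomState(0)  # reproducible source of child seeds

# one private generator per parallel instance;
# nothing touches the global numpy.random state
rngs = [np.random.RandomState(master.randint(2**32 - 1)) for j in range(n)]

# inside a parallel process, draw from that instance's own generator
data = rngs[0].rand(10, 10)
```

Each generator can be passed to its process (RandomState objects are picklable), so the parallel code never depends on the module-level state.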