Add control for random state #62

jpn-- · 2019-05-07T19:43:08Z

Are there any plans to include "random state" interface for any of the sampling functions (i.e. to guarantee stable output for testing)? Or: was this considered and explicitly not done for some reason?

If this isn't contra-indicated, would a pull request that implements this only partially be of interest? There's a lot of places where random state might be useful, and I don't have time to search through the whole codebase and find/implement them all, but I'd be willing to do so for a handful of "low hanging fruit" places that would be useful for the project I am working on. (I am going to do this anyhow, just want to know if I should isolate this as an independent PR)

quaquel · 2019-05-07T19:45:43Z

can't this be achieved by simply setting the numpy random seed globally?

I agree it would be a useful features

quaquel · 2019-05-07T19:54:57Z

So, I quickly checked the code. Sampling relies entirely on scipy.stats or numpy.random. Scipy.stats uses the global numpy random state, so setting the global numpy random state should be sufficient for reproducibility. Will try to test this Thursday.

Or do you want to control the RandomState of individual uncertain variables?

jpn-- · 2019-05-07T20:00:50Z

I was thinking for thread-safe operations, and to potentially regenerate individual random variables at a later time or on a different machine. This is pretty common for some of the other large scale simulators I work on, as there's a desire to have reproducible results not just in unit tests but in practical applications as well.

quaquel · 2019-05-07T20:04:08Z

I completely agree with the use case.

Thread safety is no concern for sampling: it only happens in the main thread. Farming out to other processes / threads happens after generating the experiments.

So the real question is whether having a global solution (probably via setting the numpy RandomState) is sufficient or whether you want to control individual distributions. The latter would require a modification to the Abstract Parameter class (init, and from_dist).

quaquel · 2019-05-08T06:04:50Z

So I ran a quick test using the inter temporal version of the lake problem and setting the global numpy random state. This works: it produces the exact same results. See the attached pdf for a quick proof of principle:
seed_test.pdf

Is this sufficient for your use case or not?

jpn-- · 2019-05-08T21:50:22Z

I've identified my real problem after pulling through the depths of the code -- and it turns out my real problems are with Platypus and not workbench. I don't have any desire to put in the effort to change that code as well, so I'm going to call this solution is good enough.

For posterity should any other users need to address this issue in the future:

At least part of the issue is that platypus uses the Python standard library random number generator, while the workbench uses the numpy random number generator. If you want any hope of generating more or less reproducible results when the workbench reaches into Platypus, you need to initialize both global random number generators:

import numpy.random
import random
numpy.random.seed(42)
random.seed(42)

quaquel · 2019-05-09T08:15:56Z

ah, had I know you were using platypus I would have been able to tell you this right away.

We might still consider having a set_random_state function on the workbench which sets both the numpy random state, the random state, and ensures that this is properly propagated to any subprocesses that are started.

Would that be a useful feature?

jpn-- · 2019-05-09T14:50:30Z

Modestly useful, not a high priority though.

quaquel · 2022-09-01T08:34:52Z

Started looking into this. This discussion on StackOverflow is quite useful. It seems the preferred approach is

from numpy.random import Generator, PCG64

# Case 3 (IMP) : Scipy uses an existing Random Generator which can being passed to Scipy based 
# random generator object
numpy_randomGen = Generator(PCG64(seed))
scipy_randomGen.random_state=numpy_randomGen
print(scipy_randomGen.rvs(n, p, size))
print(numpy_randomGen.binomial(n, p, size))
# prints
# [4 4 6 6 5 4 5 4 6 7]
# [4 8 6 3 5 7 6 4 6 4]
# This should be the case which we mostly want (DESIRABLE). If we are using both Numpy based and 
#Scipy based random number generators/function, then not only do we have no repetition of 
#random number sequences but also have reproducibility of results in this case.

scipy.stats is being used in samplers.py, and plotting_util.py
numpy.random is being used in prim.py, and samplers.py
random is being used in optimization.py, evaluators.py, and points.py

jpn-- closed this as completed May 8, 2019

quaquel reopened this May 9, 2019

quaquel added the enhancement label May 10, 2019

EwoutH assigned quaquel Aug 31, 2022

EwoutH added this to the 2.3.0 milestone Aug 31, 2022

quaquel mentioned this issue Oct 6, 2022

Run optimization for multiple seeds through evaluator.optimize #191

Open

EwoutH modified the milestones: 2.3.0, 2.4.0 Oct 25, 2022

quaquel modified the milestones: 2.4.0, 2.5.0 Apr 19, 2023

EwoutH modified the milestones: 2.5.0, 3.0 Dec 20, 2023

EwoutH changed the title ~~random state~~ Add control for random state Dec 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add control for random state #62

Add control for random state #62

jpn-- commented May 7, 2019

quaquel commented May 7, 2019

quaquel commented May 7, 2019

jpn-- commented May 7, 2019

quaquel commented May 7, 2019

quaquel commented May 8, 2019

jpn-- commented May 8, 2019

quaquel commented May 9, 2019

jpn-- commented May 9, 2019

quaquel commented Sep 1, 2022 •

edited

Loading

Add control for random state #62

Add control for random state #62

Comments

jpn-- commented May 7, 2019

quaquel commented May 7, 2019

quaquel commented May 7, 2019

jpn-- commented May 7, 2019

quaquel commented May 7, 2019

quaquel commented May 8, 2019

jpn-- commented May 8, 2019

quaquel commented May 9, 2019

jpn-- commented May 9, 2019

quaquel commented Sep 1, 2022 • edited Loading

quaquel commented Sep 1, 2022 •

edited

Loading