Update to new numpy RNG API #308
The basic idea here is that the Simulator handles the random state seeding, then passes the random number generator produced by the state handling code to the SimulationComponent to be used in simulating data. The changes added in this commit will not work until the SimulationComponent child classes are updated to handle this difference in how random number generation is handled.
This isn't a very elegant solution, since a one-line clause pops up everywhere because all classes effectively need an rng attribute to play nice with the Simulator, but this should work for now... I hope.
This class attribute helps the Simulator determine whether or not it should provide a given model with a random number generator when calling _iteratively_apply.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #308 +/- ##
==========================================
+ Coverage 93.43% 93.49% +0.06%
==========================================
Files 24 24
Lines 3243 3276 +33
Branches 711 712 +1
==========================================
+ Hits 3030 3063 +33
+ Misses 117 116 -1
- Partials 96 97 +1
Wow, great work @r-pascua, this looks really good.
This PR implements a shift to use the new API for generating random numbers with `numpy`. Now, rather than seed the global `numpy` random state prior to simulating a component, users should provide an `np.random.Generator` instance that is seeded as they desire. Folks who prefer to use the `Simulator` effectively don't need to change anything: the interaction with the `Simulator` looks the same in basically every case (there are some corner cases, but, assuming I'm not missing something major, they're unusual enough that they should just be addressed in one-off conversations).

The following simple rule can be applied to update code to be compliant with this change:
PRIOR TO THIS UPDATE
AFTER THIS UPDATE
In both cases, `args` are the positional arguments required by the simulation component's call signature, while `kwargs` are the keyword arguments that are typically stored as the class instance's attributes (i.e., these are the model parameters).

The most substantial change here is how the `Simulator` juggles random states. This is hidden from the end user, but it is worthwhile to note the changes in some detail here for those interested in how the `Simulator` internals function. Here is a high-level description of how the logic worked prior to this update, as well as a brief description of how it works after the changes made in this PR.

PRIOR TO THIS UPDATE
Before simulating the desired effect with the `add` method, the `Simulator` checks to see if the user has requested that the random state be managed in a particular way by checking the value of the `seed` parameter. If the user provided something, then the `Simulator` effectively enters a random state management routine before evaluating the model. Importantly, the random state is set during this random state management routine via the old API, `np.random.seed(...)`. Seed caching is performed as needed, depending on the type of seeding prescription requested.

AFTER THIS UPDATE
The initial logic is the same as before: the `Simulator` checks to see whether it should set the random state prior to evaluating the model. The main difference now is that the `_seed_rng` routine (the random state management routine mentioned above) returns both the (possibly updated) `seed` as well as the random number generator that has been seeded as desired. The `seed` returned is just to ensure that the `Simulator` has memory of what it was asked to do (which allows for cool stuff like making sure every baseline in a redundant group gets the same random seed, if desired), while the random number generator is passed along to the model and is ultimately used to make the random component of the model.
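
For readers who want a concrete picture of the pattern described above, here is a minimal sketch. It is not the code from this PR: `noise_component`, `seed_rng`, the `"once"` seeding mode, and the `cache` dictionary are all hypothetical stand-ins chosen for illustration. It only assumes the public `numpy` calls `np.random.seed`, `np.random.default_rng`, and `Generator.normal` / `Generator.integers`.

```python
import numpy as np


def noise_component(n_samples, scale=1.0, rng=None):
    """Hypothetical stand-in for a simulation component that draws noise."""
    # Fall back to a fresh, unseeded generator when none is supplied.
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(scale=scale, size=n_samples)


def seed_rng(seed, key, cache):
    """Hypothetical seeding routine that returns (seed, rng).

    seed  : None (no seeding), an int, or "once" (an assumed mode that
            reuses one cached seed per key).
    key   : label for whatever is being seeded, e.g. a redundant group.
    cache : dict remembering the seed drawn for each key.
    """
    if seed is None:
        return None, np.random.default_rng()
    if seed == "once":
        if key not in cache:
            # Draw a seed the first time this key is seen, then reuse it.
            cache[key] = int(np.random.default_rng().integers(2**32))
        seed_value = cache[key]
    else:
        seed_value = int(seed)
    # Hand back the request itself plus a generator seeded as requested.
    return seed, np.random.default_rng(seed_value)


# Legacy pattern: seed the global state, then draw from it implicitly.
np.random.seed(1234)
legacy = np.random.normal(size=3)

# New pattern: the caller owns a seeded generator and passes it along.
cache = {}
_, rng_a = seed_rng("once", "group-0", cache)
_, rng_b = seed_rng("once", "group-0", cache)
a = noise_component(4, rng=rng_a)
b = noise_component(4, rng=rng_b)  # same cached seed, so same draws as `a`
```

Because both generators for `"group-0"` are built from the same cached seed, `a` and `b` are identical, which mirrors the "same seed for every baseline in a redundant group" behavior mentioned above. Returning the seed alongside the generator lets the caller keep a record of what was requested without inspecting the generator's internal state.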