
Use an incrementally updated RNG state in Model #4729

Merged: 4 commits merged into pymc-devs:v4 on Jun 2, 2021

Conversation

brandonwillard
Contributor

@brandonwillard commented May 28, 2021

This PR updates the role of Model.default_rng by adding a Model.next_rng that is incrementally updated when variables are registered.

Under this change, Model.default_rng still serves as a mutable starting point for RNG seeding, while also (softly) guaranteeing that newly added/registered random variables—that don't specify their own rng parameters—will be unique.

This PR also updates how initial values are specified, stored, and generated. Since computing initial values may require sampling, these updates were needed in order to prevent initial values from unnecessarily affecting the RNG states in confusing ways.

Closes #4728
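The registration flow described above can be caricatured in plain numpy. This is a hypothetical sketch, not the actual `Model` implementation: `TinyModel`, `register_rv`, and the counter-based seeding are illustrative stand-ins for the mechanism the PR describes.

```python
import numpy as np


class TinyModel:
    """Hypothetical plain-numpy sketch of an incrementally updated RNG
    state; this is NOT the actual PyMC3 Model code."""

    def __init__(self, seed=1234):
        self.default_rng = np.random.default_rng(seed)  # mutable seeding start
        self._seed = seed
        self._count = 0
        self.next_rng = np.random.default_rng((self._seed, self._count))

    def register_rv(self, draw_fn, rng=None):
        """Register a variable; hand it a unique rng unless one is supplied."""
        if rng is None:
            rng = self.next_rng
            self._count += 1
            # Incrementally update so the next registration sees a fresh state.
            self.next_rng = np.random.default_rng((self._seed, self._count))
        return draw_fn(rng)


m = TinyModel()
x = m.register_rv(lambda rng: rng.normal())
y = m.register_rv(lambda rng: rng.normal())
# x and y are drawn from distinct RNG states, so the two variables are
# (almost surely) distinct even though they use the same distribution.
```

Variables that pass their own `rng` bypass the update, mirroring the "don't specify their own rng parameters" carve-out above.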

@brandonwillard self-assigned this May 28, 2021
@brandonwillard linked an issue May 28, 2021 that may be closed by this pull request
@ricardoV94
Member

ricardoV94 commented May 28, 2021

I guess this also affects variables created via .dist()? We need to keep that in mind until we decide what to do with that API.

@ricardoV94
Member

Are successive rngs guaranteed to not repeat or just nearly so?

@brandonwillard
Contributor Author

I guess this also affects variables created via .dist()?

How so?

@ricardoV94
Member

NVM, I thought they were also given the model rng

@michaelosthege
Member

If we keep the uncoordinated merge of #4718, it forces me to rebase the #4696 branch, unless someone else steps in to help out.
Possibly more for emotional than technical reasons, I am not at all looking forward to doing this rebase.

Call me overreacting, but out of respect for a fellow core contributor please don't merge anything into v4 until we have discussed how to proceed as a team.

@brandonwillard
Contributor Author

Call me overreacting, but out of respect for a fellow core contributor please don't merge anything into v4 until we have discussed how to proceed as a team.

Out of respect for the core developers and anyone who might be interested in using v4, I would hope that the core developers would prioritize self-contained bug fixes like this and #4718 over large, non-critical features such as #4696.

Regardless, I won't merge this until others weigh in.

@twiecki
Member

twiecki commented May 28, 2021

I don't remember us ever locking branches (including master) for other PRs, so this would be something new. We can certainly discuss whether it's a good idea; I personally don't think it is, as it would slow us down tremendously if a single PR that we have difficulty merging could bring development to a complete halt.

We do have tools to resolve this, namely a rebase, and I don't see how it would be a particularly difficult one in this case. Plus, we will have to rebase #4696 anyway. I'm not at all saying that's on you, @michaelosthege; as you said before, the PR is done from your side, and that's totally fine.

@twiecki
Member

twiecki commented May 28, 2021

@brandonwillard So changing the rng is only to make RVs unique? I guess I'm a bit confused why they aren't unique to begin with and if we shouldn't attack the problem there?

@brandonwillard
Contributor Author

brandonwillard commented May 28, 2021

@brandonwillard So changing the rng is only to make RVs unique? I guess I'm a bit confused why they aren't unique to begin with and if we shouldn't attack the problem there?

No, not exclusively; changing the rng for each variable is also generally more consistent (e.g. it doesn't force one to use in-place Ops and it provides at least the same level of configurability/control).

Otherwise, the uniqueness in question is consistent and fundamental to general graph representations, and also how Aesara is designed: if the same Op is applied to the same inputs, the resulting nodes and their outputs are equivalent. This is the basis of "merge" optimizations, which prevent graphs from computing the same things more than once.

We could make RandomVariable nodes unique in other ways (e.g. by adding new inputs that could be used to uniquely identify them), but the points above would remain, and, because the rng solution is the design-intended approach—and addresses other issues as well—there's no need to invent anything new (that doesn't entail other improvements, of course).

Aside from all that, this lack of uniqueness can result in real bugs, because a compiled function that conflates equivalent RandomVariables—like the x and y in the example for #4728—will drop one of the arguments (definitely when on_unused_input="ignore" is set), so a user attempting to provide such a function with distinct values for each variable will find that the results are not correct.
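The merge behavior being argued here can be mimicked with a toy hash-consing cache. This is purely illustrative and is not how Aesara actually implements merging: the point is only that the same op applied to the same inputs yields one shared node, so two random variables fed the same rng input collapse into a single node, while distinct rng inputs keep them apart.

```python
class Node:
    """Toy stand-in for a graph node under a merge optimization: the same
    op applied to the same inputs is interned as one shared node."""

    _cache = {}

    def __new__(cls, op, *inputs):
        key = (op, inputs)
        if key not in cls._cache:
            cls._cache[key] = super().__new__(cls)
        return cls._cache[key]


# Same rng input: the two "random variables" merge into one node, which is
# exactly the conflation that drops one of x and y in the #4728 example.
x = Node("normal", "rng0", 0.0, 1.0)
y = Node("normal", "rng0", 0.0, 1.0)
assert x is y

# Distinct rng inputs keep the nodes distinct.
x2 = Node("normal", "rng1", 0.0, 1.0)
y2 = Node("normal", "rng2", 0.0, 1.0)
assert x2 is not y2
```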

ricardoV94 previously approved these changes May 29, 2021
@brandonwillard
Contributor Author

brandonwillard commented May 30, 2021

If we changed this so that it worked more like RandomStream (i.e. generate a new RandomState for each value of model.next_rng), we could avoid producing very interconnected graphs. This might be the better approach.

Update: it's definitely the better approach; I'll rebase this with the changes soon.
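The difference between the two approaches can be sketched with numpy generators (hypothetical stand-ins; Aesara's RandomState/RandomStream objects differ in detail): chaining derives each state from the previous one, so every variable's graph drags in its predecessors, whereas RandomStream-style generation seeds each state independently from a single source.

```python
import numpy as np

# Chained states: state k is derived from state k-1, so the graph of the
# k-th variable would be connected to all earlier ones.
chained = [np.random.default_rng(0)]
for _ in range(3):
    chained.append(np.random.default_rng(chained[-1].integers(2**63)))

# RandomStream-style: each state is spawned independently from one seed
# sequence, so the variables' graphs stay disconnected from each other.
seq = np.random.SeedSequence(0)
independent = [np.random.default_rng(child) for child in seq.spawn(4)]

draws = [rng.normal() for rng in independent]
assert len(set(draws)) == len(draws)  # distinct states, distinct draws
```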

@brandonwillard
Contributor Author

brandonwillard commented May 31, 2021

The changes I just pushed allow Model to generate new RNG states like RandomStream does, as mentioned previously.

The Model object will keep a list of the RNG states it has generated, in case we want to allow easier model re-seeding.

Also, the forced in-place RandomVariable Op creation has been removed and replaced with the intended optimization-based approach. In other words, when aesara.function is used with a mode that includes the RandomVariable in-place optimization step, it will convert each RandomVariable Op in the graph into its in-place equivalent, and allow evaluations of a compiled function to update its graph's RNG states (i.e. produce new samples each time).

The standard FAST_RUN mode includes this optimization, but, to be safe, I've created a pymc3.aesaraf.compile_rv_inplace function that will add this particular in-place optimization to an existing mode. PyMC will use this helper function wherever it needs to compile a graph, ensuring that the result produces distinct samples.
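Why the in-place step matters can be shown with a plain-numpy stand-in (hypothetical names; this is not the actual compile_rv_inplace code): a function that never writes the advanced RNG state back replays the same sample on every call, while one that updates the state in place produces a fresh draw each time.

```python
import numpy as np

# Shared "graph" RNG state, stored as a plain bit-generator state dict.
state = np.random.default_rng(123).bit_generator.state


def sample_no_inplace():
    """Reads the stored state but never writes it back: every call replays
    the same draw, like a graph compiled without the in-place rewrite."""
    rng = np.random.default_rng()
    rng.bit_generator.state = state
    return rng.normal()


def sample_inplace():
    """Writes the advanced state back ("in place"), so each evaluation of
    the compiled function yields a new sample."""
    global state
    rng = np.random.default_rng()
    rng.bit_generator.state = state
    draw = rng.normal()
    state = rng.bit_generator.state
    return draw


assert sample_no_inplace() == sample_no_inplace()  # stuck on one sample
assert sample_inplace() != sample_inplace()        # fresh samples each call
```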

@brandonwillard
Contributor Author

The current failure is pretty straightforward: Model.initial_point uses Variable.eval, which compiles a function in FAST_RUN mode and alters the RNG state in-place. Since Model.initial_point is implicitly called by pm.load_trace, the subsequent pm.sample_posterior_predictive call is starting at a different RNG state than the earlier one, and that causes the results to differ.

It's a pretty straightforward fix (i.e. don't use Variable.eval), but it's also a good time to clean up initial_point a little bit, so I'll do that as a means of fixing this.
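The failure mode can be illustrated with a numpy stand-in (none of this is the actual initial_point code): drawing the initial value from the shared stream shifts later samples, while drawing from a copy of the state leaves the stream where it was.

```python
import numpy as np

baseline = np.random.default_rng(0).normal(size=3)

# Naive initial point: sampling with the shared stream consumes a draw, so
# subsequent sampling starts from a shifted RNG state (the bug above).
rng = np.random.default_rng(0)
_initval = rng.normal()
assert not np.allclose(rng.normal(size=3), baseline)

# Side-effect-free initial point: evaluate on a copy of the state, so the
# shared stream is left untouched and later sampling is reproducible.
rng = np.random.default_rng(0)
scratch = np.random.default_rng()
scratch.bit_generator.state = rng.bit_generator.state
_initval = scratch.normal()
assert np.allclose(rng.normal(size=3), baseline)
```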

@brandonwillard force-pushed the use-distinct-rngs branch 2 times, most recently from 3fedaa2 to d279e89, on June 1, 2021 02:49
@brandonwillard force-pushed the use-distinct-rngs branch 4 times, most recently from 5a63903 to a48db14, on June 1, 2021 03:28
@brandonwillard marked this pull request as ready for review on June 1, 2021 03:32
@brandonwillard force-pushed the use-distinct-rngs branch 2 times, most recently from bb3c3c4 to 186b4f5, on June 1, 2021 04:42
Member

@ricardoV94 left a comment


Should we test that rvs can still be merged when they are in fact repeated in the graph expression?

(resolved review thread on pymc3/tests/test_distributions.py)
@brandonwillard
Contributor Author

Should we test that rvs can still be merged when they are in fact repeated in the graph expression?

No, that's very much Aesara's responsibility.

ricardoV94 previously approved these changes Jun 1, 2021
Member

@ricardoV94 left a comment


LGTM

@junpenglao
Member

testval -> initval is a breaking change that is not backward compatible; are we OK without a deprecation cycle here?

@michaelosthege
Member

michaelosthege commented Jun 1, 2021

testval -> initval is a breaking change that is not backward compatible; are we OK without a deprecation cycle here?

I agree that we should reconsider this breaking change. The value is still assigned to .tag.test_value, and "initval" may confuse users into thinking it has something to do with MCMC initialization.
If we decide to deprecate it, there should be a warning in the .dist API, similar to the warning added to the __new__ API.

Member

@michaelosthege left a comment


(see comment about breaking change above)

@brandonwillard
Contributor Author

testval -> initval is a breaking change that is not backward compatible, are we ok without a deprecation cycle here?

What's a breaking change? A deprecation warning was added, but testval is still usable for now.

Regardless, folks, don't forget that this is a new version, and that there's no practical benefit to preserving a misnomer like testval.

@twiecki
Member

twiecki commented Jun 1, 2021

I think the deprecation warning we have here is fine, but should definitely add a note in the release-notes on this change.

twiecki previously approved these changes Jun 1, 2021
@brandonwillard
Contributor Author

The value is still assigned to .tag.test_value, and "initval" may confuse users into thinking it has something to do with MCMC initialization.

The former confirms that this is not a breaking change, and the latter wouldn't be a confusion, because initval does have something to do with MCMC initialization, just like the old test values.

@brandonwillard merged commit ad9b919 into pymc-devs:v4 on Jun 2, 2021
@brandonwillard deleted the use-distinct-rngs branch June 2, 2021 15:35
Successfully merging this pull request may close these issues:

- Make sure Model RVs are distinct via their RNGs (#4728)