Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding option to include transformed variables in InferenceData #6232

Merged
merged 2 commits into from
Oct 20, 2022

Conversation

dfm
Copy link
Contributor

@dfm dfm commented Oct 20, 2022

This tiny PR adds an optional parameter to the ArviZ InferenceData backend to allow the (optional, if requested) inclusion of transformed parameters in the returned InferenceData object's posterior extension. There is discussion about building out a more formal interface for this in future versions of ArviZ (see: arviz-devs/arviz#230, arviz-devs/arviz#2056, arviz-devs/arviz#2086), but it would be a huge quality of life improvement (for me!) to have this available as an advanced user feature. I regularly use this for debugging sampling issues and for post-sampling plotting of model elements - it is an absolutely crucial component of my workflow! As far as I know, the only way to access these traces currently is to use the MultiTrace backend, which is annoying, since I'll always end up wanting to convert it to an InferenceData manually anyways.

Bugfixes / New features

  • Adds a optional argument to to_inference_data allowing the inclusion of transformed/unconstrained parameters in the returned InferenceData object

@codecov
Copy link

codecov bot commented Oct 20, 2022

Codecov Report

Merging #6232 (8b75fef) into main (7503730) will decrease coverage by 3.81%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6232      +/-   ##
==========================================
- Coverage   93.61%   89.79%   -3.82%     
==========================================
  Files         101      101              
  Lines       22120    22129       +9     
==========================================
- Hits        20707    19871     -836     
- Misses       1413     2258     +845     
Impacted Files Coverage Δ
pymc/backends/arviz.py 90.65% <100.00%> (+0.03%) ⬆️
pymc/tests/backends/test_arviz.py 99.02% <100.00%> (+0.01%) ⬆️
pymc/tests/distributions/test_bound.py 0.00% <0.00%> (-100.00%) ⬇️
pymc/tests/distributions/test_censored.py 0.00% <0.00%> (-100.00%) ⬇️
pymc/tests/distributions/test_simulator.py 0.00% <0.00%> (-99.51%) ⬇️
pymc/tests/distributions/test_truncated.py 0.00% <0.00%> (-99.49%) ⬇️
pymc/distributions/truncated.py 34.96% <0.00%> (-64.34%) ⬇️
pymc/distributions/bound.py 45.63% <0.00%> (-54.37%) ⬇️
pymc/distributions/simulator.py 34.04% <0.00%> (-53.20%) ⬇️
pymc/distributions/censored.py 92.50% <0.00%> (-7.50%) ⬇️
... and 2 more

Copy link
Contributor

@lucianopaz lucianopaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@OriolAbril
Copy link
Member

Can you also comment on arviz-devs/arviz#2086 if that would be helpful or not and why?

Will then that argument be used to indicate if the unconstrained group should be present? Or how will that work?

Also, for tracking purposes, related to #5160

@dfm
Copy link
Contributor Author

dfm commented Oct 20, 2022

Thanks @lucianopaz and @OriolAbril!

Can you also comment on arviz-devs/arviz#2086 if that would be helpful or not and why?

I think arviz-devs/arviz#2086 is good. For my workflow, it's actually easier if the transformed variables live in the posterior group (since, for example, I often want to plot the covariance between some transformed and un-transformed parameters; whichever parameters I'm sampling in), but I could certainly join those two groups. So I'd say that it would be a fine solution for me, but I'd probably always prefer to use something like what is proposed here.

Will then that argument be used to indicate if the unconstrained group should be present? Or how will that work?

I don't think this would be the right parameter (or interface) to use for the proposal in arviz-devs/arviz#2086. My proposal would be to implement the kwarg that I've added here since it's barely part of the public API, but makes something possible that it currently impossible. Then, more thought should probably be given to the best interface when implementing support for the unconstrained_posterior group (I expect this probably won't be via the awkward idata_kwargs interface!).

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While generally I'm not a fan of the InferenceDataConverter, I do understand the motiviation behind your PR, and I think the implementation is fine.

LGTM with one nitpick about test performance

pymc/tests/backends/test_arviz.py Outdated Show resolved Hide resolved
Co-authored-by: Michael Osthege <[email protected]>
@ricardoV94 ricardoV94 merged commit 13dfeb2 into pymc-devs:main Oct 20, 2022
@ricardoV94
Copy link
Member

Thanks @dfm

mattiadg pushed a commit to mattiadg/pymc that referenced this pull request Oct 21, 2022
…-devs#6232)

* Adding option to include transformed variables in InferenceData

Co-authored-by: Michael Osthege <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants