Example using multidimensional coordinates to label a dimension for a parameter? #1628
Comments
I think there are no examples for now, and I don't think we even have example InferenceData objects available with more than 3 dims (counting chain and draw), which would be great for showing this kind of thing in the documentation. Do you have (and can share) or know of some public examples that we could use for that?
Getting multidimensional coordinates after creating the InferenceData should be possible and should work basically as in pure xarray; adding multidimensional coordinates directly at InferenceData creation is currently not possible (I think, but have not tried). Following this section in the xarray docs, I believe the following should work. Assuming an InferenceData with a posterior like:
then doing:
idata.posterior.coords["lat"] = (("x", "y"), lat)
# or
idata.assign_coords({"lat": (("x", "y"), lat), "lon": (("x", "y"), lon)}, groups="posterior")
# would also work if done on idata.posterior instead of idata
# using idata is convenient in itself, and it can add the same coords to multiple groups at the same time
should end up in:
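A minimal sketch of the same idea in plain xarray, since idata.posterior is an xarray Dataset. The dataset, the variable name mu, the grid sizes, and the lat/lon values are all made up for illustration; they are not taken from the issue.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Hypothetical stand-in for idata.posterior: 2 chains, 50 draws, a 3x4 grid.
posterior = xr.Dataset(
    {"mu": (("chain", "draw", "x", "y"), rng.normal(size=(2, 50, 3, 4)))},
    coords={
        "chain": np.arange(2),
        "draw": np.arange(50),
        "x": np.arange(3),
        "y": np.arange(4),
    },
)

# Made-up 2-D longitude/latitude grids, one value per (x, y) cell.
lon, lat = np.meshgrid(np.linspace(-10.0, -7.0, 4), np.linspace(40.0, 42.0, 3))

# Attach them as multidimensional, non-index coordinates on (x, y).
posterior = posterior.assign_coords(lat=(("x", "y"), lat), lon=(("x", "y"), lon))

# lat and lon now travel with every variable that has both x and y dims.
print(posterior["mu"].coords)
```

Here x and y remain the indexing coordinates; lat and lon are carried along as labels but cannot be used directly with .sel.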
I should also note that there is an ArviZ-specific downside to non-indexing coordinates: they are not available to labellers.
On this topic:
Yes, thank you. I noticed this, but I haven't worked through all of its implications yet (I saw from the issues that you are working on the docs for this). I wanted to note that xarray's swap_dims operation made it pretty easy to switch from one set of coordinates to another. In my example above, this looked like:
cmdstanpy_data.posterior.theta.swap_dims(school="school_name")
I know of a few examples with multiple indices used in Stan development; many are written up in the Stan case studies, for example the HMM model for basketball players.
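As a rough, self-contained sketch of that swap_dims call: the array below is a random stand-in for posterior.theta, and the school names are the ones commonly used with the eight schools example, assumed here rather than taken from this thread.

```python
import numpy as np
import xarray as xr

# Assumed labels; only the coordinate name "school_name" comes from the thread.
school_names = [
    "Choate", "Deerfield", "Phillips Andover", "Phillips Exeter",
    "Hotchkiss", "Lawrenceville", "St. Paul's", "Mt. Hermon",
]

rng = np.random.default_rng(0)

# Stand-in for posterior.theta: integer "school" index plus a name label.
theta = xr.DataArray(
    rng.normal(size=(4, 100, 8)),
    dims=("chain", "draw", "school"),
    coords={
        "school": np.arange(8),
        "school_name": ("school", school_names),
    },
    name="theta",
)

# Make school_name the dimension; school stays on as a non-index coordinate.
by_name = theta.swap_dims(school="school_name")

# Label-based selection now works on the names.
choate = by_name.sel(school_name="Choate")
print(choate.shape)
```

Because school_name becomes an indexing coordinate after the swap, it should also sidestep the labeller limitation mentioned above, at the cost of renaming the dimension.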
Short Description
I'm trying to fit a time-series model in which my observed data are in a ragged-array or denormalized format (one row per subject-time), and I would like to label certain dimensions with additional information: not only an observation_id, but also subject_id and time. This would apply to predicted quantities, some parameters (depending on the use case), and the observed data.
I am considering using xarray's multi-dimensional coordinates to add non-dimension coordinates to the posterior draws, but I haven't been able to put the pieces together just yet. Curious if there is an example of how to use these for inference data?
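One way this labelling could look in plain xarray is sketched below. The variable name y_hat, the dimension name obs_id, and all values are hypothetical; the point is only that subject_id and time can ride along as non-dimension coordinates on the observation dimension.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Made-up ragged layout: subject "a" has 3 rows, "b" has 2, "c" has 1.
subject_id = np.array(["a", "a", "a", "b", "b", "c"])
time = np.array([0, 1, 2, 0, 1, 0])

# Stand-in for a predicted quantity with one entry per subject-time row.
y_hat = xr.DataArray(
    rng.normal(size=(2, 20, 6)),
    dims=("chain", "draw", "obs_id"),
    coords={
        "obs_id": np.arange(6),
        "subject_id": ("obs_id", subject_id),
        "time": ("obs_id", time),
    },
    name="y_hat",
)

# Keep only the rows belonging to subject "a"; the labels are preserved.
subject_a = y_hat.where(y_hat["subject_id"] == "a", drop=True)
print(subject_a["time"].values)
```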
Aside: this is an example of this use case, one I expect is most common, but I am also thinking of other use cases for multidimensional coordinates. As an example, when fitting hierarchical models, it would be nice to label lower-level parameters with the higher-level group of which they are members.
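For the hierarchical case, a sketch under similar assumptions: a made-up district label is attached to each school, and xarray's groupby is used to summarise the lower-level parameters by their higher-level group. The names theta, school, and district are illustrative only.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Stand-in for a lower-level parameter, with a hypothetical group label.
theta = xr.DataArray(
    rng.normal(size=(2, 20, 5)),
    dims=("chain", "draw", "school"),
    coords={
        "school": np.arange(5),
        "district": ("school", ["north", "north", "south", "south", "south"]),
    },
    name="theta",
)

# Average the school-level draws within each district, per chain and draw.
district_mean = theta.groupby("district").mean("school")
print(district_mean.sizes)
```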
Code Example or link
This turned out to be a LOT easier than I thought it would be. Including here in case it helps others.
Building on the CmdStanPy example in the docs:
Now, our inference data has the original index plus the name of the school:
For example:
Relevant documentation or public examples
Please provide documentation, public examples, or any additional information which may be relevant to your question