Help user pass newdata as sensible things (e.g. all strata) #213

Closed
Tracked by #238
athowes opened this issue Jul 31, 2024 · 11 comments

Comments

@athowes
Collaborator

athowes commented Jul 31, 2024

In #210 we added functionality to produce predictions (of the delay internal and natural scale parameters) via brms::prepare_prediction for any family.

There is an argument newdata as follows:

An optional data.frame for which to evaluate predictions.
If NULL (default), the original data of the model is used.
NA values within factors are interpreted as if all dummy variables of this factor are zero.
This allows, for instance, to make predictions of the grand mean when using sum coding.

Following @seabbs who IMO correctly summarises where we should go:

We can also provide some functionality to either help users extract unique data points from their data (the simple version of all strata) and potentially to grid expand this to all combinations (i.e. observed/unobserved) but I think the first pass should be... (what we have already done)

Basically, we now need to help users specify common newdata options.

Options as far as I see it are either:

  1. Helper functions that users call themselves, outside predict_delay_samples
  2. Options in predict_delay_samples, with the helper functions called inside predict_delay_samples

I probably favour 2. over 1. but could be convinced / not strong.
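
A minimal sketch of what option 1 could look like, using only base R. The helper names `observed_strata()` and `all_strata()`, their signatures, and the toy data are assumptions made for illustration, not existing package functions. The first returns the strata actually seen in the data; the second expands to every combination of the observed levels, including unobserved ones.

```r
# Hypothetical helpers: names and signatures are assumptions for illustration,
# not part of the package.

# Observed strata: one row per unique combination seen in the data.
observed_strata <- function(data, strata) {
  out <- unique(data[, strata, drop = FALSE])
  rownames(out) <- NULL
  out
}

# All strata: every combination of the observed levels, whether or not that
# combination appears in the data.
all_strata <- function(data, strata) {
  levels_list <- lapply(data[, strata, drop = FALSE], unique)
  expand.grid(levels_list, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
}

# Toy data, invented for the example.
toy <- data.frame(
  sex = c(0, 0, 1),
  region = c("a", "b", "a"),
  delay = c(2.1, 3.4, 1.2)
)

obs <- observed_strata(toy, c("sex", "region"))
full <- all_strata(toy, c("sex", "region"))
```

Under option 1 the user would pass `obs` or `full` (plus any other required columns) as newdata; under option 2 the same helpers would be called from inside the prediction function.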

@seabbs
Contributor

seabbs commented Jul 31, 2024

In terms of keeping things atomic, my preference is helper functions outside of the prediction function.

@athowes
Collaborator Author

athowes commented Aug 1, 2024

  • newdata enforced as as_latent_individual or similar
  • Having ID column on data. Maybe different to case

@athowes
Collaborator Author

athowes commented Aug 5, 2024

I do think we need to have newdata enforced as as_latent_individual.

The question for me is: say we provide functionality to generate predictions for all strata, what values should we set for the other columns of newdata?

For example, say we have a model like 1 + sex on mu and sigma. Then to make predictions we still need a newdata with columns:

  • delay_central
  • sex
  • obs_t
  • pwindow_upr
  • swindow_upr

This is a little bit confusing to me. Is there some version of these predictions which is agnostic / integrates out / ... these other variables? Say I want to know about the expected delay distribution for a particular sex. Is there a version of that which isn't a function of the observation time?

@athowes
Collaborator Author

athowes commented Aug 6, 2024

@seabbs
Contributor

seabbs commented Aug 6, 2024

Nice, that is useful. It looks like if we can plug into emmeans we can get most of the functionality a user might want much more simply.

@athowes
Collaborator Author

athowes commented Aug 6, 2024

  1. Predict for all 500 individuals, check if predictions are the same
  2. Put all non covariates to NA in newdata and run
  3. Vary non covariates in newdata and check no change in output
  4. expand.grid on all covariates... how to extract covariates. Go into brms model and extract things in the formula
  5. NA out the covariates as well. Overall prediction?
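
Steps 2 and 4 could be sketched roughly as below, in base R. This assumes a plain R formula as a stand-in for the model formula (a brms formula object is structured differently and would need its own extraction), and the helper names `covariate_names()` and `blank_non_covariates()` are hypothetical. Whether the prediction code accepts NA in these columns is exactly what step 2 would test, so this only prepares the input.

```r
# Hypothetical helpers: names are assumptions for illustration.

# Pull covariate names from the right-hand side of a plain R formula.
# The last element of a formula object is its right-hand side, for both
# one-sided (~ x) and two-sided (y ~ x) formulas.
covariate_names <- function(formula) {
  all.vars(formula[[length(formula)]])
}

# Set every column that is not a covariate to NA, leaving covariates intact.
blank_non_covariates <- function(newdata, formula) {
  other <- setdiff(names(newdata), covariate_names(formula))
  newdata[other] <- NA
  newdata
}

# Toy newdata with made-up values, using column names from this thread.
toy <- data.frame(
  delay_central = c(3, 5),
  sex = c(0, 1),
  obs_t = c(10, 12),
  pwindow_upr = c(1, 1),
  swindow_upr = c(4, 6)
)

blanked <- blank_non_covariates(toy, ~ 1 + sex)
```

The output of `covariate_names()` could then feed an `expand.grid` call over the unique values of each covariate, as in step 4.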

@seabbs
Contributor

seabbs commented Aug 7, 2024

If we do this with emmeans I am not sure we need to supply any helpers like this because it does much of this for us.

@athowes
Collaborator Author

athowes commented Aug 7, 2024

I've almost finished writing a first helper function for the new strata. I might suggest we complete adding this function, then create a new issue for interacting with emmeans. We can compare outputs from any potential emmeans implementation with this helper function.

@athowes
Collaborator Author

athowes commented Aug 7, 2024

Edit: the emmeans function is pretty good:

> emmeans::emmeans(fit_sex, specs = "sex")
 sex emmean lower.HPD upper.HPD
   0   2.02      1.95      2.08
   1   1.30      1.19      1.43

Limitations:

  • Only prediction of the mu parameter -- also need prediction of sigma
  • Have to provide some kind of fac.reduce ("a function that combines the rows of a matrix into a single vector")
    • What we want is all the samples. I think the philosophy of this package isn't very Bayesian leaning

Hence:

  • Likely would need to go into internals of emmeans to get function to do what we want
  • Wonder how similar those internals would be to what has been implemented already

Other:

  • Design of this makes me think rather than "all strata" we could move to having a helper function to generate all strata, and a helper function to predict on all combinations of the input strata

@seabbs
Contributor

seabbs commented Aug 7, 2024

I think for a first pass we can get a lot of functionality from emmeans and we should do so (i.e. just point out to it in the FAQ). I agree it's not that Bayesian but I am surprised you can't get samples out.

Once we have that in place (which is quite good coverage), I think we should think again about these strata functions (or if you have some in place we can do that sooner rather than later).

@athowes
Collaborator Author

athowes commented Aug 7, 2024

Closed as not going to do (unless it's hard to get things working with other packages).

athowes closed this as not planned on Aug 7, 2024
athowes mentioned this issue on Aug 9, 2024