
Reducing model space based on the number of models that satisfy hereditary constraints #88

Open
merliseclyde opened this issue Nov 21, 2024 · 0 comments
Labels: CRAN Note (Notes on cran-checks to address), enhancement, work in progress

Owner

merliseclyde commented Nov 21, 2024

Currently, algorithms in BAS allocate storage for n.models models, where n.models is taken from user input or, if enumeration is feasible, set equal to 2^p (BAS, deterministic, MCMC+BAS). This is reduced if variables are forced to always be included, e.g. include.always = ~ X1 + X2 + X1:X2.
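As a rough illustration of the allocation arithmetic (a sketch only, not BAS code; the function name and arguments are hypothetical, and it assumes the user-supplied value simply acts as an upper limit), forcing k of the p candidate terms into every model leaves 2^(p-k) models:

```python
def n_models_upper_bound(p, n_forced=0, user_cap=None):
    """Number of models to allocate storage for (illustrative only).

    p        : number of candidate terms, excluding the intercept
    n_forced : number of terms forced into every model
    user_cap : optional user-supplied limit on the number of models
    """
    if not 0 <= n_forced <= p:
        raise ValueError("n_forced must be between 0 and p")
    # Each term that is not forced in is either included or excluded.
    full = 2 ** (p - n_forced)
    return full if user_cap is None else min(full, user_cap)


# Terms X1, X2, X1:X2 with all three forced in: one model remains.
print(n_models_upper_bound(3, n_forced=3))  # 1
# Same three terms with nothing forced in: full enumeration.
print(n_models_upper_bound(3))              # 8
```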

For models with factors or orthogonal polynomials, the usual practice is to include higher-order terms only if their lower-order "parents" are included in the model. These constraints are imposed in the sampling algorithms BAS and MCMC (but not in the deterministic, MCMC+BAS or AMCMC search mechanisms).
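To make the constraint concrete, here is a minimal sketch that checks whether a model respects the parent rule and counts the valid models by brute force. It assumes terms are written as colon-separated labels such as "X1:X2" and that a term's parents are the terms obtained by dropping one factor; the helper names are hypothetical and are not taken from BAS.

```python
from itertools import combinations, product


def parents(term):
    """Immediate lower-order parents of a term such as 'A:B:C'."""
    factors = term.split(":")
    if len(factors) == 1:
        return []  # main effects have no parents
    return [":".join(c) for c in combinations(factors, len(factors) - 1)]


def satisfies_heredity(model):
    """True if every included term has all of its parents included."""
    included = set(model)
    return all(p in included for t in model for p in parents(t))


def count_by_enumeration(terms):
    """Count the subsets of `terms` that satisfy the parent rule."""
    count = 0
    for mask in product([False, True], repeat=len(terms)):
        model = [t for t, keep in zip(terms, mask) if keep]
        if satisfies_heredity(model):
            count += 1
    return count


# Only 5 of the 2^3 = 8 subsets of {X1, X2, X1:X2} are valid.
print(count_by_enumeration(["X1", "X2", "X1:X2"]))  # 5
```

Brute-force enumeration like this is only practical for a handful of terms, since it visits all 2^p subsets.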

Counting models under hierarchical constraints has been added to bas.lm via n.models = count.heredity.models(mf, n.models), with a unit test in test-interactions.R. The function does not count all models under heredity constraints, as that becomes expensive for a large number of factors and higher orders of interaction; instead it stops at higher orders once the count exceeds the pre-specified cap on the number of models to sample.
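The idea of giving up once the cap is reached can be sketched as follows. This is not the algorithm in count.heredity.models: it swaps in a simple depth-first enumeration with early exit, assumes colon-separated term labels, and uses hypothetical function names.

```python
from itertools import combinations


def parents(term):
    """Immediate lower-order parents of a term such as 'A:B:C'."""
    factors = term.split(":")
    if len(factors) == 1:
        return []
    return [":".join(c) for c in combinations(factors, len(factors) - 1)]


def capped_heredity_count(terms, cap):
    """Count heredity-respecting models, stopping once `cap` is reached."""
    # Visit lower-order terms first so parents are decided before children.
    ordered = sorted(terms, key=lambda t: t.count(":"))
    count = 0

    def visit(i, included):
        nonlocal count
        if count >= cap:
            return  # cap reached: abandon the rest of the search
        if i == len(ordered):
            count += 1  # every term has been decided: one valid model
            return
        term = ordered[i]
        visit(i + 1, included)  # leave the term out
        if all(p in included for p in parents(term)):
            visit(i + 1, included | {term})  # put the term in

    visit(0, frozenset())
    return count


terms = ["A", "B", "C", "A:B", "A:C", "B:C", "A:B:C"]
print(capped_heredity_count(terms, cap=100))  # 19, the exact count
print(capped_heredity_count(terms, cap=10))   # 10, stopped at the cap
```

The returned value is the smaller of the exact count and the cap, so it can be used directly as the number of models to allocate.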

However, this does not catch the following cases:

  • factors and terms that are always included in the model via include.always
  • terms from orthogonal polynomials. For the latter, count.heredity.models does not expand the function poly based on the degree and hence underestimates the number of models. Since they are orthogonal polynomials, the case could be made that the hierarchical constraint could be dropped, but it is critical that the number of models reflects the basis, as in the function make.parents.of.interactions.
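The size of the undercount for polynomial terms can be illustrated with a small sketch. It assumes each poly(x, d) term contributes d basis columns and, for simplicity, that there are no interactions between terms; the function names and column labels are hypothetical and are not BAS code.

```python
import re


def expand_poly(term):
    """Expand 'poly(x, d)' into d basis-column labels; other terms pass through."""
    m = re.fullmatch(r"poly\(\s*(\w+)\s*,\s*(\d+)\s*\)", term)
    if m is None:
        return [term]
    var, degree = m.group(1), int(m.group(2))
    return [f"poly({var}, {degree}){k}" for k in range(1, degree + 1)]


def count_models(terms, hierarchical=True):
    """Count models over the expanded basis (no interactions between terms)."""
    total = 1
    for term in terms:
        d = len(expand_poly(term))
        # Hierarchical: keep columns 1..k for some k in 0..d, so d + 1 choices.
        # Unconstrained: any subset of the d columns, so 2^d choices.
        total *= (d + 1) if hierarchical else 2 ** d
    return total


terms = ["poly(x, 3)", "z"]
print(2 ** len(terms))                           # 4: poly treated as one term
print(count_models(terms, hierarchical=True))    # 8: degree 0..3, times z in/out
print(count_models(terms, hierarchical=False))   # 16: any subset of 4 columns
```

Whichever convention is adopted, the count has to be taken over the expanded columns; treating poly(x, 3) as a single in-or-out term gives 4 models here, fewer than either expanded count.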

Implementing this should permit eliminating the use of SETLENGTH in lm_sampleworep.c and glm_sampleworep.c.

merliseclyde added the enhancement, work in progress, and CRAN Note (Notes on cran-checks to address) labels Nov 21, 2024
merliseclyde added a commit that referenced this issue Nov 26, 2024