Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update case study of hierarchical models for binary trials #214

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

mitzimorris
Copy link
Member

Updated models and notebook for Stan case study "Hierarchical Partial Pooling for Repeated Binary Trials"

  • use affine transform for non-centered hierarchical logit model
  • use include directives for data, transformed data, and generated quantities code shared across all models
  • rework Rmd notebook so that it runs with CmdStanR

@mitzimorris mitzimorris requested a review from WardBrian January 24, 2022 22:26
@mitzimorris
Copy link
Member Author

This PR updates the Rmd notebook and the HTML
A subsequent PR will break this case study into multiple parts and create Jupyter notebooks.

@WardBrian
Copy link
Member

To actually update the one people see I think there needs to be a separate PR to https://github.com/stan-dev/stan-dev.github.io, right?

I’ll take a look at this tomorrow

@mitzimorris
Copy link
Member Author

If you OK this PR, that should be sufficient review for the stan-dev.github.io pages as well.

@mitzimorris
Copy link
Member Author

can we turn off compilation for the include fragments? if so, how?

@WardBrian
Copy link
Member

I think @rok-cesnovar put it together. My guess is provably naming them something other than .stan would work, but then you lose syntax highlighting etc

@mitzimorris
Copy link
Member Author

@rok-cesnovar - the include files are in a directory named "include" - can we do something with this information?
for this case study, using includes makes a lot of sense. also, it's good to have more examples of how this works somewhere.

Copy link
Member

@WardBrian WardBrian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments on the Rmd file. The reorganization of the stan files using #include looks good

@@ -160,33 +216,27 @@ due to numerical arithmetic issues.

Assuming each player's at-bats are independent Bernoulli trials, the
sampling distribution for each player's number of hits $y_n$ is modeled as

\[
$$
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor detail, but generally using \[ \] is preferred over $$ $$. The later is a pure-TeX shorthand which can cause some issues with LaTeX macros. Not sure it matters for us, but just thought I'd note. More on Wikibooks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but only $$ works in Jupyter notebooks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being run in a jupyter notebook now? Or is the idea that it will be recreated in one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it would be good to be able to run these in jupyter, either in Python or R.

I used jupyter to update this notebook, so I hit this right off the bat.
awhile back I installed juptext on my machine and I prefer jupyter to RStudio - https://mc-stan.org/users/documentation/case-studies/jupyter_colab_notebooks_2020.html#challenge-putting-the-r-in-jupyter

With juptext and IRkernal, I can run the notebook while keeping it in .Rmd format which means that it's easy to edit via the browser of in emacs. but that's just me.

We must first instantiate a CmdStanModel object then call its <code>sample</code> method.
To do this we first call the `cmdstan_model()` function
which instantiates a CmdStanModel object from a file containing a Stan program.
Because these models contain a number of [include directives](https://mc-stan.org/docs/reference-manual/includes-section.html),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Because these models contain a number of [include directives](https://mc-stan.org/docs/reference-manual/includes-section.html),
Because these models contain a number of [include directives](https://mc-stan.org/docs/reference-manual/includes.html),

With -section it will link to the 2_27 version (at least until we fix that redirect)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be changed (I think you changed a different copy of the link which I hadn't caught)

knitr/pool-binary-trials/pool-no-pool.Rmd Outdated Show resolved Hide resolved
@rok-cesnovar
Copy link
Member

Feel free to ignore the testing errors, will fix the GHA script later this week to ignore the files you mentioned.

@WardBrian
Copy link
Member

It doesn't let me comment directly on the rendered HTML, but here are some thoughts on the way the output looks:

  • Can we hide the 'A newer version of CmdStan is available. See ?install_cmdstan() to install it.' message? If possible it would also be great to hide some (if not all) of these setup messages such as the cmdstan path
  • There's some strange formatting around the code blocks at the end. Some of the titles seem to stretch onto the next line while still in typewritertext
  • I'm not sure how the html is built from the Rmd in this case - would it be possible to get syntax highlighting of the Stan code?

@mitzimorris
Copy link
Member Author

Can we hide the 'A newer version of CmdStan is available. See ?install_cmdstan() to install it.' message? If possible it would also be great to hide some (if not all) of these setup messages such as the cmdstan path

I think this is an CmdStanR bug. I actually have CmdStan 2.28.2 installed. I also have a version of develop in my .cmdstan directory and I named that dir cmdstan-develop - CmdStanPy ignores it, but CmdStanR wanted to use that and not 2.28.2. @rok-cesnovar - what are the rules for CmdStanR?

@WardBrian
Copy link
Member

It mentions an environment variable to disable it for now

@mitzimorris mitzimorris marked this pull request as draft January 26, 2022 21:34
@mitzimorris
Copy link
Member Author

I just converted it to a draft because there's more work to be done here.

will investigate formatting issues.

it would definitely be nice to get Stan syntax highlighting into the HTML
and I'm pretty sure we can hack the .Rmd to make that happen.

@WardBrian
Copy link
Member

What program is used to ultimately create the HTML? Pandoc?

@rok-cesnovar
Copy link
Member

@rok-cesnovar - what are the rules for CmdStanR?

We sort the folder in .cmdstan, not sure why -develop would be ahead though. Will check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants