Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stan model chunks #484

Closed
wants to merge 15 commits into from
Closed

stan model chunks #484

wants to merge 15 commits into from

Conversation

sbfnk
Copy link
Contributor

@sbfnk sbfnk commented Oct 24, 2023

Testing whether we could move duplicated stan code into chunks.

I like the way it simplifies the main models and highlights commonalities. It also makes coding more fault tolerant.

On the other hand it will become even harder to trace code having to navigate, at times, to chunks that then call functions contained in other files. This could be mitigated by, e.g., making functions that are only called once chunks (e.g. the GP functions); or by moving each stan function into a separate files (though would complicate ensuring all relevant functions are loaded for all chunks) or all into one file.

In conclusion, I'm on the fence regarding whether this is a good idea. If going ahead with this there are a few bits of code that could potentially still be moved into chunks.

A few processing functions had to be updated in order to make this work. Also the stan code is probably less efficient in places - touchstone will tell us how much.

Closes #381

@jamesmbaazam
Copy link
Contributor

On the issue of navigating the code, can we use prefixes to differentiate between the stan code in ./inst/stan/functions/ directory and those in the ./inst/stan/data/ directory? We could prefix the functions with fnc_* and those in data with "data_". I find that some stan files have the same name in both folders and can be confusing when opened side by side.

@github-actions
Copy link
Contributor

This is how benchmark results would change (along with a 95% confidence interval in relative change) if dc29e32 is merged into main:

  •   :ballot_box_with_check:default: 57.6s -> 1.01m [-8.61%, +18.58%]
  •   :ballot_box_with_check:no_delays: 1.08m -> 1.09m [-16.28%, +18.26%]
  •   :ballot_box_with_check:random_walk: 18s -> 18.8s [-11.13%, +20.41%]
  •   :rocket:stationary: 38s -> 34.2s [-18.58%, -1.55%]
  •   :ballot_box_with_check:uncertain: 1.39m -> 1.43m [-7.51%, +12.21%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 24, 2023

the stan code is probably less efficient in places

That turned out to be of no concern.

@github-actions
Copy link
Contributor

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4470973 is merged into main:

  •   :ballot_box_with_check:default: 54.2s -> 54.2s [-12.67%, +12.86%]
  •   :ballot_box_with_check:no_delays: 57.8s -> 1.15m [-5.72%, +43.82%]
  •   :ballot_box_with_check:random_walk: 20.2s -> 16.8s [-54.35%, +20.53%]
  •   :ballot_box_with_check:stationary: 33.5s -> 41.8s [-14.36%, +63.99%]
  •   :ballot_box_with_check:uncertain: 1.28m -> 1.33m [-15.64%, +22.85%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@seabbs
Copy link
Contributor

seabbs commented Nov 1, 2023

I find that some stan files have the same name in both folders and can be confusing when opened side by side.

So this is intentional in order to show which data chunks link with which function chunks etc? In the viewer I look at these from the tree structure of them being in folders then provides the distinction. Interesting to hear this isn't working out for you though. I would ideally like a solution that doesn't duplicate the info in the folder label in the filename though.

@seabbs
Copy link
Contributor

seabbs commented Nov 1, 2023

In conclusion, I'm on the fence regarding whether this is a good idea. If going ahead with this there are a few bits of code that could potentially still be moved into chunks.

Yeah, I feel the same way for all the reasons you mention. I think I would much prefer doing this manually (i.e having everything in chunks and then providing R code to generate the complete models (this would need to be done at build time/whenever the chunks were modified (likely in CI to be sure)). In this mists of time I did some work on this in a branch of idbrms that never went anywhere - I think that wouldn't be that hard to revive.

The issue there would be how we maintain the code for generating the models of course...

@sbfnk
Copy link
Contributor Author

sbfnk commented Nov 7, 2023

In conclusion, I'm on the fence regarding whether this is a good idea. If going ahead with this there are a few bits of code that could potentially still be moved into chunks.

Yeah, I feel the same way for all the reasons you mention. I think I would much prefer doing this manually (i.e having everything in chunks and then providing R code to generate the complete models (this would need to be done at build time/whenever the chunks were modified (likely in CI to be sure)). In this mists of time I did some work on this in a branch of idbrms that never went anywhere - I think that wouldn't be that hard to revive.

The issue there would be how we maintain the code for generating the models of course...

Isn't there also an issue that this would require management of between-snippet dependencies, inflating complexity? Unless perhaps we did away with functions.

@seabbs
Copy link
Contributor

seabbs commented Nov 7, 2023

Isn't there also an issue that this would require management of between-snippet dependencies, inflating complexity? Unless perhaps we did away with functions.

Hmm I'm not sure how much complexity it would really add to not duplicate functions? Or do you mean something more than this?

@sbfnk
Copy link
Contributor Author

sbfnk commented Nov 7, 2023

Isn't there also an issue that this would require management of between-snippet dependencies, inflating complexity? Unless perhaps we did away with functions.

Hmm I'm not sure how much complexity it would really add to not duplicate functions? Or do you mean something more than this?

Perhaps I'm misunderstanding what you're suggesting. If we compose a model of chunks then chunks will depend on other chunks (defining functions or declaring variables) which themselves might have dependencies.

@seabbs
Copy link
Contributor

seabbs commented Nov 7, 2023

Yes but can't you define those in a list? We aren't saying we are going to have a system for composing models - just the models we already have?

@sbfnk
Copy link
Contributor Author

sbfnk commented Jan 16, 2024

I think we concluded that this wasn't a good idea so I'm closing the PR.

@sbfnk sbfnk closed this Jan 16, 2024
@sbfnk sbfnk deleted the stan-chunks branch May 3, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Put duplicated stan code in chunks
3 participants