-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review scoringutils 2.0.0 #791
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think this package has much improved in 2.0.0. In particular, despite my comment on this later, I think the "happy path" is getting clearer and clearer, which is definitely the right direction.
I've left numerous comments but would recommend focus on changes impacting the user interface, so that breaking changes are grouped as much as possible in 2.0.0. Internal changes can wait the next minor version.
Package Review
- Version tested: aed70b8
Documentation
The package includes all the following forms of documentation:
- ☒ A statement of need: clearly stating problems the software is designed to solve and its target audience in README
- ☒ Installation instructions: for the development version of package and any non-standard dependencies in README
- ☒ Vignette(s): demonstrating major functionality that runs successfully locally
- ☒ Function Documentation: for all exported functions
- ☒ Examples: (that run successfully locally) for all exported functions
- ☒ Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with
URL
,BugReports
andMaintainer
(which may be autogenerated viaAuthors@R
).
Functionality
-
☒ Installation: Installation succeeds as documented.
-
☒ Functionality: Any functional claims of the software have been confirmed.
-
☒ Performance: Any performance claims of the software have been confirmed.
-
☒ Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
-
☒ Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
Estimated hours spent reviewing: 6
Review Comments
Short-term
Class structure
-
I see it has been discussed a bit in Discussion: should we require users to call
as_forecast()
beforescore()
? #507 but I am uncomfortable with the fact we re-validate the class in theprint()
method. Not just because it’s inefficient and non-standard, but mostly because it may be the symptom of a deeper issue. Once it has been created, we should ensure (to the best of our ability) that it remains valid no matter what. And the moment it stopps being valid, we should throw an error immediately, not wait for theprint()
method to be called.This is an inherent weakness of S3 classes, which are loosely defined, but we can add safeguards. This could take the form of a custom
[
method which ensures crucial columns are never dropped. -
I wonder if there should be a
forecast
abstract class, from whichforecast_binary
,forecast_point
, etc. would inherit. In the docs and exported functions (e.g.,is_forecast()
,validate_forecast()
), things appear as if there is such as class, and the actual classforecast_binary
,forecast_point
, etc. as not so visible. -
On an unrelated note, should functions such as
sample_to_quantile()
be converted to use the class infrastructure? They would become something likeas.forecast_sample.forecast_quantile()
NAMESPACE
I believe scoringutils NAMESPACE
is much larger than what it should be, which makes it difficult to approach and maintain. We probably need to have a longer discussion about this but some early thoughts: - should the low-level functions really be exported? There is no workflow in the vignettes presenting when it would make sense to use them directly. - functions such as run_safely()
should definitely not be exported IMO since they don’t fit with scoringutils core functionality. Instead, we should encourage users to use the tools they are already familiar for this kind of work beyond scoringutils scope, such as tryCatch()
or purrr::safely()
. - assert_forecast_generic()
is marked as internal, but also exported
git repository
Am I correct that most figures in man/figures/
are no longer used? If so, they should be removed.
Middle-term
Happy path
I was unsure if this should be a short-term or middle-term issue. My main concern with putting it in middle-term is that it will result in breaking changes, which always are a pain, both for users and developers. At the same time, they may require some user feedback, and will likely require some time to implement, so maybe it shouldn’t delay the 2.0.0 release. Some of it has been discussed in the NAMESPACE
section but this section mentions other situations as well.
In the same line of thought as #507, it is probably good to have a clearer “happy path” / “recommended workflow”. Options that differ from this path should be removed in many cases to keep the user on the happy path, reduce information overload, and facilitate maintenance. In general, there should not be two nearly identical ways to perform the same task.
For example, in summarize_score()
, the by
and across
arguments are redundant, which is confusing for the user, and forces the developers to jump through extra input checking hoops. Since the transformation of one to the other, I would recommend being more opinionated, and having a single, well-documented option.
Suppressing output
scoringutils codebase contains a lot of try()
, suppressMessages()
, suppressWarnings()
. This is likely the symptoms that a refactoring or redesign is necessary. In particular, one potential cause can be the lack of delineation between internal and external functions, as discussed in the earlier NAMESPACE
section.
try()
, suppressMessages()
, suppressWarnings()
should usually be a last resort, and not a common pattern in the codebase. Since they affect all conditions, not just conditions created by scoringutils, they can also hide deeper issues, and make debugging harder.
Long-term
Scaffolding
- The Metrics R package doesn’t seem actively maintained and may present a risk to the long-term sustainability of the project. From what I can tell, we only use it for relatively straightforward operations so I wonder if it’s really worth depending on it. A potential strong argument for having such a dependency is if it provided a significant speed boost by using C/C++ or Rust, especially for operations used in loops. But Metrics uses pure R so provides limited value-add in our case.
- Conversely, I would probably recommend depending on the MatrixStats package. It’s a well-established package by an active member of the R community. It simple, clearly-named, and more efficient (written in C) replacement for the various
apply()
calls throughout the codebase.
Plots
I believe I mentioned it in the past but I am still uncomfortable with the plot_heatmap()
function. The output here should be a table, not a plot. This comes with accessibility issues, as the values are no longer accessible to screen readers, and it’s harder to copy-paste the values. There are good options to have tables with coloured cells today, and it should probably be the way to go.
R/summarise_scores.R
Outdated
#' @param by Character vector with column names to summarise scores by. Default | ||
#' is `model`, meaning that there will be one score per model in the output. | ||
#' @param across Character vector with column names to summarise scores | ||
#' across (meaning that the specified columns will be dropped). This is an | ||
#' alternative to specifying `by` directly. If `across` is set, `by` will be | ||
#' ignored. If `across` is `NULL` (default), then `by` will be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be more opinionated here? See the section "Happy path" in my general review comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so but @nikosbsse I know finnds this very useful as a user so hard to say
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbfnk what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm all for opinionated paths
Nice still digesting but
I have a PR for you #790 |
This is great, thanks so much! |
Yes I agree. |
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Co-authored-by: Hugo Gruson <[email protected]>
Summary of action items based on the discussions (in addition to what's mentioned in the general review comments:
|
* correct orcid statements in DESCRIPTION * Automatic readme update [ci skip] * update docs * revert suggested change * update test * make import explicit * update import * delete unnecessary comment --------- Co-authored-by: GitHub Action <[email protected]>
uhm... I tried splitting up the already accepted changes into a new PR. But when I merged this, github was smart enough to see that the latest commit was the same as this one so this PR got "merged" as well and is closed now. Sorry! |
@Bisaloo, you mentioned this:
I created an issue for |
This is a full package review, following the process documented in https://epiverse-trace.github.io/blueprints/code-review.html#full-package-review.
It allows us to use commit suggestions, and annotate the code directly by hijacking GitHub PR review infrastructure.
Once this is done, this PR can be updated to target
main
to directly integrate the changes done it tomain
.