913: Functions for PIT histograms #949
Nice, thank you! Broader thoughts/questions:
This would be my suggestion (i.e. just remove the
Whatever we do I think it deserves the same treatment as related functions (e.g.
It never existed. I meant

example_sample_continuous |>
  as_forecast_sample() |>
  get_pit_histogram(by = c("model", "target_type")) |>
  ggplot(aes(x = mid, y = density)) +
  geom_col() +
  facet_grid(target_type ~ model) +
  labs(x = "Quantile", y = "Density")

Do you think this should be replicated somewhere else?
Co-authored-by: Sebastian Funk <[email protected]>
I'm leaning slightly towards a function (I would personally have to think more than 30 seconds about how to plot this, and in order to plot this you need to know what a PIT histogram needs to look like). As an alternative I would suggest adding the code snippet to the function examples and adding a comment such as "For an example of how to plot the output, see the examples section" to the function docs.
Thinking about this I think this is a bit more complicated in the new set up because the plot function doesn't necessarily know what to plot / facet (i.e. what was passed to Alternatively we could change |
Nice!
Shall we merge it?
In |
@@ -84,6 +84,7 @@ of our [original](https://doi.org/10.48550/arXiv.2205.07090) `scoringutils` pape
- Removed `interval_coverage_sample()` as users are now expected to convert to a quantile format first before scoring.
- Function `set_forecast_unit()` was deleted. Instead there is now a `forecast_unit` argument in `as_forecast_<type>()` as well as in `get_duplicate_forecasts()`.
- Removed `interval_coverage_dev_quantile()`. Users can still access the difference between nominal and actual interval coverage using `get_coverage()`.
- `pit()`, `pit_sample()` and `plot_pit()` have been removed and replaced by functionality to create PIT histograms (`pit_histogram_sampel()` and `get_pit_histogram()`)
typo: sampel -> sample
#' @examples
#' library("ggplot2")
#'
#' example <- as_forecast_sample(example_sample_continuous)
isn't it already of class sample and so this line isn't needed?
pit_sample <- function(observed,
predicted,
n_replicates = 100) {
pit_histogram_sample <- function(observed,
noting if/when we get rid of get_ then pit_histogram and pit_histogram_sample are quite hard to distinguish.
plot_pit() +
facet_grid(target_type ~ model)
get_pit_histogram(by = c("model", "target_type")) |>
ggplot(aes(x = mid, y = density)) +
I'd personally prefer we do ship a simpler wrapper around this code (and the labs) to reduce duplication though obviously it isn't very complicated
Do you mean to reinstate plot_pit(), maybe even as S3 function that calls get_pit_histogram() and then plots? Or something else?
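For reference, a thin wrapper of the kind discussed here could look roughly like the sketch below. The name plot_pit_histogram() is hypothetical and not part of scoringutils; the only thing taken from the PR is that the input has `mid` and `density` columns, as in the snippet under review. A toy data frame stands in for the output of get_pit_histogram().

```r
library(ggplot2)

# Hypothetical helper (not part of scoringutils): plots a data frame that has
# the `mid` and `density` columns used in the snippet under review.
plot_pit_histogram <- function(pit_hist) {
  stopifnot(all(c("mid", "density") %in% names(pit_hist)))
  ggplot(pit_hist, aes(x = mid, y = density)) +
    geom_col() +
    labs(x = "Quantile", y = "Density")
}

# Toy input standing in for the output of get_pit_histogram()
toy <- data.frame(mid = seq(0.05, 0.95, by = 0.1), density = rep(1, 10))
p <- plot_pit_histogram(toy)
```

Faceting is deliberately left to the caller (e.g. p + facet_grid(target_type ~ model)), since the wrapper cannot know which grouping columns were used to compute the histogram.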
facet_wrap(~ model, nrow = 1) +
theme_scoringutils()
theme_scoringutils() +
this one uses the scoringutils theme and the other one doesn't
Description
This PR closes #913.
It adds plot_histogram_sample() pit_histogram_sample() and get_pit_histogram() functions and implements the nonrandomised PIT for count data suggested by Czado et al.

It also removes get_pit() and pit_sample() - I don't think calculating PIT values themselves is very interesting or something anyone would ever want to do but if people disagree we could keep them in.

It also removes plot_pit(). Now that densities of the histogram can be obtained directly I would think that a user will be able to make their own plot.

Checklist
lintr::lint_package() to check for style issues introduced by my changes.
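For readers unfamiliar with the nonrandomised PIT for count data mentioned in the description, here is a minimal base R sketch of the idea. It is not the implementation in this PR: the function name pit_histogram_counts() and the n_bins argument are made up for illustration, and the predictive CDF is simply estimated from the samples. For each observation x, the conditional PIT CDF is 0 below P(x - 1), 1 above P(x) and linear in between; averaging over observations and differencing at the bin edges gives the histogram.

```r
# Nonrandomised PIT histogram for count data, computed from predictive samples.
# `observed`: integer vector of length n.
# `predicted`: n x S matrix of predictive samples (one row per observation).
# Function name and `n_bins` are hypothetical, chosen for this sketch.
pit_histogram_counts <- function(observed, predicted, n_bins = 10) {
  stopifnot(is.matrix(predicted), nrow(predicted) == length(observed))
  # Empirical predictive CDF at x and at x - 1 for every observation
  p_x   <- rowMeans(predicted <= observed)
  p_xm1 <- rowMeans(predicted <= observed - 1)
  width <- p_x - p_xm1

  # Mean conditional PIT CDF at u: 0 below P(x - 1), 1 above P(x),
  # linear in between (a step at P(x) if the observation has zero mass)
  mean_cdf <- function(u) {
    f <- ifelse(width > 0, (u - p_xm1) / width, as.numeric(u >= p_x))
    mean(pmin(pmax(f, 0), 1))
  }

  breaks <- seq(0, 1, length.out = n_bins + 1)
  cum <- vapply(breaks, mean_cdf, numeric(1))
  cum[1] <- 0  # all mass lies in [0, 1]
  data.frame(
    mid = (breaks[-1] + breaks[-length(breaks)]) / 2,
    density = diff(cum) * n_bins
  )
}

# Toy example: Poisson observations with well-calibrated Poisson forecasts
set.seed(1)
n <- 200
observed <- rpois(n, lambda = 5)
predicted <- matrix(rpois(n * 500, lambda = 5), nrow = n)
hist_df <- pit_histogram_counts(observed, predicted)
```

With calibrated forecasts as in the toy example, the densities should be roughly flat around 1.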