Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

359: Check PIT #916

Merged
merged 9 commits into from
Sep 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 5 additions & 15 deletions R/pit.R
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@
#' forecasts \eqn{F_t} are said to be ideal if \eqn{F_t = G_t} at all times t.
#' In that case, the probabilities \eqn{u_t} are distributed uniformly.
#'
#' In the case of discrete outcomes such as incidence counts,
#' In the case of discrete nonnegative outcomes such as incidence counts,
#' the PIT is no longer uniform even when forecasts are ideal.
#' In that case a randomised PIT can be used instead:
#' \deqn{
Expand All @@ -37,19 +37,6 @@
#' of k. If \eqn{P_t} is the true cumulative
#' probability distribution, then \eqn{u_t} is standard uniform.
#'
#' The function checks whether integer or continuous forecasts were provided.
#' It then applies the (randomised) probability integral and tests
#' the values \eqn{u_t} for uniformity using the
#' Anderson-Darling test.
#'
#' As a rule of thumb, there is no evidence to suggest a forecasting model is
#' miscalibrated if the p-value found was greater than a threshold of p >= 0.1,
#' some evidence that it was miscalibrated if 0.01 < p < 0.1, and good
#' evidence that it was miscalibrated if p <= 0.01. However, the AD-p-values
#' may be overly strict and there actual usefulness may be questionable.
#' In this context it should be noted, though, that uniformity of the
#' PIT is a necessary but not sufficient condition of calibration.
#'
#' @param n_replicates The number of draws for the randomised PIT for
#' discrete predictions. Will be ignored if forecasts are continuous.
#' @inheritParams ae_median_sample
Expand Down Expand Up @@ -78,6 +65,9 @@
#' plot_pit(pit)
#' @export
#' @references
#' Claudia Czado, Tilmann Gneiting Leonhard Held (2009) Predictive model
#' assessment for count data. Biometrika, 96(4), 633-648.
#
#' Sebastian Funk, Anton Camacho, Adam J. Kucharski, Rachel Lowe,
#' Rosalind M. Eggo, W. John Edmunds (2019) Assessing the performance of
#' real-time epidemic forecasts: A case study of Ebola in the Western Area
Expand Down Expand Up @@ -194,7 +184,7 @@ get_pit.forecast_quantile <- function(forecast, by, ...) {
forecast[, quantile_coverage := (observed <= predicted)]
quantile_coverage <-
forecast[, .(quantile_coverage = mean(quantile_coverage)),
by = c(unique(c(by, "quantile_level")))]
by = c(unique(c(by, "quantile_level")))]
quantile_coverage <- quantile_coverage[order(quantile_level),
.(
quantile_level = c(quantile_level, 1),
Expand Down
20 changes: 5 additions & 15 deletions R/plot.R
Original file line number Diff line number Diff line change
Expand Up @@ -494,21 +494,21 @@ plot_pit <- function(pit,

# use breaks if explicitly given, otherwise assign based on number of bins
if (!is.null(breaks)) {
plot_quantiles <- breaks
plot_quantiles <- unique(c(0, breaks, 1))
} else if (is.null(num_bins) || num_bins == "auto") {
# automatically set number of bins
if (type == "sample-based") {
num_bins <- 10
width <- 1 / num_bins
plot_quantiles <- seq(width, 1, width)
plot_quantiles <- seq(0, 1, width)
}
if (type == "quantile-based") {
plot_quantiles <- unique(pit$quantile_level)
plot_quantiles <- unique(c(0, pit$quantile_level, 1))
}
} else {
# if num_bins is explicitly given
width <- 1 / num_bins
plot_quantiles <- seq(width, 1, width)
plot_quantiles <- seq(0, 1, width)
}

# function for data.frames
Expand All @@ -518,21 +518,11 @@ plot_pit <- function(pit,

# quantile version
if (type == "quantile-based") {
if (num_bins == "auto") {
} else {
width <- 1 / num_bins
plot_quantiles <- seq(width, 1, width)
}

if (!is.null(breaks)) {
plot_quantiles <- breaks
}

hist <- ggplot(
data = pit[quantile_level %in% plot_quantiles],
aes(x = quantile_level, y = pit_value)
) +
geom_col(position = "dodge") +
geom_col(position = "dodge", colour = "grey") +
facet_wrap(formula)
}

Expand Down
2 changes: 1 addition & 1 deletion man/get_metrics.forecast_point.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 3 additions & 14 deletions man/pit_sample.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading