Skip to content

Commit

Permalink
Merge pull request #136 from signaturescience/v0.0.3
Browse files Browse the repository at this point in the history
v0.0.3 release
  • Loading branch information
vpnagraj authored Apr 24, 2024
2 parents a06dbc6 + 49c3c4d commit a7ec413
Show file tree
Hide file tree
Showing 11 changed files with 74 additions and 17 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: rplanes
Title: Plausibility Analysis of Epidemiological Signals in 'R'
Version: 0.0.2
Version: 0.0.3
Authors@R:
c(person(given = "VP",
family = "Nagraj",
Expand Down
22 changes: 22 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,25 @@
# rplanes 0.0.3

## New features

### Better handling of locations with all missing data

In the previous version of the package, if a user input signal data that included a location with all values missing the `plane_seed()` function would proceed. However, this would lead to background characteristics in the seed that could not be used in downstream algorithms (e.g., infinite range). We now trigger an error if the input data for `to_signal()` includes any locations with all values missing.

### More intuitive `plane_repeat()` behavior

The PLANES scoring includes `plane_repeat()` to implement a "repeat" algorithm (i.e., checking if the evaluated signal creates a repeat sequence longer than any previously observed in the seed). We observed that this was flagging instances where all values of the time series were the same. In this release we have adjusted the algorithm to no longer flag a constant time series as implausible.

### Weighting scheme constraints

In this release, we have introduced a new feature to constrain component weights passed to `plane_score()` at values >= 1. Before adding this constraint, we saw inconsistent behavior in some cases when weights were set a < 1. We have updated the function documentation for `plane_score()` "weights" argument to reflect this change.

## Bug fixes

### Documentation typos

This release introduces minor fixes for typos in function documentation and the README.

# rplanes 0.0.2

## New features
Expand Down
30 changes: 20 additions & 10 deletions R/planes.R
Original file line number Diff line number Diff line change
Expand Up @@ -248,7 +248,7 @@ plane_taper <- function(location, input, seed) {
#'
#' @description
#'
#' This function evaluates whether consecutive values in observations or forecasts are repeated a k number of times. This function takes in a [forecast][to_signal()] or [observed][to_signal()] object that is either from an observed dataset or forecast dataset.
#' This function evaluates whether consecutive values in observations or forecasts are repeated a k number of times. This function takes in a [forecast][to_signal()] or [observed][to_signal()] object that is either from an observed dataset or forecast dataset. Note that if a signal is contant (i.e., the same value is repeated for all time points) then the repeat component will return `FALSE`.
#'
#' @param location Character vector with location code; the location must appear in input and seed
#' @param input Input signal data to be scored; object must be one of [forecast][to_signal()] or [observed][to_signal()]
Expand Down Expand Up @@ -367,9 +367,16 @@ plane_repeat <- function(location, input, seed, tolerance = NULL, prepend = NULL
dplyr::filter(.data$n_repeats >= k) %>%
dplyr::select(-"repeat_id", -"n_repeats", -"prepend_type")

## indicator for whether or not the number of rows is > 0
## this would indicate that there are repeats
ind <- nrow(repeat_tbl) > 0
## logic to check if data is constant in the reported signal
## if so ... we cannot fairly say there is a repeat so set to FALSE
if(length(unique(tmp_seed$all_values)) == 1) {
ind <- FALSE
## if not ... look at the repeat table to determine if flag is raised
} else {
## indicator for whether or not the number of rows is > 0
## this would indicate that there are repeats
ind <- nrow(repeat_tbl) > 0
}

## return list with indicator and info
return(list(indicator = ind, repeats = repeat_tbl))
Expand All @@ -386,7 +393,7 @@ plane_repeat <- function(location, input, seed, tolerance = NULL, prepend = NULL
#' @param seed Prepared [seed][plane_seed()]
#' @param components Character vector specifying component; must be either `"all"` or any combination of `"cover"`, `"diff"`, `"taper"`, `"trend"`, `"repeat"`, `"shape"`, and `"zero"`; default is `"all"` and will use all available components for the given signal
#' @param args Named list of arguments for component functions. List elements must be named to match the given component and arguments passed as a nested list (e.g., `args = list("trend" = list("sig_lvl" = 0.05))`). Default is `NULL` and defaults for all components will be used
#' @param weights Named vector with weights to be applied; default is `NULL` and all components will be equally weighted; if not `NULL` then the length of the vector must equal the number of components, with each component given a numeric weight (see Examples)
#' @param weights Named vector with weights to be applied; default is `NULL` and all components will be equally weighted; if not `NULL` then the length of the vector must equal the number of components, with each component given a numeric weight (see Examples). Specified weights must be real numbers greater than or equal to 1.
#'
#'
#'
Expand Down Expand Up @@ -424,7 +431,7 @@ plane_repeat <- function(location, input, seed, tolerance = NULL, prepend = NULL
#'
#' ## run plane scoring with specific components and weights
#' comps <- c("cover", "taper", "diff")
#' wts <- c("cover" = 2, "taper" = 1, "diff" = 4)
#' wts <- c("cover" = 1.5, "taper" = 1, "diff" = 4)
#' plane_score(input = prepped_forecast, seed = prepped_seed, components = comps, weights = wts)
#'
#' }
Expand Down Expand Up @@ -504,14 +511,17 @@ plane_score <- function(input, seed, components = "all", args = NULL, weights =
## construct a tibble with weights for components
## if the weights argument is NULL then apply equal weights to all components
if(is.null(weights)) {
weights_tbl <-
dplyr::tibble(component = components, weight = 1)
weights_tbl <- dplyr::tibble(component = components, weight = 1)
} else {
if(any(weights < 1)) {
stop("Weights must be a real number >= 1")
}

if(!all(sort(names(weights)) == sort(components))) {
stop("Weights must be provided as a vector with all components used included by name (e.g., c('diff' = 4, 'cover' = 1))")
}
weights_tbl <-
dplyr::tibble(component = names(weights), weight = weights)

weights_tbl <- dplyr::tibble(component = names(weights), weight = weights)
}

## convert the tibble into a list
Expand Down
4 changes: 4 additions & 0 deletions R/seed.R
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,10 @@ seed_engine <- function(input, location, cut_date=NULL) {
tmp_data %>%
dplyr::pull(input$outcome)

if(all(is.na(tmp_obs))) {
stop(sprintf("The values for the signal selected are missing for all dates in the following location: %s. Cannot create seed characteristics. Remove any locations that are missing signal input data for all dates prior to seeding.", location))
}

## return max diff
max_diff <-
(tmp_obs - dplyr::lag(tmp_obs)) %>%
Expand Down
2 changes: 1 addition & 1 deletion README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ The `rplanes` package (**pl**ausibility **an**alysis of **e**pidemiological **s*
## Installation

``` r
#install.pacakges("remotes")
#install.packages("remotes")
remotes::install_github("signaturescience/rplanes", build_vignettes=TRUE)
```

Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ independently and then presented as a combined score.
## Installation

``` r
#install.pacakges("remotes")
#install.packages("remotes")
remotes::install_github("signaturescience/rplanes", build_vignettes=TRUE)
```

Expand Down
2 changes: 1 addition & 1 deletion inst/app/app.R
Original file line number Diff line number Diff line change
Expand Up @@ -145,7 +145,7 @@ server <- function(input, output, session){
## value will be set at 1 by default
## id will be weight_ component to make it easy to find / parse as input for plane_score
components() %>%
purrr::map(., function(x) numericInput(inputId = paste0("weight_",x), label = paste0("Weight: ", tcase(x)), value = 1))
purrr::map(., function(x) numericInput(inputId = paste0("weight_",x), label = paste0("Weight: ", tcase(x)), value = 1, min = 1))
})

# pass in actionBttn to module plots
Expand Down
2 changes: 1 addition & 1 deletion man/plane_repeat.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/plane_score.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions tests/testthat/test-plane-components.R
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,9 @@ test_that("plane_score handles weights", {
expect_error(plane_score(prepped_forecast, prepped_seed, components = c("diff","repeat"), weights = c("diff" = 4, "foo"= 1)))
expect_error(plane_score(prepped_forecast, prepped_seed, components = c("diff","repeat"), weights = c("diff" = 4, "cover"= 1)))

## check that weights are enforced to be >= 1
expect_error( plane_score(prepped_forecast, prepped_seed, components = c("diff","repeat"), weights = c("diff" = 0.5, "repeat"= 1)))

})

test_that("plane_trend flags known changepoints and is sensitive to changes in sig.lvl", {
Expand Down
18 changes: 18 additions & 0 deletions tests/testthat/test-seed.R
Original file line number Diff line number Diff line change
Expand Up @@ -53,3 +53,21 @@ test_that("plane_seed gets last value by date even if dates are not ordered asce

})


test_that("plane_seed handles missing data as expected", {

## create a location where all values for all dates
missing_dat <-
dplyr::tibble(date = unique(hosp$date),
location = "Bermuda Triangle",
flu.admits = NA)

## use tmp_hosp data above ... and bind in missing data
tst_signal <-
rbind(tmp_hosp,missing_dat) %>%
to_signal(., outcome = "flu.admits", type = "observed", resolution = "weeks")

## should generate a stop() with seed given one location with all missing
expect_error(plane_seed(tst_signal, cut_date = "2023-03-25"))

})

0 comments on commit a7ec413

Please sign in to comment.