Explore options for weekly data #117
This seems relatively uncomplicated for a static CFR estimate, by adding automatic interval detection. It would be great if @adamkucharski or @sbfnk could confirm that this is statistically sound.

library(cfr)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(distcrete)
# summarise data by week
df = ebola1976 |>
  mutate(week = week(date)) |>
  group_by(week) |>
  summarise(
    cases = sum(cases),
    deaths = sum(deaths),
    date = first(date)
  )
# prepare discrete distribution with appropriate interval
f <- distcrete(
  name = "gamma", shape = 2.40, scale = 3.33, interval = 7
)
# edit first row to satisfy checks of regularity
df$date[1] = df$date[2] - 7
# check CFR estimate for weekly data, with interval calculated internally
cfr_static(
  data = df,
  delay_density = f$d
)
#>   severity_mean severity_low severity_high
#> 1         0.955        0.838             1
# CFR for daily data
cfr_static(
  ebola1976,
  delay_density = distcrete(
    name = "gamma", shape = 2.40, scale = 3.33, interval = 1
  )$d
)
#>   severity_mean severity_low severity_high
#> 1         0.959        0.842             1

Created on 2024-01-19 with reprex v2.0.2
I suspect a more robust approach would be to take the weekly counts and impute daily ones, then run the CFR method as-is with a delay distribution defined in terms of days (e.g. using this function from EpiEstim, albeit not one that's on CRAN yet: #76 (comment)). Or is there some functionality in EpiNow that can do this, @sbfnk? Basically, taking weekly case data and converting it to daily cases with some interpolation based on a generative model? Otherwise we have the issue of whether the discretised delay distribution above (which is defined based on time since onset) lines up with the discretised weekly counts (which are defined in arbitrary intervals relative to the underlying infection events). It seems neater (and probably easier for users to interpret the assumptions) to just get everything on the same daily scale.
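To make the "impute daily counts, then use a daily delay" idea concrete, here is a minimal base R sketch. It is not the EpiEstim function referred to above and not a generative model: `weekly_to_daily()` is a hypothetical helper that spreads each weekly total evenly over the seven days starting at that week's date, and the three weeks of counts are toy numbers made up for illustration.

```r
# Hypothetical helper (not part of {cfr} or EpiEstim): spread each weekly
# total evenly over the 7 days beginning at that week's date.
# Assumption: a uniform split, which is the crudest possible imputation.
weekly_to_daily <- function(weekly) {
  n <- nrow(weekly)
  split_evenly <- function(total) {
    # integer split that preserves the weekly total;
    # any remainder goes to the first days of the week
    base <- total %/% 7
    extra <- total %% 7
    base + as.integer(seq_len(7) <= extra)
  }
  data.frame(
    date = rep(weekly$date, each = 7) + rep(0:6, times = n),
    cases = unlist(lapply(weekly$cases, split_evenly)),
    deaths = unlist(lapply(weekly$deaths, split_evenly))
  )
}

# Toy weekly data (illustrative only), one row per week
weekly <- data.frame(
  date = as.Date("2024-01-01") + 7 * (0:2),
  cases = c(10, 23, 7),
  deaths = c(1, 8, 3)
)

daily <- weekly_to_daily(weekly)
head(daily, 8)
```

The result has the same `date`, `cases` and `deaths` columns as the daily data passed to `cfr_static()` earlier in the thread, so in principle it could be used with the `interval = 1` delay density shown there. A smoother interpolation would change the daily values but not this overall shape.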
EpiNow2 could infer daily time series from weekly (using a Gaussian Process) once this recent PR is merged - it could then also be used to estimate CFR with weekly outcome data directly. It can't currently estimate CFR where both primary and outcome data are weekly, but should be able to once (if ever) it can fit to multiple time series.
Thanks both - must have missed these comments earlier. Do we prefer to leave weekly CFR estimation to EpiNow2, or is there something {cfr} could/should add?
I think a brief section in a vignette showing how it could be done with the EpiEstim example above (or EpiNow2 if easy) would be sufficient for now, rather than building functionality for this directly into {cfr}.
This is now being addressed outside of {cfr} (i.e. doing the aggregation outside then reading in). So it's probably better kept as an applied case study, as discussed in this case study.
This issue is to explore options for delay-corrected CFR calculations on weekly data, using a discrete distribution where the interval is set correctly (to 7 days).
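As a rough illustration of what "interval set to 7 days" means for the delay distribution, the base R sketch below discretises the gamma delay used in this thread (shape 2.40, scale 3.33) into weekly and daily bins. It assumes each bin takes the probability mass between its start and its start plus the interval; that convention is an assumption made for this sketch and may not match the discretisation used by any particular package.

```r
shape <- 2.40
scale <- 3.33

# Weekly bins: mass between 7 * k and 7 * (k + 1) days, for k = 0, ..., 5
weeks <- 0:5
weekly_pmf <- pgamma(7 * (weeks + 1), shape = shape, scale = scale) -
  pgamma(7 * weeks, shape = shape, scale = scale)

# Daily bins over the same 42 days: mass between day d and day d + 1
daily_pmf <- diff(pgamma(0:42, shape = shape, scale = scale))

# Summing daily masses within each week (one matrix column per week)
# recovers the weekly masses
daily_as_weekly <- colSums(matrix(daily_pmf, nrow = 7))

all.equal(weekly_pmf, daily_as_weekly)
round(weekly_pmf, 3)
```

The two discretisations agree when the delay is measured from the start of a weekly bin. Onsets that fall mid-week are not aligned with those bins, which is the concern raised in the comments about weekly counts being defined in arbitrary intervals.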