Examples of delay estimation from the literature. #19
Comments
This is a really nice early nowcasting paper that implements a model very similar to that in
There is a censoring comment in the discussion, but it doesn't address what to do about truncation.
As a specific example, there is also Estimates of the severity of coronavirus disease 2019: a model-based analysis, which uses growth rate adjustment of naive delays (similar to what is used in Incubation Period and Other Epidemiological Characteristics of 2019 Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data). Perhaps it's even worth adding the growth rate adjustment described in Section 2.1 of the supplement (with code) to the scenarios investigated? A minimal sketch of the adjustment is below.
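Here is a minimal sketch of how I read that Section 2.1 growth rate adjustment, assuming a constant exponential growth rate r and a lognormal delay: long delays are under-sampled during growth by a factor exp(-r * x), so the likelihood reweights the delay density accordingly. Function and parameter names are mine, not the paper's.

```r
# Growth-rate-adjusted likelihood for naively sampled delays: the density of
# an observed delay x is f(x) * exp(-r * x), renormalised over all delays.
nll_growth_adjusted <- function(pars, delays, r) {
  meanlog <- pars[1]
  sdlog <- exp(pars[2])  # optimise on the log scale so sdlog stays positive
  # Normalising constant: integral of f(u) * exp(-r * u) over u > 0
  norm <- integrate(function(u) dlnorm(u, meanlog, sdlog) * exp(-r * u),
                    lower = 0, upper = Inf)$value
  -sum(dlnorm(delays, meanlog, sdlog, log = TRUE) - r * delays - log(norm))
}

# Simulate growth-biased delays by thinning long delays, then recover the truth
set.seed(1)
r <- 0.1
raw <- rlnorm(5e4, meanlog = 1.5, sdlog = 0.5)
delays <- raw[runif(length(raw)) < exp(-r * raw)]
fit <- optim(c(1, log(0.5)), nll_growth_adjusted, delays = delays, r = r)
c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))  # should be near (1.5, 0.5)
```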
@parksw3 originally had the dynamic adjustment much more front and centre in this work. We have pushed it back a bit as the simulations have got more complex, because (1) it's a bit hard to implement when growth rates are varying, and (2) if they are varying, how do you estimate them without a joint model (meaning the post-processing approach is perhaps not ideal)? The easiest thing to do is treat them as known, but then does that really help people who want to implement these methods? I think the plan is definitely to keep it in (at least in some form). I am currently investigating a simplified form of the forward correction (i.e. having the growth rate jointly estimated in the model), which should be a bit easier to compare to other approaches (maybe).
Ah, you are right. Isn't this L1 according to Shaun's framework, where the growth rate is known? L3 is explicitly not joint modelling. Do you have the likelihood for the complete model you used here written down somewhere? Or, for that matter, the code? I just had a brief browse and can only see code for the other bits of the paper. I totally agree that needing to know the growth rate/assume it is fixed is a limitation I am not really willing to accept.
I believe this is L3, which I think of as "forward looking", but the essential meaning of "conditional on initial" is the same. This involves conditioning on the time of the first event and looking at the distribution of secondary event times. If the first event time is known exactly, all the joint modelling parts cancel out, which is why you don't have any joint modelling. However, if you have interval censoring on the first event time, they no longer cancel, since the g(i) terms fall inside the integrals. This is why we use the latent variable approach in the Bayesian model: the event times are sampled, the integrals over "i" disappear, and the g(i) terms cancel out, so no joint modelling is required (a sketch of this algebra is below).
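To make the cancellation concrete, here is a sketch of the algebra as I read this comment; the notation (g for the density of primary event times, f_theta for the forward delay density, p and s for the primary and secondary event times) is mine, not anything from the paper.

```latex
% With the primary time p observed exactly, the forward-looking likelihood is
\[
L(\theta \mid p, s) \propto f_\theta(s - p),
\]
% so any g(p) terms cancel and no joint model of the epidemic curve is needed.
% With interval censoring of p on a window [p_L, p_R] it becomes
\[
L(\theta \mid s) \propto
  \frac{\int_{p_L}^{p_R} g(p)\, f_\theta(s - p)\, \mathrm{d}p}
       {\int_{p_L}^{p_R} g(p)\, \mathrm{d}p},
\]
% and the g(p) terms now sit inside the integrals, so they no longer cancel.
% Sampling p as a latent variable with prior proportional to g on the window
% recovers the simple f_\theta(s - p) term conditional on each sampled p.
```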
I think as written this screenshot is a touch unclear (but this really doesn't matter) and is more in line with the joint approach (i.e. L1). I agree that if you condition on primary events (and therefore don't model their uncertainty etc.) you can cancel the g terms and rewrite the likelihood as done in L3. I also agree it can be dropped without censoring, or when censoring is otherwise handled, though as we discussed, for longer censoring windows that is no longer trivial.
Suggestion from @sbfnk to look at Ebola Virus Disease in West Africa — The First 9 Months of the Epidemic and Forward Projections (supplement). They do a lot of distribution estimation aiming to correct for left truncation (they call this censoring, but I think it isn't, as they apply the correction to all data and not just the censored observations) and daily censoring. Their daily censoring adjustment is just shifting all the data by half a day; this seems like it should add some bias but be better than doing nothing (see the sketch below). I don't want to add more work, but perhaps we do need to investigate this as it is commonly used? I am not totally clear why they have left truncation, and it seems like right truncation would be a much, much bigger deal in their data given the state of the outbreak when this was published. Perhaps this is a mistake in the equations? I guess this approach makes sense if filtering out recent observations based on delay length, but as written it would apply to all short delays (including those far in the past), which seems incorrect. You may want to take a look at their section discussing generation time estimation @parksw3 for other work, if you haven't. I see nothing in the papers citing this that indicates any mistakes have been flagged, but lots and lots of reuse of these distribution estimates for quite "high impact" work, so if we do agree there are issues it's a good thing to discuss heavily.
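To sanity-check that intuition, here is a small sketch (all simulated, purely illustrative) comparing a naive fit on daily-censored delays with the half-day-shift adjustment; the shift roughly re-centres each observation on its censoring interval.

```r
# Simulated true delays, recorded as whole days (i.e. floor = daily censoring)
set.seed(2)
true <- rgamma(1e4, shape = 4, rate = 0.5)  # true mean is 4 / 0.5 = 8 days
obs <- floor(true)

# Naive: treat the censored integer delays as exact (0s nudged to stay positive)
fit_naive <- MASS::fitdistr(pmax(obs, 0.01), "gamma")
# Shifted: move every observation to its interval midpoint
fit_shift <- MASS::fitdistr(obs + 0.5, "gamma")

# Compare recovered means against the true mean of 8
unname(fit_naive$estimate["shape"] / fit_naive$estimate["rate"])
unname(fit_shift$estimate["shape"] / fit_shift$estimate["rate"])
```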
Example estimating the incubation period of Monkeypox with some mention of censoring but none of truncation: https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2022.27.24.2200448. It cites this COVID paper for its method details: https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2020.25.5.2000062#html_fulltext. The method details are not in the supplement (it's just the main text, so very sparse). They do a censoring correction for unknown exposure times but no daily censoring adjustment and no truncation adjustment (see stan code below). They published data, so in theory this is something we could look at as a real-world case study if we so wished (not sure we need to or should).
The Lauer paper, which we made a lot of use of early on (and late on, for that matter) as the principal incubation period estimate: https://www.acpjournals.org/doi/10.7326/M20-0504. They used "a previously described parametric accelerated failure time model (13)", which reminds me that we do need to make the point clearly that this estimation task is best thought of as time to event (i.e. a survival problem) and therefore should use methods (like we do) from that silo; a minimal sketch of that framing is below. The actual implementation they used is here: https://github.com/HopkinsIDD/ncov_incubation. Yup, they just use
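On the time-to-event framing, here is a minimal sketch of what that looks like in practice: a lognormal accelerated failure time model fit to interval-censored delays with survival::survreg. The simulated data and bounds are illustrative, not from the Lauer analysis.

```r
library(survival)

set.seed(3)
true <- rlnorm(500, meanlog = 1.6, sdlog = 0.6)  # true continuous delays
lower <- floor(true)                             # daily censoring: each delay
upper <- lower + 1                               # is only known to within a day
lower[lower == 0] <- NA  # interval2 convention: NA lower bound = left-censored

# Intercept-only AFT model: (Intercept) estimates the lognormal meanlog,
# Scale estimates the sdlog
fit <- survreg(Surv(lower, upper, type = "interval2") ~ 1, dist = "lognormal")
summary(fit)
```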
In this work by Reich et al. (https://onlinelibrary.wiley.com/doi/full/10.1111/j.1541-0420.2011.01709.x?saml_referrer) they deal with the truncation issue using an expectation-maximisation (EM) approach (that seems fine) for CFR estimation. They don't do anything about the delay they are actually using being truncated, and it appears that in general there is no functionality in
We haven't really discussed where this paper fits, which we maybe should: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0257978.
Growth rate correction being used in the wild: https://www.mdpi.com/2077-0383/9/2/538
The addition on the incubation period is nice. We've currently added this to our forward-looking (L3) approach instead of the joint approach (L2, I think?). The joint approach was horribly slow, since evaluating the integral in the denominator is a pain in stan. An issue with the incubation period approach is that it isn't quite correct, as there is an uncorrected epidemic phase bias in there (sketched below). I think it is possible to correct this, but you need to add a backcalculation of infection incidence, which we've not tried to implement. If the delay of interest is much longer than the incubation period (e.g. if looking at time to death), then the missed epidemic phase bias is hopefully negligible. But for e.g. time from onset to testing, the incubation period is likely to be longer, so the magnitude of the missed epidemic phase bias is probably larger than the epidemic phase bias we're putting lots of effort in to correct for the onset-to-testing delay.
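For what it's worth, here is how I understand the missed bias, assuming infections grow exponentially at a constant rate r and writing h for the incubation period density; this gloss is my own, not anything from the model code.

```latex
% Looking backward from symptom onset during exponential growth, the
% incubation periods attached to observed onsets are biased short:
\[
b(\tau) \propto h(\tau)\, e^{-r\tau},
\]
% so treating the incubation period as if it were h leaves a multiplicative
% e^{-r\tau} unaccounted for. The size of this missed bias scales with the
% mean of h relative to the mean of the delay being corrected, which is why
% onset-to-testing (short delay, longer incubation period) is the worst case.
```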
I looked up
Sounds like you were thinking along the same lines as @parksw3 and me! Looking forward to seeing your work on this. Also, I guess this ends up being similar to using a latent delay in an
Via @sbfnk: "Estimating the serial intervals of SARS-CoV-2 Omicron BA.4". It uses the fixed growth rate truncation adjustment approach, but with a sensitivity analysis on the growth rate (I think a method that uses a prior here would help people, if we feel like supplying it; a possible sketch is below). It also appears to additionally do right truncation adjustment on top of this, so it is a nice example of this issue for the introduction.
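As a possible shape for the "prior on the growth rate" idea, here is a hedged sketch that marginalises the growth-rate-adjusted likelihood (reusing the illustrative nll_growth_adjusted() sketch from earlier in this thread) over Monte Carlo draws from a prior on r; all names and the prior itself are illustrative.

```r
# Crude Monte Carlo marginalisation over a prior on the growth rate r
log_mean_exp <- function(x) max(x) + log(mean(exp(x - max(x))))

marginal_nll <- function(pars, delays, r_draws) {
  # Log-likelihood of the data under each prior draw of r
  ll <- vapply(r_draws, function(r) -nll_growth_adjusted(pars, delays, r),
               numeric(1))
  -log_mean_exp(ll)  # average the likelihood (not the log-likelihood)
}

r_draws <- rnorm(200, mean = 0.1, sd = 0.03)  # illustrative prior on r
fit <- optim(c(1, log(0.5)), marginal_nll, delays = delays, r_draws = r_draws)
```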
I saw this paper earlier too and thought I had already added it, but it turns out I didn't... oops. This paper also made me wonder whether we need to show somewhere in the SI that accounting for both truncation and the growth rate (i.e. applying both corrections at once) is bad.
Yeah, I agree, but perhaps we can hold off on that whilst we knock everything else into shape.
Good point. Also agree with that.
Suggested by Shaun Seaman: this paper may be useful for discussing approaches for different censoring assumptions: https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.2697
A very new study to discuss: https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(23)00005-8/fulltext. It seems to be approaching things from a fairly odd angle, but it has all the same issues from what I can see.
Applications
Serial interval and incubation period estimation for Monkeypox (also used as part of CDC reporting): https://www.medrxiv.org/content/10.1101/2022.10.26.22281516v1.full.pdf. Censoring adjustment but not right truncation adjusted. Based on EpiEstim, which itself uses coarseDataTools (https://github.com/nickreich/coarseDataTools).
coarseDataTools has a range of linked citations, with the methods, I think, coming from here: https://doi.org/10.1002/sim.3659. From reading, it makes use of two doubly censored approaches, one of which is a reduction of the other. It's frequentist and I think corresponds to our simple censoring approach, i.e. the latent approach without truncation (assuming uniform priors). They simulate using a uniform prior for the day of the primary event and a log-normal for the delay distribution, and then censor the secondary event (so no phase bias issues in their simulation). They explored diurnally (waking day) biased priors and found minimal impact. They also investigated a spiked prior and found it had more impact.
Something worth adding to our discussion is that this can be done trivially for our approach, either via brms or using stan directly, which is nice. This could be a useful way to frame our exploration of sample size.
coarseDataTools uses survival but its own code for the doubly censored model (which assumes uniform censoring). For reference, coarseDataTools has received a lot of recent usage (a minimal usage sketch is below).
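To make that usage concrete, here is a minimal sketch of a coarseDataTools doubly censored fit on simulated daily-censored data; I'm assuming the documented EL/ER/SL/SR/type data format for dic.fit() here, and the simulated numbers are purely illustrative.

```r
library(coarseDataTools)

set.seed(4)
n <- 200
E <- runif(n, 0, 50)                            # true primary event times
S <- E + rlnorm(n, meanlog = 1.6, sdlog = 0.6)  # true secondary event times

# Daily censoring: each event is only known to the day it fell in.
# type = 0 marks a doubly interval-censored observation.
dat <- data.frame(
  EL = floor(E), ER = floor(E) + 1,
  SL = floor(S), SR = floor(S) + 1,
  type = 0
)

fit <- dic.fit(dat, dist = "L")  # "L" = log-normal delay distribution
fit                              # parameter and percentile estimates
```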
Theory
Left truncation + censoring vs naive methods: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770078/
Understanding an evolving pandemic: An analysis of the clinical time delay distributions of COVID-19 in the United Kingdom: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0257978.
Estimating a time-to-event distribution from right-truncated data in an epidemic: A review of methods: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9465556/