Representation of uncertainty in JSONs #386

Open
jameshadfield opened this issue Oct 25, 2019 · 1 comment
Labels
needs triage Needs triage by a Nextstrain team member

Comments

@jameshadfield

jameshadfield commented Oct 25, 2019

Currently uncertainty in a trait, e.g. location for node X, is represented in augur along the lines of:

/* traits.json */
{"X": {"location": "blue", "location_confidence": {"blue": 1.0}}}
/* v1 tree JSON */
{"strain": "X", "attr": {"location": "blue", "location_confidence": {"blue": 1.0}}}
/* v2 JSON */
{"name": "X", "node_attrs": {"location": {"value": "blue", "confidence": {"blue": 1.0}}}}

Temporal confidence is formatted slightly differently but is conceptually identical. This representation is independent of the model employed.

Importantly, if node X had location "blue" (via metadata), the output is indistinguishable from the output produced when the location was inferred as "blue" with 100% confidence.

image
For this example ☝️ all nodes would look like X above, and auspice wouldn't know whether to say "Node A: inferred as blue with 100% confidence" or "Node A: blue". This is even more problematic with tip sampling dates, where we have some code in auspice to try to guess the true meaning:

if (date && dateUncertainty && dateUncertainty[0] !== dateUncertainty[1]) {
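
The condition above treats a date as inferred only when its confidence interval has non-zero width. A minimal Python restatement of that guess, for illustration only: the function name `date_looks_inferred` and the argument shapes are assumptions, and this is not the auspice code itself.

```python
from typing import Optional, Sequence


def date_looks_inferred(date: Optional[float],
                        date_uncertainty: Optional[Sequence[float]]) -> bool:
    """Guess whether a date was inferred.

    A date with no interval, or with an interval whose two bounds
    coincide, is assumed to be known rather than inferred.
    """
    if date is None or not date_uncertainty:
        return False
    return date_uncertainty[0] != date_uncertainty[1]
```

The guess breaks down for a date that really was inferred but came out with a zero-width interval, which is the ambiguity this issue describes.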

Proposed solution

Modify augur traits and augur refine to produce output where non-inferred nodes do not have associated confidences. This will then be carried through augur export {v1,v2}. Auspice's v1->v2 JSON conversion function should implement the check above to remove confidence values for tips it believes aren't inferred.
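
A rough sketch of what this proposal would mean for a traits-style dict, assuming the set of nodes whose value came from metadata is known at that point. The helper name `drop_confidence_for_known` and its arguments are hypothetical and not part of augur.

```python
import copy


def drop_confidence_for_known(traits, known, trait="location"):
    """Return a copy of `traits` in which nodes named in `known` (value
    taken from metadata) carry no `<trait>_confidence` entry."""
    out = copy.deepcopy(traits)
    for name in known:
        out.get(name, {}).pop(f"{trait}_confidence", None)
    return out


traits = {
    "X": {"location": "blue", "location_confidence": {"blue": 1.0}},  # from metadata
    "Y": {"location": "blue", "location_confidence": {"blue": 1.0}},  # inferred
}
cleaned = drop_confidence_for_known(traits, known={"X"})
```

After this step only `Y` keeps its confidence, so a consumer could read "has a confidence entry" as "was inferred".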

jameshadfield added a commit to nextstrain/auspice that referenced this issue Oct 25, 2019
This can be much improved upon resolution of nextstrain/augur#386. See that issue for more information.
jameshadfield added a commit to nextstrain/auspice that referenced this issue Oct 25, 2019
Because of the way augur currently exports confidence values for tips, it's largely impossible to know whether a tip's trait with 100% confidence was inferred or known (i.e. defined by the metadata). Since the majority of tips for which DTA is run have data, we assume that the value is provided.

This can be much improved upon resolution of nextstrain/augur#386.
@rneher

rneher commented Oct 26, 2019

The issue I see here is that time tree confidences are not always inferred (for performance reasons). But augur refine exports raw-date, and that could be compared to the inferred date. Similarly, traits could write the input value into the JSON if it exists. I would prefer this over signalling inference through the absence of confidence values.
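
One way to picture this alternative for traits, as a sketch under stated assumptions: the `<trait>_input` key and both helper names are hypothetical, and `metadata` is assumed to map node names to their supplied values.

```python
def annotate_with_input(traits, metadata, trait="location"):
    """Copy the metadata value, where one exists, into each node under a
    hypothetical `<trait>_input` key. Confidence entries are left as-is."""
    out = {}
    for name, attrs in traits.items():
        attrs = dict(attrs)  # shallow copy so the input dict is not modified
        if name in metadata:
            attrs[f"{trait}_input"] = metadata[name]
        out[name] = attrs
    return out


def is_inferred(attrs, trait="location"):
    """A node counts as inferred when no input value was recorded for it."""
    return f"{trait}_input" not in attrs


traits = {
    "X": {"location": "blue", "location_confidence": {"blue": 1.0}},
    "Y": {"location": "blue", "location_confidence": {"blue": 1.0}},
}
annotated = annotate_with_input(traits, metadata={"X": "blue"})
```

Here the presence of confidence values says nothing about inference, so the scheme still works when confidences are skipped for performance reasons.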

@huddlej huddlej added the needs triage Needs triage by a Nextstrain team member label Jul 4, 2020