Representation of uncertainty in JSONs #386

jameshadfield · 2019-10-25T00:39:02Z

Currently uncertainty in a trait, e.g. location for node X, is represented in augur along the lines of:

/* traits.json */
{X: {location: "blue", location_confidence: {blue: 1.0}}}
/* v1 tree JSON */
{strain: "X", attr: {location: "blue", location_confidence: {blue: 1.0}}}
/* v2 JSON */
{name: "X", node_attrs: {location: {value: "blue", confidence: {blue: 1.0}}}}

Temporal confidence is formatted slightly differently, but is conceptually identical. This is independent of the model employed.
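To make the relationship between the v1 and v2 shapes above concrete, here is a minimal sketch of mapping one trait from the v1 layout to the v2 layout. The function name `v1TraitToV2` is hypothetical and this is not auspice's actual conversion code; it only assumes the field names shown in the snippets above.

```javascript
// Hypothetical converter for a single trait (illustration only).
// v1: {strain, attr: {<trait>, <trait>_confidence}}
// v2: {name, node_attrs: {<trait>: {value, confidence}}}
function v1TraitToV2(v1Node, trait) {
  const attr = v1Node.attr || {};
  const v2Attr = { value: attr[trait] };
  const confidence = attr[`${trait}_confidence`];
  // Only carry the confidence over when the v1 node actually has one.
  if (confidence !== undefined) v2Attr.confidence = confidence;
  return { name: v1Node.strain, node_attrs: { [trait]: v2Attr } };
}

const v1Node = {
  strain: "X",
  attr: { location: "blue", location_confidence: { blue: 1.0 } },
};
const converted = v1TraitToV2(v1Node, "location");
console.log(JSON.stringify(converted));
```

Note that nothing in either shape records whether "blue" came from metadata or from inference, which is the problem described next.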

Importantly, if node X had location "blue" (via metadata), then the output is indistinguishable from what would be produced if "blue" had been inferred with 100% confidence.

For this example ☝️ all nodes would look like X above, and auspice wouldn't know whether to say "Node A: inferred as blue with 100% confidence" or "Node A: blue". This is even more problematic with tip sampling dates, where we have some code in auspice to try to guess the true meaning:

if (date && dateUncertainty && dateUncertainty[0] !== dateUncertainty[1]) {
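As a self-contained illustration of that guess, the condition quoted above can be wrapped in a small predicate. The name `isDateInferred` is hypothetical and is not part of auspice; the sketch only restates the quoted condition.

```javascript
// Hypothetical helper (not auspice's API): treats a date as inferred only
// when a confidence interval is present and its two bounds differ.
function isDateInferred(date, dateUncertainty) {
  return Boolean(
    date && dateUncertainty && dateUncertainty[0] !== dateUncertainty[1]
  );
}

console.log(isDateInferred(2019.5, [2019.5, 2019.5])); // false: zero-width interval
console.log(isDateInferred(2019.5, [2019.2, 2019.8])); // true: bounds differ
console.log(isDateInferred(2019.5, undefined)); // false: no interval at all
```

The weakness of this heuristic is that a zero-width interval is only a proxy for "known from metadata"; it cannot tell a known date apart from one inferred with no spread.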

Proposed solution

Modify augur traits and augur refine to produce output where non-inferred nodes do not have associated confidences. This will then be carried through augur export {v1,v2}. Auspice's v1->v2 JSON conversion function would implement logic like the code above to remove confidence values for tips it believes aren't inferred.
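The intended output shape can be sketched as follows, using v2-style nodes. This is only an illustration of the proposal, not augur or auspice code: the function name `dropConfidenceForKnownNodes` and the `knownTraits` lookup (node name to metadata-provided value) are assumptions made for the example.

```javascript
// Hypothetical sketch: `knownTraits` is an assumed lookup of values that came
// from metadata; it is not part of any existing augur output.
function dropConfidenceForKnownNodes(nodes, trait, knownTraits) {
  return nodes.map((node) => {
    const attr = node.node_attrs[trait];
    const known = Object.prototype.hasOwnProperty.call(knownTraits, node.name);
    if (!attr || !known) return node; // inferred or missing: leave untouched
    // Known from metadata: keep the value, omit the confidence entry.
    const { confidence, ...rest } = attr;
    return { ...node, node_attrs: { ...node.node_attrs, [trait]: rest } };
  });
}

const nodes = [
  { name: "X", node_attrs: { location: { value: "blue", confidence: { blue: 1.0 } } } },
  { name: "Y", node_attrs: { location: { value: "blue", confidence: { blue: 1.0 } } } },
];
// Only X has a metadata-provided location in this example.
const cleaned = dropConfidenceForKnownNodes(nodes, "location", { X: "blue" });
console.log(JSON.stringify(cleaned));
```

With this shape, a consumer could report "Node X: blue" when no confidence is present and "Node Y: inferred as blue with 100% confidence" otherwise.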



Given how augur currently exports confidence values for tips, it is largely impossible to know whether a tip's trait with 100% confidence is inferred or known (i.e. defined by the metadata). Since the majority of tips for which DTA is run have data, we assume that the value is provided. This can be much improved upon resolution of nextstrain/augur#386.

rneher · 2019-10-26T10:36:33Z

The issue I see here is that time tree confidences are not always inferred (for performance reasons). But augur refine exports raw-date, and that could be compared to the inferred date. Similarly, augur traits could write the input value into the JSON if it exists. I would prefer this over signaling inference through the absence of confidence values.
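A minimal sketch of this alternative, assuming the input value is stored next to the inferred one. The field name `raw_value` and the function `wasInferred` are hypothetical; the thread does not specify how the input value would be named in the JSON.

```javascript
// Hypothetical: `raw_value` is an assumed field holding the metadata-provided
// input, written only when such an input exists.
function wasInferred(attr) {
  // No recorded input, or an input that differs from the final value
  // (e.g. an ambiguous date), is treated as inferred.
  return attr.raw_value === undefined || attr.raw_value !== attr.value;
}

console.log(wasInferred({ value: "blue", raw_value: "blue" })); // false
console.log(wasInferred({ value: "blue" })); // true
console.log(wasInferred({ value: "2019-06-12", raw_value: "2019-XX-XX" })); // true
```

Under this scheme the decision does not depend on confidence values being present, so it would still work when time tree confidences are skipped for performance reasons.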

jameshadfield added a commit to nextstrain/auspice that referenced this issue Oct 25, 2019

interpret certain trait values on tips as known not inferred

6b9b5c0

This can be much improved upon resolution of nextstrain/augur#386. See that issue for more information.

huddlej added the "needs triage" label on Jul 4, 2020
