Ability to export branch labels #720
Comments
One implementation would be to look for a top-level "branch_labels" key in the node-data JSON. For instance, we could modify

    "nodes": {
      "NODE_0000549": {
        "clade_membership": "20A"
      }
    },
    "branch_labels": {
      "NODE_0000549": {
        "clade": "20A"
      }
    }

This would imply that
Which would remove the need for |
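To make the proposal concrete, here is a minimal Python sketch of how an exporter might consume such a key. It is not augur's implementation: `apply_branch_labels` is a hypothetical helper, the tip name `tip_1` is invented, and the tree is assumed to be a nested dict in the Auspice dataset style, with labels stored under `branch_attrs` -> `labels`.

```python
import json

# Node-data JSON following the structure proposed in this comment.
NODE_DATA = json.loads("""
{
  "nodes": {"NODE_0000549": {"clade_membership": "20A"}},
  "branch_labels": {"NODE_0000549": {"clade": "20A"}}
}
""")


def apply_branch_labels(tree_node, node_data):
    """Copy any labels listed for this node into branch_attrs.labels, then recurse.

    Hypothetical helper: the nested-dict tree shape is an assumption.
    """
    labels = node_data.get("branch_labels", {}).get(tree_node["name"])
    if labels:
        branch_attrs = tree_node.setdefault("branch_attrs", {})
        branch_attrs.setdefault("labels", {}).update(labels)
    for child in tree_node.get("children", []):
        apply_branch_labels(child, node_data)
    return tree_node


# "tip_1" is an invented tip with no labels of its own.
tree = apply_branch_labels(
    {"name": "NODE_0000549", "children": [{"name": "tip_1"}]},
    NODE_DATA,
)
```

In this sketch, nodes that are absent from "branch_labels" are left untouched, so existing node-data files without the key would behave as before.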
jameshadfield added a commit that referenced this issue on May 27, 2021

Previously branch labels could not be specified in data passed to `augur export v2` except for two "special cases":
(1) Mutations (stored in node-data-json -> nodes) would create branch labels "aa", if applicable.
(2) `clade_annotation` (stored in node-data-json -> nodes) was interpreted to be the "clade" branch label, and exported as such.

Here we extend the allowed node-data structure to include a top-level key `branch_labels` as described in [1]. This data is exported in the appropriate format for Auspice. This provides a way for analyses to export custom branch labels other than the two special cases described above. (Currently no augur commands can produce such data, but this will change. See [1] for more.)

[1] #720
jameshadfield added a commit that referenced this issue on May 27, 2021

Previously the `augur clades` command produced a node-data JSON which stored clade membership as the node-trait "clade_membership" and defined the basal nodes of each clade with the node-trait "clade_annotation". `augur export v2` interpreted the latter as a special case and produced a branch label with the same name.

The previous commit allowed `augur export` to be supplied node-data JSONs with a `branch_labels` structure. Here we update `augur clades` to export data in this structure, which allows the user to specify the keys to use. To preserve backwards compatibility if neither key is specified, we use the previously hardcoded key names, thus allowing workflows to complete without needing to be updated.

Closes #720
jameshadfield added a commit that referenced this issue on May 28, 2021

Previously the `augur clades` command produced a node-data JSON which stored clade membership as the node-trait "clade_membership" and defined the basal nodes of each clade with the node-trait "clade_annotation". `augur export v2` interpreted the latter as a special case and produced a branch label with the same name.

The previous commit allowed `augur export` to be supplied node-data JSONs with a `branch_labels` structure. Here we update `augur clades` to export data in this structure, which allows the user to specify the keys to use. To preserve backwards compatibility if neither key is specified, we default to trait-name="clade_membership" and label-name="clade", which will be exported from `augur export v2` correctly without needing any configuration changes.

Closes #720
jameshadfield added a commit that referenced this issue on Jun 10, 2021

Previously the `augur clades` command produced a node-data JSON which stored clade membership as the node-attr "clade_membership" and defined the basal nodes of each clade with the node-attr "clade_annotation". `augur export v2` interpreted the latter as a special case and turned it into a branch label of the same name.

The previous commit allowed `augur export` to be supplied node-data JSONs with a `branch_labels` structure. Here we update `augur clades` to export data in this structure, which allows the user to specify the keys to use via the `--attribute-name` arg.

This commit breaks backwards compatibility for pipelines as the default attribute name is "clade". This will result in dataset (auspice) JSONs with the same branch labelling as before, but with a different node-attr (was "clade_membership", now "clade"). As `augur export v2` will make colorings for all node-attrs in node-data JSONs, this will be exported as a "clade" coloring with no changes needed. However, auspice config JSONs may now refer to a non-existent "clade_membership" key.

`augur export v2` has been updated to no longer special-case `clade_membership` or `clade_annotation` node attrs. We print a warning if an auspice config JSON refers to `clade_membership` to help users update their configs.

Functional tests for `augur clades` have been added.

Closes #720
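The warning described in this commit could look roughly like the Python sketch below. This is not the code from the commit: `find_stale_clade_refs` and `warn_on_stale_clade_refs` are hypothetical names, and the sketch assumes the config keeps colorings as a list of dicts with a "key" field, filters as a list of strings, and a default coloring under `display_defaults` -> `color_by`.

```python
import warnings


def find_stale_clade_refs(config, old_key="clade_membership"):
    """Return the places in an Auspice-config-like dict that still use old_key.

    Hypothetical helper; the config layout is an assumption.
    """
    hits = []
    if any(c.get("key") == old_key for c in config.get("colorings", [])):
        hits.append("colorings")
    if old_key in config.get("filters", []):
        hits.append("filters")
    if config.get("display_defaults", {}).get("color_by") == old_key:
        hits.append("display_defaults.color_by")
    return hits


def warn_on_stale_clade_refs(config):
    """Emit a single warning listing every stale reference, if any."""
    hits = find_stale_clade_refs(config)
    if hits:
        warnings.warn(
            "config refers to 'clade_membership' in: " + ", ".join(hits)
            + "; the default attribute is now 'clade'"
        )
    return hits


# Invented example config that still uses the old key in two places.
example_config = {
    "colorings": [{"key": "clade_membership", "type": "categorical"}],
    "filters": ["region"],
    "display_defaults": {"color_by": "clade_membership"},
}
stale = find_stale_clade_refs(example_config)
```

Returning the list of locations (rather than only warning) keeps the check easy to test and lets a caller decide whether a stale reference should be a warning or an error.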
jameshadfield added a commit that referenced this issue on Jun 15, 2021

Previously branch labels could not be specified in data passed to `augur export v2` except for two "special cases":
(1) Mutations (stored in node-data-json -> nodes) would create branch labels "aa", if applicable.
(2) `clade_annotation` (stored in node-data-json -> nodes) was interpreted to be the "clade" branch label, and exported as such.

Here we extend the allowed node-data structure to include a top-level key `branches` as described in [1] and the test data added here [2]. This data is exported in the appropriate format for Auspice (unchanged). This provides a way for analyses to export custom branch labels other than the two special cases described above. Note that currently no augur commands can produce such data, but this will change - see [1] for more.

This work also induced two smaller changes. Firstly, the auspice config JSON schema is extended to allow the default branch label displayed to be any value. Secondly, the requirement for node-data JSONs to specify "nodes" has been relaxed (see [2] for an example); if neither "nodes" nor "branches" is defined then we raise a validation error.

[1] #720
[2] ./tests/functional/export_v2/branch-labels.json
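The relaxed requirement can be illustrated with a short Python sketch. It is only an illustration of the rule stated in the commit message, not the validation code in augur: `NodeDataValidationError` and `validate_node_data` are hypothetical names, and the sample label name is invented.

```python
class NodeDataValidationError(ValueError):
    """Hypothetical error type for an invalid node-data JSON."""


def validate_node_data(node_data):
    """Accept node data that defines "nodes", "branches", or both.

    Sketch of the relaxed rule only; real validation would also check
    the contents of each key.
    """
    if "nodes" not in node_data and "branches" not in node_data:
        raise NodeDataValidationError(
            'node-data JSON must define "nodes" and/or "branches"'
        )
    return node_data


# A file with only "branches" is now acceptable under this rule.
branches_only = validate_node_data(
    {"branches": {"NODE_0000549": {"labels": {"my_label": "example"}}}}
)

# A file with neither key is rejected.
try:
    validate_node_data({"generated_by": {"program": "example"}})
    rejected = False
except NodeDataValidationError:
    rejected = True
```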
jameshadfield added a commit that referenced this issue on Jun 15, 2021

Previously the `augur clades` command produced a node-data JSON which stored clade membership as the node-attr "clade_membership" and defined the basal nodes of each clade with the node-attr "clade_annotation". `augur export v2` interpreted the latter as a special case and turned it into a branch label of the same name.

The previous commit allowed `augur export` to be supplied node-data JSONs with a `branches` dictionary. Here we update `augur clades` to export data in this structure, which allows the user to specify the keys to use via the `--attribute-name` arg.

This commit breaks backwards compatibility for pipelines as the default attribute name is "clade". This will result in dataset (auspice) JSONs with the same branch labelling as before, but with a different node-attr (was "clade_membership", now "clade"). As `augur export v2` will make colorings for all node-attrs in node-data JSONs, this will be exported as a "clade" coloring with no changes needed. However, auspice config JSONs may now refer to a non-existent "clade_membership" key.

`augur export v2` has been updated to no longer special-case `clade_membership` or `clade_annotation` node attrs. We print a warning if an auspice config JSON refers to `clade_membership` to help users update their configs.

Functional tests for `augur clades` have been added.

Closes #720
jameshadfield added a commit that referenced this issue on Sep 9, 2022

Previously branch labels could not be specified in data passed to `augur export v2` except for two "special cases":
(i) AA mutations (stored in node-data-json -> nodes) would create branch labels "aa", if applicable.
(ii) `clade_annotation` (stored in node-data-json -> nodes) was interpreted to be the "clade" branch label, and exported as such.

Here we extend the allowed node-data structure to include a top-level key `branches` as described in [1] and the test data added here [2]. This data is exported in the appropriate format for Auspice (unchanged). This paves the way for pipelines to define a range of branch labels for export. Currently the only usable key in this dict is 'labels'.

If a branch label (via node-data-json -> branches -> node_name -> label) is provided for 'aa' or 'clade' then this will overwrite the values generated above (i, ii).

A side-effect of this work is that the requirement for node-data JSONs to specify "nodes" has been relaxed (see [2] for an example); however if neither "nodes" nor "branches" are defined then we raise a validation error.

[1] #720
[2] ./tests/functional/export_v2/branch-labels.json
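The overwrite rule in this commit message can be sketched in Python as follows. This is a hedged illustration rather than augur's code: `merge_branch_labels` is a hypothetical helper, the label values are placeholders, and the sketch assumes each entry under `branches` holds its labels in a 'labels' dict mapping label name to value.

```python
def merge_branch_labels(generated, node_data):
    """Merge explicitly provided labels over automatically generated ones.

    generated: {node_name: {label_name: value}} from special cases (i) and (ii).
    node_data: dict with an optional top-level "branches" key (assumed layout:
               branches -> node_name -> "labels" -> {label_name: value}).
    Returns a new dict; neither input is modified.
    """
    merged = {name: dict(labels) for name, labels in generated.items()}
    for name, branch in node_data.get("branches", {}).items():
        # Explicit labels win, including for "aa" and "clade".
        merged.setdefault(name, {}).update(branch.get("labels", {}))
    return merged


# Placeholder values for illustration only.
generated = {"NODE_0000549": {"aa": "GENE: A1B", "clade": "20A"}}
node_data = {
    "branches": {
        "NODE_0000549": {"labels": {"clade": "custom-clade", "my_label": "x"}},
        "NODE_0000001": {"labels": {"my_label": "y"}},
    }
}
merged = merge_branch_labels(generated, node_data)
```

Here the provided "clade" label replaces the generated one, the generated "aa" label survives because nothing overrides it, and a node with only custom labels gains a new entry.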
github-project-automation (bot) moved this from In Review to Done in Nextstrain planning (archived) on May 4, 2023
Currently there is no ability for `augur export v2` to export custom branch labels.

In general, `node-data.json` defines traits for nodes under a top-level "nodes" key, as a mapping from node name to `TRAIT_NAME: value` pairs, which `augur export v2` maps onto the corresponding nodes as node_attrs.

There are two "special-case" situations which are relevant here:
- if `TRAIT_NAME == "clade_annotation"` then augur will export this as a branch label rather than a node_attr. "clade_annotation" is typically produced by `augur clades` and is how we get clade labelling in most of our datasets.
- `augur export v2` automatically creates a branch label for "aa" if a node-data file with "aa_muts" is provided.

These two cases are the only times that `augur export` adds information to the `branch_attrs` of a node. This means that there is no ability for `augur export v2` to set custom branch labels for (internal) nodes.
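As a rough illustration of the behaviour described in this issue, the following Python sketch maps node-data traits onto a tree and applies the `clade_annotation` special case. It is an assumption-laden sketch, not the actual `augur export v2` code: `attach_node_data` is a hypothetical helper, the tip name `tip_1` is invented, and the exported shapes (`node_attrs` entries as `{"value": ...}` and labels under `branch_attrs` -> `labels`) are assumed.

```python
def attach_node_data(tree_node, node_data):
    """Recursively copy traits from node data onto an Auspice-style tree node.

    Hypothetical helper: every trait becomes a node_attr, except
    "clade_annotation", which becomes the "clade" branch label.
    """
    traits = node_data.get("nodes", {}).get(tree_node["name"], {})
    node_attrs = tree_node.setdefault("node_attrs", {})
    for trait_name, value in traits.items():
        if trait_name == "clade_annotation":
            # Special case: exported as a branch label, not a node_attr.
            branch_attrs = tree_node.setdefault("branch_attrs", {})
            branch_attrs.setdefault("labels", {})["clade"] = value
        else:
            node_attrs[trait_name] = {"value": value}
    for child in tree_node.get("children", []):
        attach_node_data(child, node_data)
    return tree_node


node_data = {
    "nodes": {
        "NODE_0000549": {"clade_membership": "20A", "clade_annotation": "20A"},
        "tip_1": {"clade_membership": "20A"},
    }
}
tree = attach_node_data(
    {"name": "NODE_0000549", "children": [{"name": "tip_1"}]},
    node_data,
)
```

Nothing in this flow lets a pipeline attach a label under any other name, which is the gap the issue describes.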