Inputs and outputs for mapping #22

samriddhi99 · 2023-08-08T18:57:28Z

Currently the mapping takes entire scoresets as input. Of all the data in a scoreset, only the URN, target sequence, UniProt identifier, and target type are required for the mapping. Is there a more efficient way to obtain this data than requiring an entire scoreset as input?

Additionally, given the new and anticipated changes in MaveDB, a TaxID could be taken as an input to obtain additional required data.
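
To make the question concrete, here is a minimal sketch of narrowing a scoreset down to the four fields named above. It assumes the scoreset has already been loaded as a flat dictionary; the key names and the `extract_mapping_inputs` helper are hypothetical, and the real scoreset layout may nest these values differently.

```python
from typing import Any, Mapping

# Hypothetical key names; the actual scoreset layout may differ or be nested.
REQUIRED_FIELDS = ("urn", "target_sequence", "uniprot", "target_type")


def extract_mapping_inputs(scoreset: Mapping[str, Any]) -> dict[str, Any]:
    """Return only the fields the mapping needs, failing loudly if one is absent."""
    missing = [key for key in REQUIRED_FIELDS if key not in scoreset]
    if missing:
        raise KeyError(f"scoreset is missing required fields: {missing}")
    # Everything else in the scoreset (scores, metadata, ...) is dropped here.
    return {key: scoreset[key] for key in REQUIRED_FIELDS}
```

A caller could then pass the smaller dictionary to the mapping step rather than the whole scoreset.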

ahwagner · 2023-08-11T02:10:36Z

This software should take as input a target sequence, a set of variants represented on that sequence, and the sequence alphabet type (nucleic acid or amino acid).

The format for this can be specified by you.

The output format can also be specified by you, but should include:

- the mapped sequence and associated metadata (minimally the refget and refseq sequence identifiers)
- the mapping relationship (e.g. "homologous_to")
- each original variant and its mapped variant
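
Since the formats are left open, the following is only one possible shape for them, sketched with Python dataclasses. Every class and field name here (`MappingInput`, `MappingOutput`, `refget_id`, and so on) is an assumption made for illustration, not an agreed schema, and variants are represented as plain strings for simplicity.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Alphabet(str, Enum):
    """Sequence alphabet type of the target sequence."""
    NUCLEIC_ACID = "nucleic_acid"
    AMINO_ACID = "amino_acid"


@dataclass
class MappingInput:
    target_sequence: str   # the target sequence itself
    variants: list[str]    # variants represented on that sequence
    alphabet: Alphabet     # nucleic acid or amino acid


@dataclass
class MappedSequence:
    refget_id: str         # refget sequence identifier
    refseq_id: str         # refseq sequence identifier


@dataclass
class VariantPair:
    original: str          # variant as given on the target sequence
    mapped: str            # the same variant on the mapped sequence


@dataclass
class MappingOutput:
    mapped_sequence: MappedSequence
    relationship: str      # e.g. "homologous_to"
    variants: list[VariantPair] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of them.
        return json.dumps(asdict(self), indent=2)
```

For example, with placeholder values that are not real identifiers:

```python
result = MappingOutput(
    mapped_sequence=MappedSequence(refget_id="SQ.placeholder", refseq_id="NP_placeholder"),
    relationship="homologous_to",
    variants=[VariantPair(original="p.Ala1Gly", mapped="p.Ala10Gly")],
)
print(result.to_json())
```

Keeping each original variant paired with its mapped variant in a single record avoids relying on two parallel lists staying in the same order.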

samriddhi99 mentioned this issue Aug 14, 2023

Input-output format #25

Merged

jsstevenson closed this as completed May 31, 2024
