#447 - implement babelfish for VCF #660
Minor comment about a comment. Otherwise, looks great. Thanks!
… that shadows builtin
…apper from class. People can do that themselves and control whether they want validation etc
Hi @reece, are you able to review? Afterwards I'll start on the parser and submit it as a separate PR.
…lt. Add to test case ability to test alternative VCFs that go to same g. HGVS
LGTM. Thanks @davmlaw. Sorry for the long delay.
#447 - Implement hgvs babelfish for VCF
Also fixes
#616 - Babelfish hardcoded GRCh38
I changed the test expected VCF results - my reasoning follows:
NC_000006.12:g.49949407=
Old: None
New: ("6", 49949407, "A", ".", 'identity')
Notes: As we can now load VCFs, the conversion is symmetrical (i.e. you can convert to VCF and back again without change). In VCF you can represent an identity (no change) either with '.' as the alt or by repeating the ref. I went with '.', but feel free to change it.
The other changes involve VT parsimony rules - https://genome.sph.umich.edu/wiki/Variant_Normalization
"A variant has superfluous nucleotides on its left side if the leftmost nucleotide of each variant is of the same type and the removal of the nucleotide from each allele will not result in an empty allele."
NC_000006.12:g.49949407A>T
Old: ("6", 49949406, "AA", "AT", "sub")
New: ("6", 49949407, "A", "T", "sub")
NC_000006.12:g.49949413_49949414delinsCC
Old: ("6", 49949412, "AAA", "ACC", "delins")
New: ("6", 49949413, "AA", "CC", "delins")
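The left-trimming rule quoted above can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: the function name `trim_left_superfluous` is hypothetical, and it handles just the left side of a single ref/alt pair.

```python
def trim_left_superfluous(pos, ref, alt):
    """Drop shared leading bases while neither allele would become empty.

    Hypothetical helper for illustration; not part of the hgvs API.
    """
    # Trim only when both alleles keep at least one base afterwards.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1  # the record now starts one base to the right
    return pos, ref, alt


# The "Old" tuples from the test changes reduce to the "New" ones.
sub_result = trim_left_superfluous(49949406, "AA", "AT")
delins_result = trim_left_superfluous(49949412, "AAA", "ACC")
```

Applied to the old expected values, this yields the new ones: the substitution loses its padding base, and the delins keeps two bases per allele because the remaining leading bases differ.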