#447 - implement babelfish for VCF #660

Merged
merged 6 commits into from
Sep 10, 2023

Conversation

davmlaw
Contributor

@davmlaw davmlaw commented Jun 13, 2023

#447 - Implement hgvs babelfish for VCF

Also fixes

#616 - Babelfish hardcoded GRCh38

I changed the expected VCF results in the tests. My reasoning follows:

NC_000006.12:g.49949407=
Old: None
New: ("6", 49949407, "A", ".", 'identity')
Notes: Now that we can load VCFs, the conversion is symmetrical (i.e. a variant can be converted to VCF and back again without change). VCF can represent an identity (no change) either with '.' as the ALT or by repeating the REF. I went with '.', but feel free to change it.
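
As a minimal sketch of the two identity encodings mentioned above: the helper name `is_identity` is hypothetical and is not part of hgvs or this PR. It only illustrates that a reader of such VCF tuples would need to accept both forms.

```python
def is_identity(ref: str, alt: str) -> bool:
    """Return True if a REF/ALT pair encodes "no change".

    Hypothetical helper: accepts both conventions discussed above,
    ALT == "." and ALT repeating REF.
    """
    return alt == "." or alt == ref


# The new expected result for NC_000006.12:g.49949407= uses the "." form.
chrom, pos, ref, alt, var_type = ("6", 49949407, "A", ".", "identity")
print(is_identity(ref, alt))  # True
```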

The other changes follow the VT parsimony rules (https://genome.sph.umich.edu/wiki/Variant_Normalization):

"A variant has superfluous nucleotides on its left side if the leftmost nucleotide of each variant is of the same type and the removal of the nucleotide from each allele will not result in an empty allele."

NC_000006.12:g.49949407A>T
Old: ("6", 49949406, "AA", "AT", "sub")
New: ("6", 49949407, "A", "T", "sub")

NC_000006.12:g.49949413_49949414delinsCC
Old: ("6", 49949412, "AAA", "ACC", "delins")
New: ("6", 49949413, "AA", "CC", "delins")
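
The left-side trimming behind these two changes can be sketched roughly as follows. This is an illustration of the quoted rule only, not the code in `babelfish.py`. The function name `trim_left` is an assumption, and full VT normalization also covers right-trimming and left-alignment, which this sketch omits.

```python
def trim_left(pos: int, ref: str, alt: str) -> tuple[int, str, str]:
    """Drop shared leading nucleotides while neither allele would become empty.

    Hypothetical helper illustrating the quoted parsimony rule; each removed
    base shifts the 1-based position one to the right.
    """
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Old expected values from the tests, trimmed to the new ones.
print(trim_left(49949406, "AA", "AT"))    # (49949407, 'A', 'T')
print(trim_left(49949412, "AAA", "ACC"))  # (49949413, 'AA', 'CC')
```

An insertion such as `("A", "AT")` is left untouched, because removing the shared leading base would leave the REF allele empty.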

@davmlaw davmlaw requested a review from reece as a code owner June 13, 2023 07:19
Member

@reece reece left a comment

Minor comment about a comment. Otherwise, looks great. Thanks!

src/hgvs/extras/babelfish.py (outdated, resolved)
tests/test_hgvs_extras_babelfish.py (outdated, resolved)
…apper from class. People can do that themselves and control whether they want validation etc
@davmlaw
Contributor Author

davmlaw commented Jun 15, 2023

Hi @reece, are you able to review?

Afterwards I'll start on the parser and submit it as a separate PR.

Member

@reece reece left a comment

lgtm. Thanks @davmlaw . Sorry for the long delay.

@reece reece merged commit 4f838df into biocommons:main Sep 10, 2023
5 checks passed