IUPAC handling in ExpansionHunter #238

Closed
raphaelbetschart opened this issue Oct 22, 2024 · 1 comment · Fixed by gymrek-lab/EnsembleTR#32


@raphaelbetschart

Hi,

I'm running into an issue when using EnsembleTR with my ExpansionHunter VCF files.
The reference catalogue includes IUPAC nucleotide codes (for instance, the repeat structure AARRG at position chr4:39348424-39348479, taken from https://github.com/Illumina/ExpansionHunter/blob/master/variant_catalog/hg38/variant_catalog.json). This leads to an error in utils.py at the following line, which only maps A, C, G and T:

nucToNumber={"A":0,"C":1,"G":2,"T":3}

Should I just remove this entry, or change the reference catalogue manually?
The GangSTR and HipSTR catalogues both have the repeat structure AAAAG at this locus.
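
To make the problem concrete, here is a minimal sketch of how an IUPAC motif could be detected and expanded into plain A/C/G/T motifs. The names `IUPAC_CODES`, `has_ambiguity` and `expand_iupac` are hypothetical and not part of EnsembleTR, and this is not necessarily how the maintainers would handle it. Only the `nucToNumber` mapping is copied from utils.py. A plain dictionary lookup on a base such as R raises a KeyError because R is not a key in that mapping.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one can stand for.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Same mapping as the line quoted from utils.py.
nucToNumber = {"A": 0, "C": 1, "G": 2, "T": 3}


def has_ambiguity(motif):
    """Hypothetical helper: True if the motif has any base outside A/C/G/T."""
    return any(base not in nucToNumber for base in motif.upper())


def expand_iupac(motif):
    """Hypothetical helper: list every A/C/G/T motif an IUPAC motif can represent."""
    choices = [IUPAC_CODES[base] for base in motif.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(has_ambiguity("AARRG"))
    print(expand_iupac("AARRG"))
```

For AARRG, each R can be A or G, so the expansion contains four motifs, one of which is the AAAAG used by the GangSTR and HipSTR catalogues.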

Thanks,
Raphael

@aryarm
Member

aryarm commented Dec 6, 2024

Thanks for reporting this issue, @raphaelbetschart! We've made an attempt at fixing it in gymrek-lab/EnsembleTR#32.

Can you try the new code? If you're still having trouble, please reopen this issue and let us know.
