Cannot parse HGVS repeats #490

Open
markwoon opened this issue May 16, 2020 · 1 comment
Labels: enhancement, help wanted

Comments

@markwoon
Contributor

markwoon commented May 16, 2020

I am getting errors parsing HGVS repeats. For example, attempting to parse NC_000014.8:g.101179660TG[14] gives me:

line 1:25 mismatched input '[' expecting {NT_STRING, NT_MINUS, NT_PLUS}
java.lang.NullPointerException: null
	at de.charite.compbio.jannovar.hgvs.parser.Antlr4HGVSParserListenerImpl.exitNt_change_substitution(Antlr4HGVSParserListenerImpl.java:404)
	at de.charite.compbio.jannovar.hgvs.parser.Antlr4HGVSParser$Nt_change_substitutionContext.exitRule(Antlr4HGVSParser.java:3121)
	at org.antlr.v4.runtime.Parser.triggerExitRuleEvent(Parser.java:408)
	at org.antlr.v4.runtime.Parser.exitRule(Parser.java:642)
	at de.charite.compbio.jannovar.hgvs.parser.Antlr4HGVSParser.hgvs_variant(Antlr4HGVSParser.java:219)
	at de.charite.compbio.jannovar.hgvs.parser.HGVSParser.parseHGVSString(HGVSParser.java:42)

This HGVS string comes directly from the examples section of the HGVS docs.

I'm using version 0.34.

Also, looking at the API for NucleotideShortSequenceRepeatVariability, there seems to be no way to get the repeated sequence. Is this intentional, or am I missing something?

@holtgrewe
Member

Right now we only support a very limited number of repeats. From the grammar:

/** nucleotide short sequence repeat variability */
nt_change_ssr
:
	(
		nt_point_location
		| nt_range
	) NT_PAREN_OPEN NT_NUMBER NT_UNDERSCORE NT_NUMBER NT_PAREN_CLOSE
;

which means, e.g., <ref>:<spec>.<pos><seq>(<number>_<number>).

Adding this would require some experience with ANTLR grammar development and knowledge of HGVS. It's not especially hard, but I can't invest the time right now. I'd welcome help with this, cf. #519.
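To make the gap concrete, here is a rough, self-contained sketch of the two shapes involved: the range form that the quoted nt_change_ssr rule describes (a location followed by `(<number>_<number>)`) and the fixed-count form from the report (`<pos><seq>[<count>]`). This is not Jannovar code and not a real HGVS parser. The class `RepeatNotationSketch`, the holder `FixedRepeat`, and both regular expressions are made up for illustration, and they only handle plain integer positions with A/C/G/T units.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: hypothetical names, not part of Jannovar. */
final class RepeatNotationSketch {

    /** Hypothetical holder that keeps the repeated unit, e.g. "TG" for g.101179660TG[14]. */
    static final class FixedRepeat {
        final int position;
        final String unit;
        final int count;

        FixedRepeat(int position, String unit, int count) {
            this.position = position;
            this.unit = unit;
            this.count = count;
        }
    }

    // <ref>:<type>.<pos><seq>[<count>], the form from the report (simplified).
    private static final Pattern FIXED_REPEAT =
        Pattern.compile("^[^:]+:[a-z]\\.(\\d{1,9})([ACGT]+)\\[(\\d{1,9})\\]$");

    // <ref>:<type>.<pos>(<min>_<max>) or <ref>:<type>.<start>_<end>(<min>_<max>),
    // a simplified reading of the nt_change_ssr rule quoted above.
    private static final Pattern RANGE_REPEAT =
        Pattern.compile("^[^:]+:[a-z]\\.\\d+(?:_\\d+)?\\(\\d+_\\d+\\)$");

    /** Returns the parsed repeat if the string has the fixed-count shape, else empty. */
    static Optional<FixedRepeat> parseFixedRepeat(String hgvs) {
        Matcher m = FIXED_REPEAT.matcher(hgvs);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new FixedRepeat(
            Integer.parseInt(m.group(1)), m.group(2), Integer.parseInt(m.group(3))));
    }

    /** True if the string has the range shape described by the quoted grammar rule. */
    static boolean isRangeRepeat(String hgvs) {
        return RANGE_REPEAT.matcher(hgvs).matches();
    }
}
```

With these patterns, `NC_000014.8:g.101179660TG[14]` is recognized only by `parseFixedRepeat`, which also exposes the repeated unit. A real fix would instead add a corresponding alternative to the ANTLR grammar and carry the unit through to the variant class.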

holtgrewe added the enhancement and help wanted labels Jun 8, 2021