CombineGVCFs incorrectly parses last GVCF block consisting of HLA contig #4572

Closed
sooheelee opened this issue Mar 23, 2018 · 10 comments

Comments

@sooheelee
Contributor

sooheelee commented Mar 23, 2018

The data is sensitive; the bug is recapitulated in https://github.com/broadinstitute/dsde-docs/issues/3026.

CombineGVCFs gives the following error message:

java.lang.IllegalArgumentException: Invalid interval. Contig:HLA-DRB1*15:03:01:02 start:11569 end:11005
	at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:687)
	at org.broadinstitute.hellbender.utils.SimpleInterval.validatePositions(SimpleInterval.java:61)
	at org.broadinstitute.hellbender.utils.SimpleInterval.<init>(SimpleInterval.java:37)
	at org.broadinstitute.hellbender.tools.walkers.CombineGVCFs.onTraversalSuccess(CombineGVCFs.java:415)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:895)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:159)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:202)
	at org.broadinstitute.hellbender.Main.main(Main.java:288)

Here are the dictionary lines for two consecutive HLA-DRB1 contigs:

@SQ     SN:HLA-DRB1*15:03:01:02 LN:11569        M5:4e0d459b9bd15bff8645de84334e3d25     AS:38   UR:/seq/references/Homo_sapiens_assembly38/v0/Homo_sapiens_assembly38.fasta     SP:Homo sapiens
@SQ     SN:HLA-DRB1*16:02:01    LN:11005        M5:4a972df76bd3ee2857b87bd5be5ea00a     AS:38   UR:/seq/references/Homo_sapiens_assembly38/v0/Homo_sapiens_assembly38.fasta     SP:Homo sapiens

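Notice that the LN values match the positions in the error message: start 11569 is the length of HLA-DRB1*15:03:01:02 and end 11005 is the length of HLA-DRB1*16:02:01. It appears that the tool is confusing information from the two contigs.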
Note that HLA-DRB1*16:02:01 is the very last contig in GRCh38.
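
To make the mismatch concrete, here is a minimal Java sketch that checks the interval from the error message against the two @SQ lines above. It is an illustration only, not GATK code: the class `IntervalCheck` and its methods are hypothetical names, and the only data it uses are the contig names and LN values quoted in this report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IntervalCheck {
    // Contig lengths copied from the two @SQ lines quoted in this issue.
    static final Map<String, Integer> DICT = new LinkedHashMap<>();
    static {
        DICT.put("HLA-DRB1*15:03:01:02", 11569);
        DICT.put("HLA-DRB1*16:02:01", 11005);
    }

    // Returns null when the interval is consistent with the dictionary,
    // otherwise a short description of the first problem found.
    static String describeProblem(String contig, int start, int end) {
        Integer length = DICT.get(contig);
        if (length == null) {
            return "unknown contig " + contig;
        }
        if (start < 1) {
            return "start " + start + " is less than 1";
        }
        if (end < start) {
            return "end " + end + " precedes start " + start;
        }
        if (end > length) {
            return "end " + end + " exceeds contig length " + length;
        }
        return null;
    }

    // Returns the name of the first contig whose LN equals the given value, or null.
    static String contigWithLength(int value) {
        for (Map.Entry<String, Integer> entry : DICT.entrySet()) {
            if (entry.getValue() == value) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Interval taken verbatim from the IllegalArgumentException message.
        System.out.println(describeProblem("HLA-DRB1*15:03:01:02", 11569, 11005));
        System.out.println("start matches LN of " + contigWithLength(11569));
        System.out.println("end matches LN of " + contigWithLength(11005));
    }
}
```

Run against the values in the error, the check reports that the end precedes the start, and the two lookups show that the start equals the length of the named contig while the end equals the length of the following contig.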

@sooheelee
Contributor Author

I will assign to @droazen to delegate.

@droazen
Contributor

droazen commented Mar 23, 2018

@cmnbroad Could you have a look at this one when you get a chance?

@droazen droazen assigned cmnbroad and unassigned droazen Mar 23, 2018
@droazen droazen added this to the Engine-2Q2018 milestone Mar 23, 2018
@cmnbroad
Collaborator

@sooheelee I can see the original forum post, but I get a 404 when I click on the link at the top of this ticket (https://github.com/broadinstitute/dsde-docs/issues/3026).

@sooheelee
Contributor Author

We need to ask @vdauwera to give you permission to view the dsde-docs repo.

@sooheelee
Contributor Author

In the meantime, I've Slacked you the location of the data, @cmnbroad.

@cmnbroad
Collaborator

Thanks @sooheelee. The code is definitely not correctly reconciling the accumulated variants/blocks that remain when traversal ends. I have a fix that resolves it, at least for this case, but this needs some more analysis, and I'm going to have to write some additional tests to be sure that it's correct.
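
As an illustration of the kind of invariant an end-of-traversal step has to keep, here is a small Java sketch. It is not the fix described in this thread and not GATK code: `PendingBlockFlush`, `Block`, and `finalInterval` are hypothetical names, and the sketch assumes that every leftover block belongs to a single contig. The idea it shows is only that the start and end of the final interval should both come from blocks on the same contig.

```java
import java.util.Arrays;
import java.util.List;

public class PendingBlockFlush {
    // Hypothetical stand-in for a leftover variant or reference block.
    static final class Block {
        final String contig;
        final int start;
        final int end;

        Block(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    // Builds the interval spanning all pending blocks, or returns null if none remain.
    // Both coordinates come from the pending blocks themselves, and the method
    // refuses to mix blocks from different contigs.
    static Block finalInterval(List<Block> pending) {
        if (pending.isEmpty()) {
            return null;
        }
        String contig = pending.get(0).contig;
        int start = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        for (Block block : pending) {
            if (!block.contig.equals(contig)) {
                throw new IllegalStateException(
                        "pending blocks span contigs " + contig + " and " + block.contig);
            }
            start = Math.min(start, block.start);
            end = Math.max(end, block.end);
        }
        return new Block(contig, start, end);
    }

    public static void main(String[] args) {
        // Made-up coordinates, for illustration only.
        List<Block> pending = Arrays.asList(
                new Block("HLA-DRB1*16:02:01", 10000, 10500),
                new Block("HLA-DRB1*16:02:01", 10200, 11005));
        Block merged = finalInterval(pending);
        System.out.println(merged.contig + ":" + merged.start + "-" + merged.end);
    }
}
```

Failing loudly when leftover blocks span two contigs is a design choice in this sketch; a real tool might instead flush each contig's blocks separately.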

@sooheelee
Contributor Author

Great to hear there is a fix so soon @cmnbroad. Let me know if I can help out with creating test data.

@sooheelee
Contributor Author

FYI @cmnbroad, the researcher has clarified that this error also occurs for primary assembly contigs, e.g. chr1. See their clarification here.

@cmnbroad
Collaborator

The fix for this issue is merged, but there are a couple of PRs for related issues (#4680 and #4681) that should be merged before we do a release.
