Calling ExAC with GATK gCNV #4738

Closed
asmirnov239 opened this issue May 7, 2018 · 3 comments

Comments

@asmirnov239
Collaborator

asmirnov239 commented May 7, 2018

With GATK gCNV performing well in the first round of evaluations, it is ready to be used to call CNVs on ExAC. The following things need to be done first:

  • Set up the gCNV workflow to run on SGE (since exome samples are stored on-prem).

  • Decide on a target filtering strategy.

  • Decide on the number of samples to use to learn the model (PoN).

  • Get some truth data for QC, for example CNV calls from Genome STRiP on matched genome samples in gnomAD.

  • Design an interval list for ExAC samples whose metadata does not specify one. One possible solution is to use a sample's cluster assignment to choose the interval list pertaining to that cluster.

  • (Optional) Consider importing a list of common CNV regions into gCNV. To make gCNV inference easier, we could use the list of common CNV regions obtained from Genome STRiP calls.

To start, @ldgauthier suggested using samples sequenced with the latest Illumina capture protocol (Standard_Exome_Sequencing_v4) to get the ball rolling.
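
For the interval-list bullet, here is a rough sketch of the cluster-based fallback, not a settled design. It assumes each sample already has a cluster label and that the metadata is available as a plain mapping from sample name to interval list (or `None` when missing). The function name, the mappings, and the majority-vote rule are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Optional


def infer_interval_lists(
    sample_cluster: Dict[str, Hashable],
    sample_interval_list: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Fill in missing interval lists using the majority within each cluster.

    Assumption: samples in the same cluster were mostly captured with the
    same kit, so the most common annotated interval list is a fair guess.
    """
    # Count annotated interval lists per cluster.
    votes: Dict[Hashable, Counter] = {}
    for sample, cluster in sample_cluster.items():
        interval_list = sample_interval_list.get(sample)
        if interval_list is not None:
            votes.setdefault(cluster, Counter())[interval_list] += 1

    # Keep existing annotations; fill gaps with the cluster majority.
    resolved: Dict[str, Optional[str]] = {}
    for sample, cluster in sample_cluster.items():
        interval_list = sample_interval_list.get(sample)
        if interval_list is None and cluster in votes:
            interval_list = votes[cluster].most_common(1)[0][0]
        resolved[sample] = interval_list
    return resolved


# Toy example with made-up sample names and file names.
clusters = {"s1": 0, "s2": 0, "s3": 0, "s4": 1}
annotated = {"s1": "kit_a.interval_list", "s2": "kit_a.interval_list", "s3": None}
resolved = infer_interval_lists(clusters, annotated)
```

Samples in a cluster with no annotated members stay unresolved (`None`) and would still need a manual decision.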

@ldgauthier
Contributor

For QC, there will also be XHMM calls for most of the samples that were also in v1. They are not as good as Genome STRiP/WGS calls, but they should cover a lot more samples. I can work on finding that data.

With respect to the last bullet, does gCNV already have this capability or would it involve modifying the code as well?

@sooheelee
Contributor

I'm starting gCNV tutorial development so I have an interest in knowing how researchers are using the tool.

@asmirnov239
Collaborator Author

@ldgauthier Concordance with XHMM would definitely be useful for validating calls on clusters for which we do not have WGS data.
We would need to modify the code to support a fixed set of common CNV regions, but that should be straightforward.
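
One possible way to measure that concordance is sketched below, purely as an illustration. It assumes both call sets have been reduced to simple records, and it counts a gCNV call as concordant when a call of the same type in the same sample overlaps it reciprocally by at least 50%. The `Call` record, the function names, and the 0.5 threshold are assumptions, not part of gCNV or XHMM; a target-level comparison might suit exomes better than base-pair overlap.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    sample: str
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    kind: str   # "DEL" or "DUP"


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Return the smaller of the two fractions of each call covered by the other."""
    if a.contig != b.contig:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start) + 1
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start + 1), overlap / (b.end - b.start + 1))


def concordance(calls: List[Call], reference: List[Call], min_ro: float = 0.5) -> float:
    """Fraction of `calls` matched by a same-sample, same-type call in `reference`."""
    if not calls:
        return 0.0
    matched = sum(
        1
        for c in calls
        if any(
            r.sample == c.sample
            and r.kind == c.kind
            and reciprocal_overlap(c, r) >= min_ro
            for r in reference
        )
    )
    return matched / len(calls)


# Toy example with made-up coordinates.
gcnv_calls = [Call("s1", "1", 100, 199, "DEL"), Call("s1", "2", 100, 199, "DUP")]
xhmm_calls = [Call("s1", "1", 120, 219, "DEL")]
fraction_concordant = concordance(gcnv_calls, xhmm_calls)
```

The nested loop is quadratic in the number of calls, so a cohort the size of ExAC would probably need calls grouped by sample and contig first, or an interval tree.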
