Now that GATK gCNV has shown great performance in the first round of evaluations, it is ready to be used for calling on ExAC. The following things need to be done first:
Set up the gCNV workflow to run on SGE (since the exome samples are stored on-prem)
Decide on target filtering strategy.
Decide on the number of samples to use to learn the model (PoN)
Get some truth data to do QC, for example CNV calls from Genome STRiP on matched genome samples in gnomAD
Design an interval list for ExAC samples whose metadata does not specify one. One possible solution is to use a sample's cluster assignment to choose the interval list pertaining to that cluster
(Optional) Consider importing a list of common CNV regions into gCNV. To make the job of gCNV inference easier, we could use the list of common CNV regions obtained from Genome STRiP calls.
To get the ball rolling, @ldgauthier suggested starting with samples sequenced using the latest Illumina capture protocol (Standard_Exome_Sequencing_v4).
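The cluster-based interval list idea from the list above could look roughly like the sketch below. It assumes each sample is a dict with hypothetical `cluster` and `interval_list` keys (not the actual ExAC metadata schema) and falls back to the most common interval list among annotated samples in the same cluster; the majority-vote rule is only one possible choice.

```python
from collections import Counter, defaultdict


def build_cluster_defaults(samples):
    """Map each cluster to the most common interval list among its annotated samples.

    `samples` is an iterable of dicts with hypothetical keys "cluster" and
    "interval_list" (None or missing when the metadata does not mention one).
    """
    counts = defaultdict(Counter)
    for sample in samples:
        interval_list = sample.get("interval_list")
        if interval_list:
            counts[sample["cluster"]][interval_list] += 1
    return {
        cluster: counter.most_common(1)[0][0]
        for cluster, counter in counts.items()
    }


def assign_interval_list(sample, cluster_defaults):
    """Return the sample's own interval list, else its cluster's default, else None."""
    return sample.get("interval_list") or cluster_defaults.get(sample["cluster"])
```

Samples in a cluster with no annotated members come back as `None`, so they can be flagged for manual review instead of being silently assigned a list.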
For QC, there will also be XHMM calls for most of the samples that were also in v1. Not as good as Genome STRiP/WGS, but it should cover a lot more samples. I can work on finding that data.
With respect to the last bullet, does gCNV already have this capability or would it involve modifying the code as well?
@ldgauthier Concordance with XHMM would definitely be useful for validating calls on clusters for which we do not have WGS data.
We would need to modify the code to support a fixed set of common CNV regions, but that should be straightforward.
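For the XHMM concordance check, one possible starting point is a reciprocal-overlap comparison, sketched below. The call representation (`(contig, start, end, type)` tuples with half-open coordinates), the function names, and the 50% threshold are all assumptions for illustration, not anything gCNV or XHMM defines.

```python
def reciprocal_overlap(a, b):
    """Reciprocal overlap of two calls given as (contig, start, end, ...) tuples.

    Returns the smaller of (overlap / length of a) and (overlap / length of b),
    or 0.0 if the calls are on different contigs or do not overlap.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def concordance(calls, reference, min_ro=0.5):
    """Fraction of `calls` matched by a same-type call in `reference`.

    A match requires the same CNV type (fourth tuple field) and a reciprocal
    overlap of at least `min_ro` (assumed threshold). Returns None if `calls`
    is empty.
    """
    if not calls:
        return None
    matched = sum(
        any(c[3] == r[3] and reciprocal_overlap(c, r) >= min_ro for r in reference)
        for c in calls
    )
    return matched / len(calls)
```

This is quadratic in the number of calls per sample, which should be fine for exome-scale call sets; an interval tree would be the natural replacement if it becomes a bottleneck.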