This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Adding explicit support for SV/CNV calling tools (#68)
Closes: #68
Closes: #60
Related-Issue: #68
Projected-Results-Impact: none
holtgrewe committed Sep 14, 2022
1 parent 753aea5 commit 181275e
Showing 42 changed files with 2,306 additions and 179 deletions.
67 changes: 44 additions & 23 deletions README.md
@@ -43,9 +43,26 @@ The following fields are considered:

### Structural Variants / Copy Number Variants

Note that if the `INFO/SVMETHOD` field is missing then you should define `--default-sv-method` as you would otherwise get a problem downstream.
**Supported Callers and Caller Annotation**

The following variant callers are explicitly supported.

- Delly 2 (SVs)
- Dragen CNV caller
- Dragen SV caller
- Manta
- GATK gCNV
- XHMM (deprecated)

In all other cases, VarFish Annotator falls back to a "generic" import where only the per-sample fields `GT`, `FT`, and `GQ` are interpreted.
Your caller should also write out `INFO/END`, `INFO/SVTYPE`, and `INFO/SVLEN` as defined by VCF 4.2.

VarFish Annotator reads the `INFO/SVMETHOD` field to annotate each call with the caller that produced it.
If this field is missing or empty, define `--default-sv-method` so that the output is labeled appropriately.
If you run into problems with your data, please let us know by opening a GitHub issue.
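
The fallback described above can be sketched as follows. This is a minimal illustration, not VarFish Annotator's actual code: the class `SvMethodResolver`, its `resolve` method, and the representation of `INFO` fields as a plain `Map` are assumptions made for this example, and treating the VCF missing-value marker `.` as empty is likewise an assumption.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not part of VarFish Annotator. */
final class SvMethodResolver {
    private SvMethodResolver() {}

    /**
     * Pick the caller label for one record: the INFO/SVMETHOD value when it is
     * present and non-empty, otherwise the configured default (the value that
     * would be passed via --default-sv-method).
     */
    static String resolve(Map<String, String> info, String defaultSvMethod) {
        final String value = info.get("SVMETHOD");
        // Assumption: a missing key, an empty string, and "." all count as "not set".
        if (value == null || value.isEmpty() || ".".equals(value)) {
            return defaultSvMethod;
        }
        return value;
    }
}
```

For a record that carries `SVMETHOD=EMBL.DELLYv1.1.3`, the sketch returns that value unchanged; for a record without the field, it returns whatever default was supplied.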

**Interpretation of top-level and INFO VCF fields**

The following fields are considered:

- `CHROM`
@@ -70,28 +87,32 @@ The following fields are considered:
Confidence interval around the end point of the SV.
- `INFO/SVMETHOD`
The name of the caller that was used.
- `FORMAT` and per `SAMPLE`
    - Common
        - `GT` Genotype
        - `FT` Per-genotype filter values
        - `GQ` Phred-scaled genotype quality
    - For Delly2
        - `DR` Reference pairs
        - `DV` Variant pairs
        - `RR` Reference junction count
        - `RV` Variant junction count
    - For XHMM
        - `DQ` Diploid Quality
        - `NDQ` Non-diploid Quality
        - `RD` Mean normalized read depth over region
        - `PL` Genotype likelihoods for [diploid, deletion, duplication]
    - For GATK gCNV
        - `CN` Copy number
        - `NP` Number of points in segment
        - `QA` Phred-scaled quality of all points agreeing
        - `QS` Phred-scaled quality of at least one point agreeing
        - `QSS` Phred-scaled quality of start breakpoint
        - `QSE` Phred-scaled quality of end breakpoint

**Interpretation of `FORMAT` and per sample fields**

- Common
    - `GT` Genotype, written as `gt`
    - `FT` Per-genotype filter values, written as `ft`
    - `GQ` Phred-scaled genotype quality, written as `gq`
- Delly2
    - `DR` Reference pairs, written as `pec = DR + DV`
    - `DV` Variant pairs, written as `pev`
    - `RR` Reference junction count, written as `src = RR + RV`
    - `RV` Variant junction count, written as `srv`
    - `RDCN` Copy number estimate, written as `cn`
- Dragen CNV
    - `SM` Average normalized coverage, written as `anc`
    - `BC` Bucket count, written as point count `pc`
    - `PE` Discordant read count at start/end, written as `pev = PE[0] + PE[1]`
- Dragen SV
    - `PR` Paired reads of reference and variant, written as `pec = PR[0] + PR[1]` and `pev = PR[1]`
    - `SR` Split reads of reference and variant, written as `src = SR[0] + SR[1]` and `srv = SR[1]`
- GATK gCNV
    - `CN` Integer copy number, written as `cn`
    - `NP` Number of points in segment, written as `np`
- Manta (equivalent to Dragen SV)
- XHMM
    - `RD` Average normalized coverage, written as `an`
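
The arithmetic behind the Delly2 and Dragen SV / Manta entries in the list above can be sketched as follows. This is only an illustration of the formulas as written in the list: the class `GenotypeFieldMapping` and its method names are hypothetical, and the real implementation works on parsed VCF genotypes rather than bare integers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the documented mappings; not the project's code. */
final class GenotypeFieldMapping {
    private GenotypeFieldMapping() {}

    /** Delly2 reports reference and variant counts as separate scalar fields. */
    static Map<String, Integer> fromDelly2(int dr, int dv, int rr, int rv) {
        final Map<String, Integer> out = new LinkedHashMap<>();
        out.put("pec", dr + dv); // all pairs: reference + variant
        out.put("pev", dv);      // variant pairs only
        out.put("src", rr + rv); // all junction reads: reference + variant
        out.put("srv", rv);      // variant junction reads only
        return out;
    }

    /** Dragen SV and Manta report PR and SR as [reference, variant] arrays. */
    static Map<String, Integer> fromPrSr(int[] pr, int[] sr) {
        final Map<String, Integer> out = new LinkedHashMap<>();
        out.put("pec", pr[0] + pr[1]);
        out.put("pev", pr[1]);
        out.put("src", sr[0] + sr[1]);
        out.put("srv", sr[1]);
        return out;
    }
}
```

With made-up input values, `fromDelly2(0, 0, 30, 4)` yields `pec=0`, `pev=0`, `src=34`, `srv=4`, and `fromPrSr(new int[] {10, 3}, new int[] {7, 2})` yields `pec=13`, `pev=3`, `src=9`, `srv=2`.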

## Example

2 changes: 1 addition & 1 deletion tests/hg19-chr22/Case_1_index.delly2.gts.tsv-expected
@@ -1,2 +1,2 @@
release chromosome chromosome_no bin chromosome2 chromosome_no2 bin2 pe_orientation start end start_ci_left start_ci_right end_ci_left end_ci_right case_id set_id sv_uuid caller sv_type sv_sub_type info num_hom_alt num_hom_ref num_het num_hemi_alt num_hemi_ref genotype
GRCh37 22 22 89 22 22 89 3to5 17400000 17700000 -29 29 -29 29 . . UUID EMBL.DELLYv1.1.3 DEL DEL {"""backgroundCarriers""":0,"""affectedCarriers""":0,"""unaffectedCarriers""":0} 0 2 1 0 0 {"""Case_1_father-N1-DNA1-WGS1""":{"""gt""":"""0/1""","""gq""":14,"""pec""":0,"""pev""":0,"""src""":34,"""srv""":4},"""Case_1_index-N1-DNA1-WGS1""":{"""gt""":"""0/1""","""gq""":14,"""pec""":0,"""pev""":0,"""src""":34,"""srv""":4,"""gt""":"""0/0""","""gq""":35,"""pec""":0,"""pev""":0,"""src""":29,"""srv""":2},"""Case_1_mother-N1-DNA1-WGS1""":{"""gt""":"""0/1""","""gq""":14,"""pec""":0,"""pev""":0,"""src""":34,"""srv""":4,"""gt""":"""0/0""","""gq""":35,"""pec""":0,"""pev""":0,"""src""":29,"""srv""":2,"""gt""":"""0/0""","""gq""":67,"""pec""":0,"""pev""":0,"""src""":32,"""srv""":1}}
GRCh37 22 22 89 22 22 89 3to5 17400000 17700000 -29 29 -29 29 . . UUID EMBL.DELLYv1.1.3 DEL DEL {"""backgroundCarriers""":0,"""affectedCarriers""":0,"""unaffectedCarriers""":0} 0 2 1 0 0 {"""Case_1_father-N1-DNA1-WGS1""":{"""gt""":"""0/1""","""ft""":{"""LowQual"""},"""gq""":14,"""pec""":0,"""pev""":0,"""src""":34,"""srv""":4,"""cn""":2},"""Case_1_index-N1-DNA1-WGS1""":{"""gt""":"""0/0""","""gq""":35,"""pec""":0,"""pev""":0,"""src""":29,"""srv""":2,"""cn""":2},"""Case_1_mother-N1-DNA1-WGS1""":{"""gt""":"""0/0""","""gq""":67,"""pec""":0,"""pev""":0,"""src""":32,"""srv""":1,"""cn""":2}}
@@ -67,10 +67,14 @@ public final class AnnotateSvsVcf {
   /** Pedigree to use for annotation. */
   private Pedigree pedigree;

+  /** Helper to use for creating genotypes and feature effects files. */
+  private CallerSupport callerSupport;
+
   /** Construct with the given configuration. */
   public AnnotateSvsVcf(AnnotateSvsArgs args) {
     this.args = args;
     this.pedigree = null;
+    this.callerSupport = CallerSupportFactory.getFor(new File(args.getInputVcf()));
   }

/** UUID counter for sequential UUID generation. */
@@ -236,7 +240,8 @@ private void annotateSvVcf(
             args.getOptOutFeatures(),
             args.getCaseId(),
             args.getSetId(),
-            pedigree);
+            pedigree,
+            callerSupport);
     final FeatureEffectsRecordBuilder feRecordBuilder =
         new FeatureEffectsRecordBuilder(args.getCaseId(), args.getSetId());
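
The constructor change above obtains a `CallerSupport` from `CallerSupportFactory.getFor`, whose body is not shown in this excerpt. One plausible shape for such a factory, sketched under stated assumptions: each candidate inspects the VCF header lines, the first match wins, and a "generic" entry is the fallback. The classes `CallerSupportSketch` and `CallerSupportFactorySketch`, and the use of header lines for detection, are hypothetical and not taken from the project.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a caller-specific support object. */
final class CallerSupportSketch {
    final String name;
    final Predicate<List<String>> matcher;

    CallerSupportSketch(String name, Predicate<List<String>> matcher) {
        this.name = name;
        this.matcher = matcher;
    }
}

/** Hypothetical factory: first candidate whose matcher accepts the header wins. */
final class CallerSupportFactorySketch {
    /** Fallback used when no caller-specific candidate recognizes the header. */
    static final CallerSupportSketch GENERIC =
        new CallerSupportSketch("generic", header -> true);

    private CallerSupportFactorySketch() {}

    static CallerSupportSketch getFor(
            List<String> headerLines, List<CallerSupportSketch> candidates) {
        for (CallerSupportSketch candidate : candidates) {
            if (candidate.matcher.test(headerLines)) {
                return candidate;
            }
        }
        return GENERIC;
    }
}
```

Resolving the helper once in the constructor, as the diff does, keeps caller detection out of the per-record loop; the sketch mirrors that by taking only header lines as input.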
