The IntensityRankSumAnnotator (IRS) tool from GenomeStrip is used to perform in-silico validation of copy number variants (CNVs) in the UK Biobank (UKBB) dataset using SNP array intensity data.
- A Google Cloud account.
- A workflow execution system supporting the Workflow Description Language (WDL), such as:
  - Cromwell (v36 or higher). A dedicated server is highly recommended.
- cromshell for interacting with a dedicated Cromwell server.
- Directory path with bed files containing UKBB gCNV output (per chromosome).
- VCF header template.
- List of UKBB SNP array files in VCF format.
- List of samples on which to run GenomeStrip IRS.
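
The list inputs above can be assembled from a directory listing. The following is a minimal sketch, assuming each list is a plain text file with one path per line; that format, the helper name `make_file_list`, and the file patterns are illustrative assumptions, not something this repository specifies.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write the path of every file in DIR whose name
# matches PATTERN to OUT, one path per line, sorted.
make_file_list() {
  local dir="$1" pattern="$2" out="$3"
  find "$dir" -maxdepth 1 -type f -name "$pattern" | sort > "$out"
}

# Example usage (paths and patterns are placeholders):
#   make_file_list /path/to/array_vcfs '*.vcf.gz' array_vcfs.list
#   make_file_list /path/to/gcnv_beds  '*.bed'    gcnv_beds.list
```

Check the resulting lists against the inputs expected in `array-validation.json` before submitting, since the workflow may expect a different layout.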
The main scripts to run this analysis are:
- ukbbValidation.wdl: this workflow reformats SNP array and gCNV data from the UKBB and calls GenomeStrip IRS for in-silico CNV validation.
- genomeStripIRS.wdl: runs GenomeStrip IRS and can be executed on its own.
> git clone https://github.com/talkowski-lab/cnv-validation.git
> cd cnv-validation/wdl
> zip dependencies.zip *
> cromshell submit ukbbValidation.wdl /path/to/array-validation.json /path/to/config.json dependencies.zip
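
Once the workflow is submitted, cromshell can also be used to follow its progress. The sketch below wraps two cromshell subcommands (`status` and `list-outputs`) in a hypothetical helper named `check_run`; the workflow ID is the one printed by `cromshell submit`, and the helper itself is only an illustration, not part of this repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report the state of a submitted workflow and
# list its output files. Requires cromshell configured for your server.
check_run() {
  local workflow_id="$1"
  cromshell status "$workflow_id"        # current state of the workflow
  cromshell list-outputs "$workflow_id"  # output files produced so far
}

# Example usage (replace with the ID returned by `cromshell submit`):
#   check_run <workflow-id>
```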
Copyright (c) 2022 Talkowski Lab and The Broad Institute of M.I.T. and Harvard
Contact: Alba Sanchis-Juan
SV aggregation team: Ryan Collins, Jack Fu, Isaac Wong, Alba Sanchis-Juan and Harrison Brand on behalf of the Talkowski Laboratory