Ludwig Geistlinger, Marcel Ramos, Sehyun Oh, and Levi Waldron
CUNY School of Public Health, 55 W 125th St, New York, NY 10027
This workshop gives an overview of Bioconductor solutions for the analysis of copy number variation (CNV) data. The workshop introduces Bioconductor core data structures for efficient representation, access, and manipulation of CNV data, and how to use these containers for structured downstream analysis of CNVs and integration with gene expression and quantitative phenotypes. Participants will be provided with code and hands-on practice for a comprehensive set of typical analysis steps including exploratory analysis, summarizing individual CNV calls across a population, overlap analysis with functional genomic regions and regulatory elements, expression quantitative trait loci (eQTL) analysis, and genome-wide association analysis (GWAS) with quantitative phenotypes. As an advanced application example, the workshop also introduces allele-specific absolute copy number analysis and how it is incorporated in cancer genomic analysis for the estimation of tumor characteristics such as tumor purity and ploidy.
- Basic knowledge of R syntax
- Familiarity with the SummarizedExperiment class
- Familiarity with the GenomicRanges class
- Familiarity with high-throughput genomic assays such as microarrays and next-generation sequencing
- Familiarity with the biological definition of single nucleotide polymorphism (SNP) and copy number variation (CNV)
Execution of example code and hands-on practice
Activity | Time |
---|---|
Overview | 5m |
Data representation and manipulation | 20m |
Integrative downstream analysis (eQTL, GWAS, ...) | 20m |
Allele-specific CN analysis in cancer | 15m |
- Gain familiarity with elementary concepts of CNV analysis
- Learn how to efficiently represent, access, and manipulate CNV data in Bioconductor data structures
- Gain familiarity with different strategies for summarizing individual CNV calls across a population
- Learn how to assess the significance of overlaps between CNVs and functional genomic regions
- Learn how to carry out association analysis with gene expression and quantitative phenotypes
- Gain familiarity with allele-specific absolute CN analysis of cancer genomic data
- Understand how CNVs can be experimentally detected and computationally inferred from SNP arrays and next-generation sequencing data
- Use `GRangesList` and `RaggedExperiment` to represent, access, and manipulate CNV data
- Identify recurrent CNV regions in a population, including density trimming, reciprocal overlap, and recurrence significance estimation
- Use the regioneR package to assess the significance of overlaps between CNVs and functional genomic regions such as genes, promoters, and enhancers
- Carry out eQTL analysis for CNV and RNA-seq data using the `GenomicRanges`/`RaggedExperiment`/`CNVRanger` architecture
- Carry out a GWAS analysis for CNV and quantitative phenotype data
- Perform estimation of tumor purity and ploidy from absolute CN analysis with PureCN
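
Of the techniques listed above, reciprocal overlap is simple enough to illustrate without any packages. The following is a minimal base R sketch, assuming the common definition in which two CNV calls are considered matching when their shared segment covers at least a given fraction of each call. The function name `reciprocal_overlap`, the example coordinates, and the 50% threshold are hypothetical choices for illustration; this is not the CNVRanger implementation.

```r
# Reciprocal overlap of two CNV calls on the same chromosome.
# Coordinates are 1-based and inclusive, as in GenomicRanges.
# NOTE: 'reciprocal_overlap' is a hypothetical helper for illustration only.
reciprocal_overlap <- function(start1, end1, start2, end2) {
  # Length of the shared segment (0 if the calls do not overlap)
  shared <- max(0, min(end1, end2) - max(start1, start2) + 1)
  # Fraction of each call that is covered by the shared segment
  frac1 <- shared / (end1 - start1 + 1)
  frac2 <- shared / (end2 - start2 + 1)
  # The reciprocal overlap is the smaller of the two fractions
  min(frac1, frac2)
}

# Two hypothetical calls of 4000 bp each that share 3000 bp
ro <- reciprocal_overlap(1001, 5000, 2001, 6000)
ro          # 0.75
ro >= 0.5   # TRUE: the calls would be grouped at a 50% threshold
```

Taking the minimum of the two fractions prevents a short call nested inside a much longer one from counting as a match, which is why the criterion is applied in both directions.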