Friederike Dündar edited this page Jan 19, 2016 · 21 revisions

For instructions on using the latest deepTools version, please go here. This page only applies to deepTools 1.5

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_TATApsem_thumb.png"/ Title="Heatmap of TATA scores around mouse gene TSS">

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_GC_thumb.png"/ Title="GC content for fly and mouse genes">

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_CpG_thumb.png"/ Title="CpG methylation around the TSS of ESC-active mouse genes">

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_histonesGomez_thumb.png"/ Title="Histones marks in Anopheles gambiae">

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_Bulut_thumb.png"/ Title="Repeat conservation scores and histone marks">

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/coverage_Ibrahim_thumb.png"/ Title="ChIP-seq read coverages and peak calling results">


If you have a nice deepTools plot that you'd like to share, we'd be happy to add it to our Gallery! Just send us an email: [email protected]


DNase accessibility at enhancers in murine ES cells

The following image demonstrates that enhancer regions are typically small stretches of highly accessible chromatin (more information on enhancers can be found, for example, here). In the heatmap, yellow and blue tiles indicate large numbers of sequenced reads (which is indicative of open chromatin), whereas black spots indicate missing data points. Note that the y-axis was left without an appropriate label.

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_DNase.png"/ Title="Heatmap of DNase signal around enhancer centers in mouse ES cells" width="400">

Fast Facts
| Setting | Value |
|---|---|
| computeMatrix mode | reference-point |
| regions file | BED file with typical enhancer regions from Whyte et al., 2013 (download here) |
| signal file | bigWig file with DNase signal from UCSC |
| heatmap cosmetics | labels, titles, heatmap height |

Command

$ deepTools-1.5.7/bin/computeMatrix reference-point \
 -S DNase_mouse.bigwig \
 -R Whyte_TypicalEnhancers_ESC.bed \
 --referencePoint center \
 -a 2000 -b 2000 \
 -out matrix_Enhancers_DNase_ESC.tab.gz  ## -a/-b: regions after and before the enhancer centers

$ deepTools-1.5.7/bin/heatmapper \
 -m matrix_Enhancers_DNase_ESC.tab.gz \
 -out hm_DNase_ESC.png \
 --heatmapHeight 15 \
 --refPointLabel enh.center \
 --regionsLabel enhancers \
 --plotTitle 'DNase signal'
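
To make the `--referencePoint center -a 2000 -b 2000` setting more concrete, here is a minimal sketch that computes the window such a call would cover for each region of a BED file. The two regions are invented for illustration (they are not taken from the Whyte et al. file), and the integer midpoint is an assumption; deepTools may round the center differently.

```shell
# Hypothetical BED records (chrom, start, end), made up for this example.
bed=$(printf 'chr1\t10000\t12000\nchr2\t50500\t51001\n')

flank=2000   # same value as -a and -b in the computeMatrix call above

# For each region: integer midpoint, then the window center-flank .. center+flank.
windows=$(printf '%s\n' "$bed" | awk -v f="$flank" 'BEGIN { OFS = "\t" } {
    center = int(($2 + $3) / 2)
    win_start = center - f
    if (win_start < 0) win_start = 0   # do not run past the chromosome start
    print $1, win_start, center + f
}')

printf '%s\n' "$windows"
```

Every region therefore contributes a window of the same width (here 4,000 bp), regardless of how long the original enhancer annotation is.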


TATA box enrichments around the TSS of mouse genes

Using the TRAP suite, we produced a bigWig file that contained TRAP scores for the well-known TATA box motif along the mouse genome. The TRAP score is a measure of the strength of a protein-DNA interaction at a given DNA sequence; the higher the score, the closer the sequence is to the consensus motif sequence. The following heatmap demonstrates that:

  • TATA-like motifs occur quite frequently
  • there is an obvious clustering of TATA motifs slightly upstream of the TSS of many mouse genes
  • there are many genes that do not contain TATA-like motifs at their promoter

Note that the heatmap shows all mouse RefSeq genes, so ca. 15,000 genes!

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_TATApsem.png"/ Title="Heatmap of TATA scores around mouse gene TSS" width="400">

Fast Facts
| Setting | Value |
|---|---|
| computeMatrix mode | reference-point |
| regions file | BED file with all mouse genes (from UCSC table browser) |
| signal file | bigWig file of TATA psem scores |
| heatmap cosmetics | color scheme, labels, titles, heatmap height, only showing heatmap + colorbar |

Command

$ deepTools-1.5.7/bin/computeMatrix reference-point \
 -S TATA_01_pssm.bw \
 -R RefSeq_genes.bed \
 --referencePoint TSS \
 -a 100 -b 100 \
 --binSize 5 \
 -out matrix_Genes_TATA.tab.gz

$ deepTools-1.5.7/bin/heatmapper \
 -m matrix_Genes_TATA.tab.gz  \
 -out hm_allGenes_TATA.png \
 --colorMap hot_r \
 --missingDataColor .4 \
 --heatmapHeight 7 \
 --plotTitle 'TATA motif' \
 --whatToShow 'heatmap and colorbar' \
 --sortRegions ascend
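
As a rough illustration of what `--referencePoint TSS` with `-a 100 -b 100 --binSize 5` implies, the sketch below derives a strand-aware TSS window for two invented BED6 records and the resulting number of bins per gene. It assumes that the TSS is the BED start for plus-strand genes and the BED end for minus-strand genes, and it ignores the off-by-one introduced by BED's half-open coordinates.

```shell
# Hypothetical BED6 records; names and coordinates are invented for illustration.
genes=$(printf 'chr1\t1000\t5000\tgeneA\t0\t+\nchr1\t8000\t9500\tgeneB\t0\t-\n')

up=100; down=100; bin=5   # -b, -a and --binSize from the command above

tss_windows=$(printf '%s\n' "$genes" | awk -v u="$up" -v d="$down" 'BEGIN { OFS = "\t" } {
    if ($6 == "-") {          # minus strand: TSS at the BED end, upstream = higher coordinates
        tss = $3; win_start = tss - d; win_end = tss + u
    } else {                  # plus strand: TSS at the BED start
        tss = $2; win_start = tss - u; win_end = tss + d
    }
    print $4, $6, tss, win_start, win_end
}')

n_bins=$(( (up + down) / bin ))   # columns per gene in the matrix

printf '%s\n' "$tss_windows"
echo "bins per gene: $n_bins"
```

With these settings each gene is represented by 40 bins of 5 bp, which is why the clustering of TATA motifs just upstream of the TSS can be resolved at all.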


Visualizing the GC content for mouse and fly genes

It is well known that different species have different genome GC contents. Here, we used two bigWig files where the GC content was calculated for 50 bp windows along the genome of mice and flies and visualized the scores for gene regions. You can find the bigWig files in our Galaxy's data library.

The images nicely illustrate the completely opposite GC distributions in flies and mice: while the gene starts of mammalian genomes are enriched for CpGs, fly promoters show depletion of GC content.

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_GC.png"/ Title="Heatmaps of GC content for fly and mouse genes" width="400">

Fast Facts
| Setting | Value |
|---|---|
| computeMatrix mode | scale-regions |
| regions files | BED files with mouse and fly genes (from UCSC table browser) |
| signal files | bigWig files with GC content |
| heatmap cosmetics | color scheme, labels, titles, color for missing data was set to white, heatmap height |

Fly and mouse genes were scaled to different sizes because the median gene sizes of the two species differ (genes of D. melanogaster contain far fewer introns and are considerably shorter than mammalian genes). Thus, computeMatrix had to be run with slightly different parameters, while the heatmapper commands were virtually identical (except for the labels).

$ deepTools-1.5.7/bin/computeMatrix scale-regions \
 -S GCcontent_Mm9_50_5.bw \
 -R RefSeq_genes_uniqNM.bed \
 -bs 50 \
 -m 10000 -b 3000 -a 3000 \
 -out matrix_GCcont_Mm9_scaledGenes.tab.gz \
 --skipZeros \
 --missingDataAsZero  

$ deepTools-1.5.7/bin/computeMatrix scale-regions \
 -S GCcontent_Dm3_50_5.bw \
 -R Dm530.genes.bed \
 -bs 50 \
 -m 3000 -b 1000 -a 1000 \
 -out matrix_GCcont_Dm3_scaledGenes.tab.gz \
 --skipZeros --missingDataAsZero

$ deepTools-1.5.7/bin/heatmapper \
 -m matrix_GCcont_Dm3_scaledGenes.tab.gz \
 -out hm_GCcont_Dm3_scaledGenes.png \
 --colorMap YlGnBu \
 --regionsLabel 'fly genes' \
 --heatmapHeight 15 \
 --plotTitle 'GC content fly' &

$ deepTools-1.5.7/bin/heatmapper \
 -m matrix_GCcont_Mm9_scaledGenes.tab.gz \
 -out hm_GCcont_Mm9_scaledGenes.png \
 --colorMap YlGnBu \
 --regionsLabel 'mouse genes' \
 --heatmapHeight 15 \
 --plotTitle 'GC content mouse' &
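
The different scaling for the two species also changes the size of the resulting matrices. As a small worked example, assuming that each gene is represented by (upstream + scaled body + downstream) / bin size columns, the values from the two computeMatrix calls above give:

```shell
# Columns per region = (upstream + scaled body + downstream) / bin size.
bins() { echo $(( ($1 + $2 + $3) / $4 )); }

mouse_bins=$(bins 3000 10000 3000 50)   # -b 3000 -m 10000 -a 3000 -bs 50
fly_bins=$(bins 1000 3000 1000 50)      # -b 1000 -m 3000  -a 1000 -bs 50

echo "mouse: $mouse_bins bins per gene"   # mouse: 320 bins per gene
echo "fly:   $fly_bins bins per gene"     # fly:   100 bins per gene
```

The bin size of 50 matches the 50 bp windows in which the GC content was calculated, so a smaller bin size would not add any resolution here.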


CpG methylation around murine transcription start sites in two different cell types

In addition to the methylation of histone tails, the cytosines of the DNA itself can also be methylated (for more information on CpG methylation, read here). In mammalian genomes, most CpGs are methylated except when they occur at gene promoters, which need to remain unmethylated for full transcriptional activity. In the following heatmaps, we used genes that were determined to be expressed primarily in ES cells and checked the percentages of methylated cytosines around their transcription start sites. The blue signal indicates that very few methylated cytosines are found. When you compare the CpG methylation signal between ES cells and NP cells, you can see that the majority of genes remain unmethylated, but the overall amount of CpG methylation around the TSSs increases, as indicated by the stronger red signal and the slight elevation of the CpG methylation signal in the summary plot. This supports the notion that the genes stored in the BED file indeed tend to be more highly expressed in ES cells than in NP cells.

This image was taken from Chelmicki & Dündar et al. (2014), eLife.

<img src="https://raw.github.com/fidelram/deepTools/master/gallery/hm_CpG.png"/ Title="Heatmaps of CpG methylation percentages around the TSS of ESC-active genes" width="400">

Fast Facts
| Setting | Value |
|---|---|
| computeMatrix mode | reference-point |
| regions file | BED file with mouse genes expressed in ES cells |
| signal files | bigWig files with fraction of methylated cytosines (from Stadler et al., 2011) |
| heatmap cosmetics | color scheme, labels, titles, color for missing data was set to a customized color, y-axis of the profiles was changed, heatmap height |

The commands for the bigWig files from the ES cell and NP cell samples were the same:

$ deepTools-1.5.7/bin/computeMatrix reference-point \
 -S GSE30202_ES_CpGmeth.bw \
 -R activeGenes_ESConly.bed \
 --referencePoint TSS \
 -a 2000 -b 2000 \
 -out matrix_Genes_ES_CpGmeth.tab.gz

$ deepTools-1.5.7/bin/heatmapper \
 -m matrix_Genes_ES_CpGmeth.tab.gz \
 -out hm_activeESCGenes_CpG_ES_indSort.png \
 --colorMap jet \
 --missingDataColor "#FFF6EB" \
 --heatmapHeight 15 \
 --yMin 0 --yMax 100 \
 --plotTitle 'ES cells' \
 --regionsLabel 'genes active in ESC' 
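
Since the commands are identical for both samples apart from the file names, they lend themselves to a simple loop. The following is only a dry-run sketch that prints the computeMatrix calls instead of executing them. Only the ES file names appear above; the NP file names are an assumption that simply swaps "ES" for "NP".

```shell
# Dry run: print the computeMatrix call for each sample instead of executing it.
# The NP file names are hypothetical and only mirror the ES naming pattern.
print_commands() {
    for cell in ES NP; do
        echo "computeMatrix reference-point" \
             "-S GSE30202_${cell}_CpGmeth.bw" \
             "-R activeGenes_ESConly.bed" \
             "--referencePoint TSS -a 2000 -b 2000" \
             "-out matrix_Genes_${cell}_CpGmeth.tab.gz"
    done
}

print_commands
```

Using the same BED file and the same `--yMin`/`--yMax` values for both samples is what makes the two heatmaps directly comparable.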


Histone marks for genes of the mosquito Anopheles gambiae

This figure was taken from Gómez-Díaz et al. (2014): Insights into the epigenomic landscape of the human malaria vector Anopheles gambiae. Front Genet Aug 15;5:277. It shows the distribution of H3K27me3 (left) and H3K27ac (right) with respect to gene features in A. gambiae midguts. The enrichment or depletion is shown relative to chromatin input. The regions in the map comprise gene bodies flanked by a segment of 200 bp at the 5′ end of TSSs and TTSs. Average profiles across gene regions ±200 bp for each histone modification are shown on top.


Signals of repressive chromatin marks, their enzymes and repeat element conservation scores

This image is from Bulut-Karslioglu and De La Rosa-Velázquez et al. (2014), Mol Cell. The heatmaps depict various signal types for unscaled peak regions of proteins and histone marks associated with repressed chromatin. The peaks were separated into those containing long interspersed elements (LINEs) on the forward and reverse strand. The signals include normalized ChIP-seq signals for H3K9me3, Suv39h1, Suv39h2, Eset, and HP1alpha-EGFP, followed by LINE and ERV content and repeat conservation scores.


Normalized ChIP-seq signals and peak regions

This image was published by Ibrahim et al., 2014 (NAR). They used deepTools to generate coverage files of extended reads, normalized to reads per kilobase per million reads at 10 bp resolution, and visualized the resulting files in the IGV browser.
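
To illustrate what "reads per kilobase per million reads" means for a single 10 bp bin, here is a small worked example. The read counts are invented, and the formula is the generic RPKM definition rather than a copy of the deepTools implementation.

```shell
# RPKM for one bin = reads in bin / (total mapped reads in millions * bin length in kb)
rpkm() {
    awk -v n="$1" -v total="$2" -v bin="$3" \
        'BEGIN { printf "%.2f\n", n / ((total / 1e6) * (bin / 1000)) }'
}

# Invented numbers: 5 extended reads overlap a 10 bp bin, 20 million reads were mapped in total.
value=$(rpkm 5 20000000 10)
echo "$value"   # 25.00
```

Because both the sequencing depth and the bin length enter the denominator, coverage tracks from samples with different numbers of reads end up on a comparable scale.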



[read]: https://github.com/fidelram/deepTools/wiki/Glossary#terminology "the DNA piece that was actually sequenced ('read') by the sequencing machine (usually between 30 and 100 bp long, depending on the read length of the sequencing protocol)"
[input]: https://github.com/fidelram/deepTools/wiki/Glossary#terminology "confusing, albeit commonly used name for the 'no-antibody' control sample for ChIP experiments"