RNA Seq Data Analysis Tutorials
RNA-seq technology plays a pivotal role in characterizing the transcriptome of a given sample. Its three major applications are quantification of gene and transcript expression, identification of novel transcripts, and detection of fusion transcripts.
RNA-seq data analysis can be grouped into three categories, depending on which reference resources are available:
- Model-based approaches with both reference genome and transcriptome information.
- Semi-model approaches with only the reference genome information.
- Non-model approaches without reference genome and transcriptome information.
Accurate mapping of RNA-seq reads to the reference genome or transcriptome is the critical step for the downstream analyses: transcript assembly, isoform detection, quantification, and fusion detection. Running speed, sensitivity, and specificity are the three essential metrics for assessing mapping performance.
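As a rough illustration of the two accuracy metrics, the sketch below computes sensitivity and specificity from confusion-matrix counts. It assumes a benchmark in which the true origin of every read is known (for example, simulated reads); the function name `mapping_metrics` and the example counts are hypothetical.

```python
def mapping_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity) for a read-mapping benchmark.

    tp: reads mapped to their true locus
    fp: reads mapped although they should not map (or mapped to a wrong locus)
    tn: reads correctly left unmapped
    fn: reads that should have mapped but were missed
    How these four classes are defined is an assumption of the benchmark.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical counts from a simulated data set
sens, spec = mapping_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Running speed is usually reported separately, as wall-clock time or reads processed per hour on a fixed machine.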
Alignment algorithms and index structures:
- Hash-based
- Burrows-Wheeler-Transform (BWT)-based
- FM-index
- Graph FM-index
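To make the BWT and FM-index entries more concrete, here is a minimal, unoptimized sketch of the Burrows-Wheeler transform and of FM-index style backward search for counting exact matches. Production aligners use compressed, precomputed tables; the naive rotation sort and the linear `occ` scan below are simplifications, and all function names are illustrative.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, with '$' appended as terminator."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact occurrences of pattern using backward search on the BWT."""
    # first_row[c]: index of the first row starting with c in the sorted rotations
    first_row = {}
    for i, c in enumerate(sorted(bwt_string)):
        first_row.setdefault(c, i)

    def occ(c, i):
        # Occurrences of c in bwt_string[:i]; a real FM-index precomputes this
        return bwt_string[:i].count(c)

    lo, hi = 0, len(bwt_string)  # half-open range of matching rows
    for c in reversed(pattern):
        if c not in first_row:
            return 0
        lo = first_row[c] + occ(c, lo)
        hi = first_row[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


reference = "ACGTACGTTACG"  # toy reference sequence
index = bwt(reference)
print(index, count_occurrences(index, "ACG"))
```

Backward search processes the pattern from its last character to its first, so the cost of a lookup depends on the read length rather than on the size of the reference.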
Aligners and alignment-free mappers:
- Bowtie2
- TopHat2
- HISAT2
- Kallisto
- Salmon
- Sailfish
- SeqMap
- STAR
Quantification tools:
- BitSeq
- Cufflinks
- HTSeq
- IsoEM
- Kallisto
- RSEM
- rSeq
- Sailfish
- Salmon
- STAR
- StringTie
- eXpress
Differential expression tools:
- Ballgown
- baySeq
- BitSeq
- Cuffdiff
- DESeq2
- EBSeq
- edgeR: Exact test
- limma+vst/voom transformation
- NBPseq
- NOISeqBIO
- SAMseq
- Sleuth
To detect transcript-level fusion events, examine two kinds of evidence: paired-end reads whose mates map aberrantly to different genomic regions, and single reads that span the fusion junction.
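The two kinds of evidence can be tallied separately, as in the sketch below. It is a simplified illustration, not the method of any particular fusion caller: it assumes each mate has already been reduced to the list of genes its aligned segments hit, and the function name and the example read pairs are hypothetical.

```python
from collections import Counter


def fusion_evidence(read_pairs):
    """Count spanning pairs and split reads per candidate gene pair.

    read_pairs: iterable of (mate1_genes, mate2_genes), where each element
    lists the genes hit by that mate's aligned segments, in read order.
    """
    spanning = Counter()  # mates map entirely to two different genes
    split = Counter()     # a single read covers the junction itself
    for mate1, mate2 in read_pairs:
        for mate in (mate1, mate2):
            genes = set(mate)
            if len(genes) == 2:
                split[tuple(sorted(genes))] += 1
        if len(set(mate1)) == 1 and len(set(mate2)) == 1 and mate1[0] != mate2[0]:
            spanning[tuple(sorted((mate1[0], mate2[0])))] += 1
    return spanning, split


# Hypothetical alignments
pairs = [
    (["BCR"], ["ABL1"]),          # spanning pair
    (["BCR", "ABL1"], ["ABL1"]),  # mate 1 is a split read
    (["BCR"], ["BCR"]),           # concordant pair, no fusion evidence
]
spanning, split = fusion_evidence(pairs)
print(dict(spanning), dict(split))
```

A candidate supported by both counters is generally more convincing than one supported by either alone; the thresholds to apply are a design choice left open here.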
Quality control and read trimming:
- FastQC
- Trimmomatic
- Cutadapt
Commonly used pipelines:
- TopHat2 + Cufflinks + Cuffdiff
- HISAT2 + StringTie + Ballgown
- Bowtie2 + RSEM + edgeR
- Kallisto + Sleuth
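As an example of how one of these pipelines chains together, the sketch below assembles the command lines for the HISAT2 and StringTie stages of a paired-end sample and prints them without running anything. The file names, index prefix, and function name are placeholders, and the chosen options (`--dta`, `-e`, `-B`) are one reasonable configuration rather than a prescribed one.

```python
import shlex


def hisat2_stringtie_commands(genome_fa, annotation_gtf, reads_1, reads_2, sample):
    """Return the command lines for index building, alignment, and assembly."""
    index = "genome_index"  # hypothetical index prefix
    return [
        # Build the HISAT2 index from the reference genome
        ["hisat2-build", genome_fa, index],
        # Spliced alignment; --dta tailors the output for transcript assemblers
        ["hisat2", "--dta", "-x", index, "-1", reads_1, "-2", reads_2,
         "-S", f"{sample}.sam"],
        # Coordinate-sort the alignments for StringTie
        ["samtools", "sort", "-o", f"{sample}.bam", f"{sample}.sam"],
        # Estimate abundances of annotated transcripts (-e) and write
        # Ballgown input tables (-B) next to the output GTF
        ["stringtie", f"{sample}.bam", "-G", annotation_gtf, "-e", "-B",
         "-o", f"{sample}/{sample}.gtf"],
    ]


for command in hisat2_stringtie_commands(
        "genome.fa", "genes.gtf", "sample_1.fastq", "sample_2.fastq", "sample"):
    print(shlex.join(command))
```

Each inner list can be passed to `subprocess.run(command, check=True)` once the tools are installed and the inputs exist.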
Reference data:
- Homo_sapiens genome
- Homo_sapiens transcriptome
Analysis steps:
- Build index
- Gapped Alignment
- Estimation of the abundance
- Differential expression analysis
- Enrichment analysis
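For the abundance-estimation step, one commonly reported unit is transcripts per million (TPM). The sketch below shows the arithmetic only, assuming read counts and transcript lengths are already available; quantification tools typically use effective lengths and resolve multi-mapping reads, which this example ignores. The function name and the toy values are hypothetical.

```python
def tpm(counts, lengths):
    """Convert per-transcript read counts to transcripts per million.

    counts:  dict mapping transcript id to read count
    lengths: dict mapping transcript id to length in bases
    """
    # Reads per base normalizes for transcript length
    rates = {t: counts[t] / lengths[t] for t in counts}
    total = sum(rates.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    # Rescale so that the values sum to one million
    return {t: rate / total * 1e6 for t, rate in rates.items()}


# Toy example: tx2 has twice the reads but is twice as long
values = tpm({"tx1": 10, "tx2": 20}, {"tx1": 1000, "tx2": 2000})
print(values)
```

Because the values always sum to one million within a sample, TPM is convenient for comparing the relative abundance of transcripts inside one sample; count-based differential expression tools generally expect raw counts instead.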
Data sources:
- GEO (Gene Expression Omnibus)
- SRA (Sequence Read Archive)
- Download the data:
prefetch -v SRR3126346
- Or download with ascp (Aspera):
$ASPERA/bin/ascp -i /root/.aspera/connect/etc/asperaweb_id_dsa.putty -pQTk1 -l 300m [email protected]:data/sracloud/srapub/SRR3126346 /root/ncbi/public/sra/SRR3126346.sra
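When several runs have to be fetched, the `prefetch` call can be generated per accession. The sketch below builds the download command and, as an assumption not covered above, a follow-up `fasterq-dump` call that converts the downloaded run to FASTQ files. The function name and output directory are hypothetical, and nothing is executed.

```python
import re


def sra_download_commands(accession, outdir="fastq"):
    """Return the commands to download one SRA run and convert it to FASTQ."""
    # Run accessions look like SRR, ERR, or DRR followed by digits
    if not re.fullmatch(r"[SED]RR\d+", accession):
        raise ValueError(f"not a run accession: {accession!r}")
    return [
        # Download the .sra file (same call as above)
        ["prefetch", "-v", accession],
        # Assumed next step: write FASTQ files, one per mate
        ["fasterq-dump", "--split-files", "-O", outdir, accession],
    ]


for command in sra_download_commands("SRR3126346"):
    print(" ".join(command))
```

Both tools ship with the SRA Toolkit; the commands can be run with `subprocess.run(command, check=True)` once it is installed.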
On the way to the garden of bioinformatics.
A bioinformatics wiki for the course BI462.