forked from statOmics/SGA
-
Notifications
You must be signed in to change notification settings - Fork 0
/
dtuTutorial.Rmd
50 lines (35 loc) · 2.68 KB
/
dtuTutorial.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
---
title: Differential transcript usage analysis
author: "Lieven Clement"
date: "[statOmics](https://statomics.github.io), Ghent University"
output:
html_document:
theme: flatly
code_download: false
toc: false
toc_float: false
number_sections: false
bibliography: msqrob2.bib
---
<a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a>
### satuRn vignette
We developed satuRn for fast differential transcript usage analysis on bulk and single cell transcriptomics data.
Read and run the vignette of satuRn to learn howto work with satuRn
![satuRn vignette](https://www.bioconductor.org/packages/release/bioc/vignettes/satuRn/inst/doc/Vignette.html)
### Airway example
The data used in this workflow comes from an RNA-seq experiment where airway smooth muscle cells were treated with dexamethasone, a synthetic glucocorticoid steroid with anti-inflammatory effects (Himes et al. 2014). Glucocorticoids are used, for example, by people with asthma to reduce inflammation of the airways. In the experiment, four human airway smooth muscle cell lines were treated with 1 micromolar dexamethasone for 18 hours. For each of the four cell lines, we have a treated and an untreated sample. For more description of the experiment see the article, PubMed entry [PMID: 24926665](https://pubmed.ncbi.nlm.nih.gov/24926665/), and for raw data see the GEO entry [GSE52778](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52778).
We will conduct a DTU analysis by starting from quantification using the fast alignment-free transcript mapper Salmon.
This can be done as follows:
- Building index with salmon
```
salmon index --gencode -t gencode.v32.transcripts.fa -i gencode.v32_salmon_index
```
- Mapping one sample with salmon:
```
salmon quant -i gencode.v32_salmon_index -l A --gcBias -1 SRR1039508_subset_1.fastq -2 SRR1039508_subset_2.fastq --validateMappings -o quant/SRR1039508_subset_quant
```
More details can be found in [Intro to salmon](https://combine-lab.github.io/salmon/getting_started/).
The mapped output from Salmon can be found at: [data](https://github.com/statOmics/SGA/archive/airwaySeqData.zip).
1. Import the transcript level counts with the tximeta package: see [tximeta](https://bioconductor.org/packages/release/bioc/vignettes/tximeta/inst/doc/tximeta.html#Running_tximeta) vignette on bioconductor
2. Run a DTU analysis on the transcript level counts with satuRn
Tip: I had to use the argument `importer=read.delim` in the tximeta function because the default function to read the files returned an error on my laptop.