forked from emdann/cellatacUtils
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
66 lines (53 loc) · 2.1 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
---
title: "cellatacUtils"
output: github_document
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
cache = FALSE,
message = FALSE,
warning = FALSE
)
```
R package for handling [cellatac pipeline](https://github.com/cellgeni/cellatac) output
## Installation
```{r, eval=FALSE, echo=TRUE}
devtools::install_github("emdann/cellatacUtils")
```
## Example usage
Load pipeline output as Seurat object
```{r, eval=FALSE, echo=TRUE}
library(cellatacUtils)
results.dir <- "path/to/results"
cellatac.seu <- load_cellatac_seurat(results.dir)
```
Calculate QC metrics for peaks
```{r, eval=FALSE, echo=TRUE}
## Load Ensembl based annotation from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("EnsDb.Hsapiens.v86")
library(EnsDb.Hsapiens.v86)
## Load Signac for ENCODE blacklist annotation
library(Signac)
cellatac.seu <- compute_peakQC(cellatac.seu,
cluster_col = "seurat_clusters",
EnsDb.genome = EnsDb.Hsapiens.v86,
blacklist_gr = blacklist_hg38)
```
Calculated peak QC metrics:
* tot_count: total count of fragments overlapping peak
* tot_cells: total number of cells in which peak is accessible
* frac_cells: fraction of cells in which peak is accessible
* max_frac: maximum percent of cells with accessible peak in any given cluster (to filter out peaks that are not observed to be accessible in at least n% cells of cells in at least one cluster)
* peak_width: width of peak in bps
* exon: is peak overlapping exon?
* intron: is peak overlapping intron?
* promoter: is peak overlapping promoter?
* annotation: annotation of overlapping genomic region (promoter, exon, intron, intergenic)
* gene_name: name of overlapping gene (downstream gene for promoter region, NA for intergenic peaks)
* gene_id: ENSEMBL id of overlapping gene (downstream gene for promoter region, NA for intergenic peaks)
* tss_distance: distance to closest Transcription Start Site in bps
* ENCODE_blacklist: is peak overlapping ENCODE blacklist region?