R package for automated pathway annotation in single-cell RNA-seq
You can install the package using devtools::install_github:
devtools::install_github("mariafiruleva/pwannot")
To get started, you need a scaled and clustered scRNA-seq dataset and a set of pathways.
To demonstrate the package, we will use a dataset of 996 peripheral blood mononuclear cells (PBMCs) made publicly available by 10X Genomics and processed with the standard Seurat v3.0 workflow.
library(pwannot)
pbmc <- readRDS(file.path(find.package('pwannot'),'data','pbmc_after_seurat.rds'))
genes_list <- readRDS(file.path(find.package('pwannot'),'data','KEGG_pathways.rds'))
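Before starting the time-consuming step, it can help to check how large each pathway is. The sketch below assumes the pathway set is a named list of character vectors of gene symbols, which is a common layout for gene set collections in R; the actual structure of KEGG_pathways.rds is not shown in this README. The list `toy_pathways`, its gene sets, and the bounds `min_size` and `max_size` are hypothetical and used only for illustration.

```r
# Toy pathway collection: a named list of character vectors
# (hypothetical gene sets, not taken from the package data).
toy_pathways <- list(
  PATHWAY_A = c("CD3D", "CD3E", "CD2"),
  PATHWAY_B = c("MS4A1", "CD79A", "CD79B", "CD19"),
  PATHWAY_C = c("LYZ")
)

# Number of genes in each pathway.
pathway_sizes <- lengths(toy_pathways)

# Names of pathways whose size lies within [min_size, max_size].
min_size <- 2
max_size <- 3
kept <- names(pathway_sizes)[pathway_sizes >= min_size & pathway_sizes <= max_size]

print(pathway_sizes)
print(kept)
```

If the real `genes_list` follows the same layout, `lengths(genes_list)` gives the same kind of overview for the minimum and maximum pathway sizes passed to the annotation step.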
Next, run the analysis (this step is time-consuming). In the call below, 20 is the minimum pathway size, 500 is the maximum pathway size, 0.05 is the significance level that defines the number of success states in the hypergeometric distribution, and 10000 is the number of random samples to generate (10000 is recommended).
annotation_results <- pathways_annotation(pbmc, genes_list, 20, 500, 0.05, 10000)
The result is a data frame with clusters as columns, pathways as rows, and adjusted p-values as values.
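A table of this shape can be queried with base R. The following is a minimal sketch on a toy stand-in, not on real output: `toy_results`, its cluster and pathway names, and all p-values are made up for illustration, and the exact column and row naming of `annotation_results` is an assumption.

```r
# Toy stand-in for annotation_results: clusters as columns, pathways as rows,
# adjusted p-values as values (all numbers are invented for illustration).
toy_results <- data.frame(
  cluster_0 = c(0.001, 0.800, 0.030),
  cluster_1 = c(0.600, 0.004, 0.900),
  row.names = c("PATHWAY_A", "PATHWAY_B", "PATHWAY_C")
)

alpha <- 0.05

# For each cluster (column), list the pathways with adjusted p-value below alpha.
significant <- lapply(toy_results, function(p) rownames(toy_results)[p < alpha])

# For each cluster, the pathway with the smallest adjusted p-value.
top_pathway <- vapply(
  toy_results,
  function(p) rownames(toy_results)[which.min(p)],
  character(1)
)

print(significant)
print(top_pathway)
```

The same two expressions should work on `annotation_results` if it is a plain data frame with pathway names as row names.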
To visualize the expression distribution of a target pathway (e.g., "LEE_DIFFERENTIATING_T_LYMPHOCYTE"), use the plot_target_pw function:
plot_target_pw(pbmc, genes_list["LEE_DIFFERENTIATING_T_LYMPHOCYTE"], "tsne") +
  ggplot2::ggtitle("LEE_DIFFERENTIATING_T_LYMPHOCYTE")