This R package lets users visualize single-cell data from Seurat objects or from output files generated by Seurat. It is currently under active development.
The plot1cell R package can be easily installed from GitHub using devtools. Please make sure you have installed the Seurat 4.0, circlize and ComplexHeatmap packages.
devtools::install_github("TheHumphreysLab/plot1cell")
## or the development version, devtools::install_github("HaojiaWu/plot1cell")
## You might need to install the dependencies below if they are not available in your R library.
bioc.packages <- c("biomaRt","GenomeInfoDb","EnsDb.Hsapiens.v86","GEOquery","simplifyEnrichment","ComplexHeatmap")
BiocManager::install(bioc.packages)
dev.packages <- c("chris-mcginnis-ucsf/DoubletFinder","Novartis/hdf5r","mojaveazure/loomR")
devtools::install_github(dev.packages)
## If you can't get the hdf5r package installed, please see the fix here:
## https://github.com/hhoeflin/hdf5r/issues/94
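Before installing, it can help to see which of the required packages are already present. The following is a minimal sketch in base R; `missing_packages` is a hypothetical helper written for this illustration and is not part of plot1cell.

```r
# Hypothetical helper (not part of plot1cell): return the packages in `pkgs`
# that are not found in any library on the current library path.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),  # "" when the package is absent
    logical(1)
  )
  unname(pkgs[!installed])
}

# Packages named in the installation note above.
required <- c("Seurat", "circlize", "ComplexHeatmap")
to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

This check only looks for the package directories and does not load them, so it runs quickly even when Seurat is installed.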
We provide some example code to help generate figures from a user-provided Seurat object. The Seurat object passed to plot1cell should be a final object with complete clustering and cell type annotation. If a Seurat object is not available, we suggest using the demo data from the Satija lab (https://satijalab.org/seurat/articles/integration_introduction.html). To demonstrate the plotting functions in plot1cell, we re-created a Seurat object from our recent paper (Kirita et al., PNAS 2020) by integrating the count matrices we uploaded to GEO (GSE139107).
library(plot1cell)
iri.integrated <- Install.example()
# Please note that this Seurat object is for demonstration purposes only and
# is not exactly the same as the one we published in PNAS.
# It takes about 2 hours to run on a Linux server with 500GB RAM and 32 CPU cores.
# You can skip this step and use your own Seurat object instead.
This circlize plot was inspired by the data visualization in a published paper from the Linnarsson lab (Figure 1, https://www.nature.com/articles/s41586-021-03775-x).
### Check the metadata columns available in your Seurat object
colnames(iri.integrated@meta.data)
### Prepare data for plotting
circ_data <- prepare_circlize_data(iri.integrated, scale = 0.8 )
set.seed(1234)
cluster_colors<-rand_color(length(levels(iri.integrated)))
group_colors<-rand_color(length(names(table(iri.integrated$Group))))
rep_colors<-rand_color(length(names(table(iri.integrated$orig.ident))))
### Plot and save figures
png(filename = 'circlize_plot.png', width = 6, height = 6, units = 'in', res = 300)
plot_circlize(circ_data, do.label = TRUE, pt.size = 0.01, col.use = cluster_colors, bg.color = 'white', kde2d.n = 200, repel = TRUE, label.cex = 0.6)
add_track(circ_data, group = "Group", colors = group_colors, track_num = 2) ## "Group" can be changed to any column in the metadata of your Seurat object
add_track(circ_data, group = "orig.ident", colors = rep_colors, track_num = 3) ## "orig.ident" can be changed to any column in the metadata of your Seurat object
dev.off()
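The example above draws colors with rand_color, so the result depends on the random seed. If reproducible colors are preferred, one alternative is to generate one color per level from a fixed palette. The sketch below uses only base R; `level_palette` is a hypothetical helper, and it assumes the plotting functions take an unnamed color vector with one entry per level, as in the calls above.

```r
# Hypothetical helper (not part of plot1cell): one color per level of `values`,
# taken from a fixed HCL palette so the result does not depend on a seed.
level_palette <- function(values, palette = "Dark 3") {
  # Same level order as names(table(values)): factor levels, or sorted unique values.
  lv <- if (is.factor(values)) levels(values) else sort(unique(as.character(values)))
  grDevices::hcl.colors(length(lv), palette = palette)
}

# Toy stand-in for a metadata column such as iri.integrated$Group.
toy_group <- c("Control", "4hours", "Control", "6weeks")
group_colors <- level_palette(toy_group)
print(group_colors)
```

With a real object, the same helper could be called on a metadata column, for example `level_palette(iri.integrated$Group)`, and the result passed wherever a color vector is expected.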
Here is an example of using plot1cell to show the expression of one gene across different cell types in different groups.
png(filename = 'dotplot_single.png', width = 4, height = 6,units = 'in', res = 100)
complex_dotplot_single(seu_obj = iri.integrated, feature = "Havcr1",groups = "Group")
dev.off()
If the group factor can be classified by another factor, complex_dotplot_single can also split the group factor by that second factor. Here is an example.
iri.integrated@meta.data$Phase<-plyr::mapvalues(iri.integrated@meta.data$Group, from = levels(iri.integrated@meta.data$Group), to = c("Healthy",rep("Injury",3), rep("Recovery",2)))
iri.integrated@meta.data$Phase<-as.character(iri.integrated@meta.data$Phase)
png(filename = 'dotplot_single_split.png', width = 4, height = 6,units = 'in', res = 100)
complex_dotplot_single(iri.integrated, feature = "Havcr1",groups = "Group",splitby = "Phase")
dev.off()
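The mapvalues call above recodes each Group level into a coarser Phase label by position. The same recoding can be sketched in base R with a named lookup vector, which makes the level-to-phase pairing explicit. The level order used here is an assumption taken from the Group names that appear elsewhere in this document; check `levels(iri.integrated$Group)` on your own object.

```r
# Assumed order of the Group levels (verify against your own object).
group_levels <- c("Control", "4hours", "12hours", "2days", "14days", "6weeks")

# Named lookup: each Group level points to its Phase label.
phase_lookup <- setNames(
  c("Healthy", rep("Injury", 3), rep("Recovery", 2)),
  group_levels
)

# Toy stand-in for iri.integrated@meta.data$Group.
toy_group <- factor(c("Control", "2days", "6weeks", "4hours"), levels = group_levels)

# Index the lookup by the character value of each cell's group.
toy_phase <- unname(phase_lookup[as.character(toy_group)])
print(data.frame(Group = toy_group, Phase = toy_phase))
```

Because the lookup is keyed by name rather than by position, a typo or an unexpected level shows up as an NA instead of silently shifting the labels.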
To visualize the same gene across multiple group factors, simply add more group factor IDs to the groups argument.
png(filename = 'dotplot_more_groups.png', width = 8, height = 6,units = 'in', res = 100)
complex_dotplot_single(seu_obj = iri.integrated, feature = "Havcr1",groups= c("Group","Replicates"))
dev.off()
Each group factor can be further split by its own factor if the splitby argument is provided. Note that in this case, the order of the group factors needs to match the order of the splitby factors.
iri.integrated@meta.data$ReplicateID<-plyr::mapvalues(iri.integrated@meta.data$Replicates, from = names(table((iri.integrated@meta.data$Replicates))), to = c(rep("Rep1",3),rep("Rep2",3), rep("Rep3",1)))
iri.integrated@meta.data$ReplicateID<-as.character(iri.integrated@meta.data$ReplicateID)
png(filename = 'dotplot_more_groups_split.png', width = 9, height = 6,units = 'in', res = 200)
complex_dotplot_single(seu_obj = iri.integrated, feature = "Havcr1",groups= c("Group","Replicates"), splitby = c("Phase","ReplicateID"))
dev.off()
### In this example, "Phase" is a splitby factor for "Group" and "ReplicateID" is a splitby factor for "Replicates".
Note that the Replicates group here is for demonstration purposes only. It is not a meaningful group ID in our snRNA-seq dataset.
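Since each splitby factor is paired by position with a group factor, a quick sanity check on the metadata can catch a mismatched order before plotting. The sketch below is base R on a toy data frame; `is_nested` is a hypothetical helper, the toy replicate names are made up, and the check assumes that a valid pairing means every group value belongs to exactly one splitby value.

```r
# Hypothetical helper (not part of plot1cell): TRUE when every value in column
# `group` is associated with exactly one value in column `split`.
is_nested <- function(meta, group, split) {
  pairs <- unique(meta[, c(group, split)])
  !any(duplicated(pairs[[group]]))
}

# Toy stand-in for iri.integrated@meta.data.
toy_meta <- data.frame(
  Group       = c("Control", "4hours", "12hours", "14days", "Control"),
  Phase       = c("Healthy", "Injury", "Injury", "Recovery", "Healthy"),
  Replicates  = c("Ctrl_1", "4h_1", "12h_1", "14d_1", "Ctrl_2"),
  ReplicateID = c("Rep1", "Rep1", "Rep1", "Rep1", "Rep2"),
  stringsAsFactors = FALSE
)

groups  <- c("Group", "Replicates")
splitby <- c("Phase", "ReplicateID")
stopifnot(length(groups) == length(splitby))

# Check each (group, splitby) pair in the order they would be passed.
ok <- mapply(function(g, s) is_nested(toy_meta, g, s), groups, splitby)
print(ok)
```

If the two vectors were given in mismatched order, at least one pair would usually fail this check, for example `is_nested(toy_meta, "Phase", "Group")` is FALSE because the Injury phase covers more than one group.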
To visualize multiple genes in dot plot format, complex_dotplot_multiple should be used.
png(filename = 'dotplot_multiple.png', width = 10, height = 4,units = 'in', res = 300)
complex_dotplot_multiple(seu_obj = iri.integrated, features = c("Slc34a1","Slc7a13","Havcr1","Krt20","Vcam1"),group = "Group", celltypes = c("PTS1" , "PTS2" , "PTS3" , "NewPT1" , "NewPT2"))
dev.off()
png(filename = 'vlnplot_single.png', width = 4, height = 6,units = 'in', res = 100)
complex_vlnplot_single(iri.integrated, feature = "Havcr1", groups = "Group",celltypes = c("PTS1" , "PTS2" , "PTS3" , "NewPT1" , "NewPT2"))
dev.off()
Similar to complex_dotplot_single, the complex_vlnplot_single function also allows splitting the group factor by another factor with the splitby argument.
png(filename = 'vlnplot_single_split.png', width = 4, height = 6,units = 'in', res = 100)
complex_vlnplot_single(iri.integrated, feature = "Havcr1", groups = "Group",celltypes = c("PTS1" , "PTS2" , "PTS3" , "NewPT1" , "NewPT2"), splitby = "Phase")
dev.off()
png(filename = 'vlnplot_multiple.png', width = 6, height = 6,units = 'in', res = 100)
complex_vlnplot_single(iri.integrated, feature = "Havcr1", groups = c("Group","Replicates"),celltypes = c("PTS1" , "PTS2" , "PTS3" , "NewPT1" , "NewPT2"), font.size = 10)
dev.off()
Similar to complex_dotplot_single, each group factor can also be split by another factor in the violin plot. For example:
png(filename = 'vlnplot_multiple_split.png', width = 7, height = 5,units = 'in', res = 200)
complex_vlnplot_single(iri.integrated, feature = "Havcr1", groups = c("Group","Replicates"),
celltypes = c("PTS1" , "PTS2" , "PTS3" , "NewPT1" , "NewPT2"),
font.size = 10, splitby = c("Phase","ReplicateID"), pt.size=0.05)
dev.off()
png(filename = 'vlnplot_multiple_genes.png', width = 6, height = 6,units = 'in', res = 300)
complex_vlnplot_multiple(iri.integrated, features = c("Havcr1", "Slc34a1", "Vcam1", "Krt20", "Slc7a13", "Slc5a12"), celltypes = c("PTS1", "PTS2", "PTS3", "NewPT1", "NewPT2"), group = "Group", add.dot = TRUE, pt.size = 0.01, alpha = 0.01, font.size = 10)
dev.off()
For multiple genes across multiple group factors, the violin plot would look too messy, so this scenario is not included in plot1cell.
png(filename = 'data/geneplot_umap.png', width = 8, height = 6,units = 'in', res = 100)
complex_featureplot(iri.integrated, features = c("Havcr1", "Slc34a1", "Vcam1", "Krt20", "Slc7a13"), group = "Group", select = c("Control", "12hours", "6weeks"), order = FALSE)
dev.off()
plot1cell can directly identify condition-specific genes in a selected cell type and plot those genes using ComplexHeatmap. An example is shown below:
iri.integrated$Group2<-plyr::mapvalues(iri.integrated$Group, from = c("Control", "4hours", "12hours", "2days", "14days" , "6weeks" ),
to = c("Ctrl","Hr4","Hr12","Day2", "Day14","Wk6"))
iri.integrated$Group2<-factor(iri.integrated$Group2, levels = c("Ctrl","Hr4","Hr12","Day2", "Day14","Wk6"))
png(filename = 'heatmap_group.png', width = 4, height = 8,units = 'in', res = 100)
complex_heatmap_unique(seu_obj = iri.integrated, celltype = "NewPT2", group = "Group2",gene_highlight = c("Slc22a28","Vcam1","Krt20","Havcr1"))
dev.off()
png(filename = 'upset_plot.png', width = 8, height = 4,units = 'in', res = 300)
complex_upset_plot(iri.integrated, celltype = "NewPT2", group = "Group", min_size = 10, logfc=0.5)
dev.off()
png(filename = 'cell_fraction.png', width = 8, height = 4,units = 'in', res = 300)
plot_cell_fraction(iri.integrated, celltypes = c("PTS1", "PTS2", "PTS3", "NewPT1", "NewPT2"), groupby = "Group", show_replicate = TRUE, rep_colname = "orig.ident")
dev.off()
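To see the kind of numbers behind a cell fraction plot, the proportions can be tabulated by hand from the metadata. This is a rough base R sketch on toy data and is not the implementation used by plot_cell_fraction; in particular, normalizing within each group (rather than within each replicate) is an assumption made for this illustration.

```r
# Toy stand-in for the cell type and group columns of a Seurat metadata table.
toy_meta <- data.frame(
  celltype = c("PTS1", "PTS1", "PTS2", "PTS1", "PTS2", "PTS2"),
  Group    = c("Control", "Control", "Control", "2days", "2days", "2days")
)

# Cells per cell type (rows) and group (columns).
counts <- table(toy_meta$celltype, toy_meta$Group)

# Divide each column by its total so the fractions sum to 1 within a group.
fractions <- prop.table(counts, margin = 2)
print(round(fractions, 2))
```

Comparing such a table against the figure is a quick way to confirm that the group and cell type columns contain the values you expect.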
There are other functions for plotting/data processing in plot1cell.
help(package = plot1cell)
Many more functions will be added as the package develops. For questions, please raise an issue on this GitHub page or contact TheHumphreysLab.
This package uses many methods from Seurat (https://github.com/satijalab/seurat) to process the data for plotting. The circlize and heatmap plots were generated by the circlize (https://github.com/jokergoo/circlize) and ComplexHeatmap (https://github.com/jokergoo/ComplexHeatmap) packages. The UpSet plot was generated by the ComplexUpset package (https://github.com/krassowski/complex-upset). Most of the other graphs were generated using ggplot2 (https://github.com/tidyverse/ggplot2). The package benefits from the following dependencies.
Seurat,
plotly,
circlize,
dplyr,
ggplot2,
ggh4x,
MASS,
scales,
progress,
RColorBrewer,
grid,
grDevices,
biomaRt,
reshape2,
ggbeeswarm,
purrr,
ComplexUpset,
matrixStats,
DoubletFinder,
methods,
data.table,
Matrix,
hdf5r,
loomR,
GenomeInfoDb,
EnsDb.Hsapiens.v86,
cowplot,
rlang,
GEOquery,
simplifyEnrichment,
wordcloud,
ComplexHeatmap
Please consider citing our paper if you find plot1cell useful.
https://www.cell.com/cell-metabolism/fulltext/S1550-4131(22)00192-9
Cell Metab. 2022 Jul 5;34(7):1064-1078.e6.
Wu H, Gonzalez Villalobos R, Yao X, Reilly D, Chen T, Rankin M, Myshkin E, Breyer MD, Humphreys BD.
Mapping the single-cell transcriptomic response of murine diabetic kidney disease to therapies.