Single cell biological information analysis process
The types, states, and interactions of cells in human tissues vary greatly. Single-cell transcriptome sequencing (scRNA-seq) is a new technique for high-throughput sequencing analysis of the transcriptome in single cell. Single-cell transcriptome sequencing can complement conventional transcriptome sequencing (mRNA-seq: Bulk RNA sequencing, comparing the average expression values of genes in all cells of the cell population), revealing the expression situation of all genes in the all-cause group in single cell, including the identified tissue cell types, reflecting the cell heterogeneity between different samples and the tissue microenvironment, so that we can better understand the real state and correlation of each cell in a Bulk tissue. Presents a real and comprehensive cellular world. Currently, single-cell transcriptome sequencing is mostly used in complex multicellular systems such as tumor, developmental, neural, and immune microenvironments.
The purpose of this tool is connect the analysis of single-cell data into a complete process to accelerate the speed of analysis and contribute to the progress in this field.
Tutorials link (document and video): https://github.com/OpenGene/scrnapip/tree/main/tutorials.
docker pull zhangjing12/scrnapip
#Human reference (GRCh38) dataset required for Cell Ranger.
wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz
#Mouse reference dataset required for Cell Ranger.
wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-mm10-2020-A.tar.gz
docker run -d -p 1921:8787 -p 1882:3838 -e PASSWORD=yourpassword -e USERID=youruserid -e GROUPID=yourgroupid -v /yourdatapath:/dockerpath zhangjing12/scrnapip
The image is created based on Rocker (https://rocker-project.org/images/versioned/rstudio.html). You can use the above command to access rstudio through port 8787, which is more convenient for users to use the process. The userid and groupid can be queried through the id command. For the port number, please confirm whether the corresponding port is open.
All input files and parameters are set in this configuration file. The main settings that need to be changed are the followingοΌ
#####[fastp_cellrange]: RAW data path. The pair end data must be split into two files
S1.R1=["/usr/data/SAMPLE1_S1_L001_R1_001.fastq.gz"]
S1.R2=["/usr/data/SAMPLE1_S1_L001_R2_001.fastq.gz"]
#If a sample has more than one raw data, you can merge them before or add path split by ",":
S1.R1=["/usr/data/SAMPLE1.1_S1_L001_R1_001.fastq.gz","/usr/data/SAMPLE1.2_S1_L001_R1_001.fastq.gz"]
S1.R2=["/usr/data/SAMPLE1.1_S1_L001_R2_001.fastq.gz","/usr/data/SAMPLE1.2_S1_L001_R2_001.fastq.gz"]
#####[indata]: cellranger matrix file path
S1="/usr/workout/02.cellranger/S1/outs/filtered_feature_bc_matrix"
#####[outpath]: output path
outpath="/usr/workout"
#####[tempdata]rds file output path
tempdata="workout"
#####[run]: The analysis that needs to be done should set to true, for example:
fastp=true #run fastp
#####[fastp]: Configure the fastp path and parameters
fastppath="/home/bin/fastp"
longr=26 #R1 length after trim
ncode=5 #The maximum number of N-bases
#####[cellrangle]: Configure the cellranger path and parameters
dockerusr="1025:1025" #user id
dir="/user/name" #The folder which docker mount
ref="/user/refdata-gex-GRCh38-2020-A" #Reference genome path
cellrangerpath="/home/bin/cellranger-7.1.0/cellranger" #software path of cellranger
expectcell=10000 #expect cell number
localcores=32 #Number of threads
localmem=64 #Memory size
include_introns="false" #Whether to analyze introns
#####[step1]:
filetype="10x" #The format of the input file,could be "10x" or "csv"
csv_sep="" #Separator of the csv file
nFeature_RNA=[200,5000] #The cells were filtered by feature, keeping cells that feature between 200 and 5000
percent_mt=[0,10] #The cells were filtered by percent of mitochondria, keeping cells that percent of mitochondria less than 10%
mttype="MT" #Mitochondrial type, MT for humans and mt for mice
#####[step2]:
kfilter=200 #Minimum number of cells per sample
normethod="SCT" #The merge method, which uses SCT by default, can also use vst to simply group samples together
nFeature=3000 #Genes for subsequent analysis
#####[step3]:
recluster=true #Whether to perform batch correction
mode="harmony" #Method of batch correction
heatmapnumber=9 #Number of heatmaps drawn for pca
elbowdims=100 #The number of PCS shown in the elbow diagram
dims=30 #Select the top 30 PCs for dimensionality reduction
reduction="umap" #tSNE or UMAP
clustercell=true #Whether you need to cluster cells
resolution=0.6 #Set the resolution when clustering
algorithm=1 #Cluster modular optimization algorithm (1 = original Louvain algorithm; 2 = Louvain algorithm with multilevel refinement; 3 = SLM algorithm)
singler="/home/bin/singleRdata/singleRdata/test.rds" #singleR database position
#####[step4]:
clustermarkers=true #Whether marker genes of each cluster need to be found
min_pct=0.25 #The minimum proportion of marker gene in the number of cells is 0.25 by default
findmarkers_testuse="wilcox" #The method of finding marker gene
difcluster.test.a=[0,1] #Find Differential gene.If you want to find differences between samples,change cluster to ident
difcluster.test.b=[5,6] #Test indicates the group name, a for case and b for control
difcluster.test.testuse="wilcox" #Inspection method
ClusterProfiler=["true","Rscript","/home/bin/clusterProfiler.R","-a true -s org.Hs.eg.db,hsa,human -g 6 -t SYMBOL -d KEGG,BioCyc,PID,PANTHER,BIOCARTA -C 0.05"] #Enrichment analysis of difference analysis results. -a: Whether to use all background genes; -s: species; -g: The column of the gene in the file; -t: gene name type(SYMBOL,ENTREZID); -d: database name
#####[step5]:
meanexpression=0.5 #Select the appropriate gene to mark the state, intercept the condition, default is 0.5
genenum=50 #Number of gene in differential analysis heat map
numclusters=4 #The number of clusters in a cluster
pointid=1 #The branching points used in BEAM analysis
BEAMnumclusters=4 #Number of clusters in heat map clustering
BEAMgn=50 #BEAM analyzes heat map gene count
BEAMgenelist=["S100A12","ALOX5AP","PAD14","NRG1","MCEMP1","THBS1"] #BEAM analyzes specific gene names
#####[step6]:
circosbin="/home/bin/get_exp.r" #Extraction expression
circos_perl_bin="/home/bin/circos_plot.pl" #Plot circos
#####[step7]:
copykat_bin="/home/bin/copykat_v4.r" #Identify tumor cells
#####[step8]:
cytoTRACE_bin="/home/bin/cytotrace_230508.R" #Developmental potential analysis
#####[step9]:
genomicinstably_bin="/home/bin/genomicinstably.R" #Genomic instability analysist
org="human" #species
#####[step11]:
ClusterProfiler=["true","Rscript","/home/bin/clusterProfiler.R","-a true -s org.Hs.eg.db,hsa,human -g 1 -t SYMBOL -d KEGG,BioCyc,PID,PANTHER,BIOCARTA -C 0.05"] #Enrichment analysis of marker gene
This R script is used for data filtering and comparison quantitative analysis, and relevant parameters are set in the configuration file config_Example.ini.
Rscript /home/bin/fastp_cellranger.r -i config_Example.ini
This R script is used for all advanced analyses, and relevant parameters are set in the configuration file config_Example.ini.
Rscript /home/bin/singlecell.r -i config_Example.ini
Sequence statistics and reads filtering result files were performed on the original data.
ββ 01.Fastp/
βββ <SampleName>/ <- config for report
βββ <SampleName>_fastp.html <- Report generated by fastp
βββ <SampleName>_fastp.json <- Statistical information generated by fastp
βββ <SampleName>_S1_L001_R1_001.fastq.gz <- R1 clean data
βββ <SampleName>_S1_L001_R2_001.fastq.gz <- R2 clean data
Quality control summary statistics by fastp.
Cellranger results after mapping and quantitative.
ββ 02.Cellranger/
βββ <SampleName>/
βββ outs/ <- Cellranger analysis results
βββ analysis/ <- Cluster by cellranger
βββ raw_feature_bc_matrix/ <- Unfiltered feature-barcode matrices MEX (usually not used)
βββ raw_feature_bc_matrix.h5 <- Unfiltered feature-barcode matrices HDF5
βββ filtered_feature_bc_matrix/ <- Filtered feature-barcode matrices MEX
βββ filtered_feature_bc_matrix.h5 <- Filtered feature-barcode matrices HDF5
βββ molecule_info.h5 <- Per-molecule read information
βββ metrics_summary.xls <- Run summary information
βββ possorted_genome_bam.bam <- Bam file of single cell alignment
βββ possorted_genome_bam.bam.bai <- Bam index
βββ web_summary.html <- Run summary HTML
Cellranger web report.
Filter low-quality cells according to mitochondrial proportion and gene number.
ββ 03.CellFilter/
βββ <SampleName>/
β βββ <SampleName>_countVfeature.png/pdf <- Scatter plot of nCounts and nFeature
β βββ <SampleName>_countVmt.png/pdf <- Scatter plot of nCounts and percent.mito
β βββ <SampleName>_libraryVmt.png/pdf <- Scatter plot of nCounts and percent.mito,nCounts as the color of the dot
β βββ <SampleName>_voilin.png/pdf <- Violin plot with dots of nCounts,nFeature and percent.mito
βββ summary.txt/ <- Summary statistical information of cell filtration
Scatter plot of feature and UMIs for all cells.Filter the cells outside the two red lines.
Scatter plot of percent mitochondria and UMIs for all cells.Filter the cells above the red lines.
Dimensionality reduction and clustering to filtered cells.Annotate cells with singleR.
ββ 04.PCA_UMAP/
βββ pca/
β βββ bowplot.png/pdf <- Standard Deviation of top 50 PC
β βββ pca.png/pdf <- Sample distribution of pc1 and pc2
β βββ pcaheatmap1.png/pdf <- Heatmap of genes in a pc
βββ umap/
βββ barplot.png/pdf <- The proportion of each sample in each cluster
βββ celltype.png/pdf <- The cell type identified by singleR
βββ cluster0.png/pdf <- Pie charts of the percentage of cell types in each cluster
βββ plotall_ident.png/pdf <- Umap plot of samle
βββ plotby_cluster.png/pdf <- Umap plot of cluster
βββ plotby_ident.png/pdf <- Umap plot splited by sample
βββ plotby_nCount.png/pdf <- Umap plot of nCount
βββ singleR_celltype.xls <- Statistical table of cell types determined by singleR
βββ summary.xls <- The number of cells per cluster in each sample
The single cell expression matrix was reduced and clustering to obtain the final umap plot. The dots represent cells, and clusters split by colors.
The distribution of samples in umap is shown. The dots represent cells, and samples split by colors.
UMAP plot for cell type annotation.
Find marker genes and display result by violin and feature umap.
ββ 05.MarkerGene/
βββ cluster* /
β βββ cluster_genelist.xls <- Marker genes of a cluster
β βββ cluster_genereduction.png/pdf <- Feature plot of top 10 marker genes
β βββ cluster_genevlnplot.png/pdf <- Violin plot of top 10 marker genes
β βββ cluster_padj0.05_logFC0.5genelist.xls <- Filtered marker genes by padj<0.05 and logFC>0.5
βββ custer/
β βββ Celltype
β βββ *_reduction.png/pdf <- Feature plot of the genes entered in the config file
β βββ *_vlnplot.png/pdf <- Violin plot of the genes entered in the config file
βββ cluster_top10geneheatmap.png/pdf <- Heat maps of the top 10 marker genes in all clusters
βββ clusterall_adj0.05_logFC0.5genelist.xls <- All filtered marker genes of all clusters
βββ clusterall_top10genelist.xls <- Top 10 marker genes of all clusters
Umap plot of top10 marker gene for each cluster.
Violin plot of top 10 marker gene for each cluster.
Heatmap of top 10 marker gene for each cluster.
To perform pseudotime analysis of the cells, we used monocle2 to select high discrete gene and draw trajectory diagram. By default, the first branch point is used for beam analysis. If you want to analyze other branch points, setting in the configuration file.
ββ 06.Pseudotime/
βββ pseudotime /
β βββ geneplot.png/pdf <- Scatter plot for selecting high discrete gene
β βββ pseudotime_byclusters.png/pdf <- Cell trajectory plot, colors represent different clusters
β βββ pseudotime_byPseudotime.png/pdf <- Cell trajectory plot, colors represent the level of expression
β βββ pseudotime_bystate.png/pdf <- Cell trajectory plot, colors represent different state
β βββ pseudotime_splitbyclusters.png/pdf <- Cell trajectory plot split by clusters
β βββ pseudotime_splitbyorig.ident.png/pdf <- Cell trajectory plot split by samples
β βββ cell_state_clusters.xls <- Cell state statistical table
βββ BEAM /
β βββ custergene.png/pdf <- Gene expression in different branches of cells
β βββ top10gene.png/pdf <- Top 10 gene expression in different branches of cells
β βββ topgeneheatmap.pdf <- Heatmap for genes in different branches
β βββ diffgenelist.xls <- p-value and q-value of genes
βββ diffgenes /
β βββ topgeneheatmap.pdf <- Heatmap for genes in different branches
β βββ cell_diffgene.xls <- p-value and q-value of genes
βββ QC.png/pdf
Cell trajectory plot drawed by monocle.
Cerebro(cell report browser), which allows users to interactively visualize various parts of single cell transcriptomics data without requiring bioinformatic expertise. Cerebro can draw various graphs to display single cell results like umap/tsne for 2D/3D, bar plot, violin plot, cluster tree, etc.
Copykat is used to perform copy number analysis and predict tumor cells. The umap plot is used to display the results.
Copykat (Copynumber Karyotyping of Tumors) is a computational tool using integrative Bayesian approaches to identify genome-wide aneuploidy at 5MB resolution in single cells to separate tumor cells from normal cells, and tumor subclones using high-throughput sc-RNAseq data.
ββ 08.Copykat/
βββ <SampleName1> /
β βββ <SampleName1>_copykat_clustering_results.rds <- RDS file containing copykat results
β βββ <SampleName1>_copykat_CNA_raw_results_gene_by_cell.txt <- Expression matrix sorted by absolute position of genes
β βββ <SampleName1>_copykat_CNA_results.txt <- Copy number variation of cells at each chromosomal location
β βββ <SampleName1>_copykat_heatmap.jpeg <- Heatmap of tumor cell prediction results
β βββ <SampleName1>_copykat.pdf <- Heatmap of tumor cell prediction results
β βββ <SampleName1>_copykat_prediction.txt <- Tumor cell predicted results of copycat
β βββ <SampleName1>_copykat_with_genes_heatmap.pdf <- Heatmap of tumor cell prediction results with gene names
β βββ <SampleName1>_rawdata.txt <- Raw umi expression matrix
β βββ <SampleName1>.tumor_subtype.pdf <- Heatmap of prediction results of tumor subclonal types
β βββ <SampleName1>.tumor_subtype.txt <- Tumor subclonal type
βββ <SampleName2> /
βββ ...
β
βββ copykat.counts.clusterbarplot.pdf/png <- Barplot of the cell number by clusters, arranged by tumor cells
βββ copykat.counts.samplebarplot.pdf/png <- Barplot of the cell number by samples
βββ copykat.prop.clusterbarplot.pdf/png <- Barplot of the proportion of cell by clusters
βββ copykat.prop.samplebarplot.pdf/png <- Barplot of the proportion of cell by samples
βββ copykat.umap.pdf/png <- Umap plot of tumor cell prediction results
βββ copykat.umap.splitsample.pdf/png <- Umap plot of tumor cell prediction results splited by samples
βββ Summary_copykat_prediction.txt <- Predicted results for all sample cells
Heatmap of copykat prediction results.
CytoTRACE (Cellular (Cyto) Trajectory Reconstruction Analysis using gene Counts and Expression) is a computational method that predicts the differentiation state of cells from single-cell RNA-sequencing data. CytoTRACE leverages a simple, yet robust, determinant of developmental potentialβthe number of detectably expressed genes per cell, or gene counts. CytoTRACE have been validated on ~150K single-cell transcriptomes spanning 315 cell phenotypes, 52 lineages, 14 tissue types, 9 scRNA-seq platforms, and 5 species.
ββ 9.CytoTRACE/
βββ <prefix>.CytoTRACE.boxplot_raw.pdf/png <- Boxplot of cytotrace scores by cluster
βββ <prefix>.CytoTRACE.boxplot_type.pdf/png <- Boxplot of cytotrace scores by cell type
βββ <prefix>.cytovalue.FeaturePlot.pdf/png <- Umap plot characterized by cytotrace scores
βββ <prefix>CytoGenes.pdf <- Genes that most correlated with cytotrace score
βββ <prefix>.CytoTRACE.table.txt <- Table containing cytotrace scores and cell names
Boxplots ordered by median cytotrace score.
Genomic instability analysis (GIA) uses the aREA algorithm to quantitatively estimate the association between gene expression and chromosomal location by performing enrichment analysis of contiguously coded genes (loci-blocks) on the single cell gene expression profiles.
ββ 10.genomicInstability/
βββ <prefix>_genomicinstability.pdf <- Density plot of the gene instability score
βββ <prefix>_genomicinstability.withanno.pdf <- Density plot of the gene instability score with annotated information
βββ <prefix>_gis.result.txt <- Table containing genomic instability score and cell names
The genomic Instability score density plot.
CellChat, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets.
ββ 11.CellChat /
βββ bubble /
β βββ <CellType>_bubble.pdf <- A bubble plot showing the strength of gene communication between cell types
βββ pathway /
β βββ <GeneName>_circle.pdf <- Circle plot of gene signaling pathway network of each gene
β βββ <GeneName>_hier.pdf <- Hierarchy plot of gene signaling pathway network of each gene
βββ <CellType>_circle.pdf <- Circle plot of interacting gene numbers of each cell type
βββ topgeneheatmap.pdf <- Circle plot of gene signaling pathway newtwork
Gene pathway enrichment analysis is to find a class of overexpressed genes in a set of genes. Here, based on five databases including BIOCARTA, BioCyc, GO, KEGG, and reactome, we perform enrichment analysis on the marker genes of the cluster respectively.
ββ 12.clusterProfiler/
βββ <prefix>.BIOCARTA_Enrich.xls <- Gene enrichment analysis results based on BIOCARTA database
βββ <prefix>.BioCyc.png/pdf <- Bubble plot of gene enrichment analysis based on BIOCARTA database
βββ <prefix>.BioCyc_Enrich.xls <- Gene enrichment analysis results based on BioCyc database
βββ <prefix>.BP.DAG.svg <- Directed acyclic graph of biological processes in gene ontology
βββ <prefix>.CC.DAG.svg <- Directed acyclic graph of cellular component in gene ontology
βββ <prefix>.MF.DAG.svg <- Directed acyclic graph of molecular function in gene ontology
βββ <prefix>.go.pdf/png <- Barplot with significantly enriched GO terms
βββ <prefix>.GO_Enrich.xls <- Gene enrichment analysis results based on gene ontology database
βββ <prefix>.KEGG.png/pdf <- Bubble plot of gene enrichment analysis based on KEGG database
βββ <prefix>.KEGG_Enrich.xls <- Gene enrichment analysis results based on KEGG database
βββ <prefix>.PANTHER_Enrich.xls <- Gene enrichment analysis results based on PANTHER database
βββ <prefix>.PID_Enrich.xls <- Gene enrichment analysis results based on PID database
βββ <prefix>.reactome.png/pdf <- Bubble plot of gene enrichment analysis based on reactome database
βββ <prefix>.Reactome_Enrich.xls <- Gene enrichment analysis results based on reactome database
Barplot with significantly enriched GO terms.
The bubble plot for KEGG enriched analysis.
Xu, Limin, Jing Zhang, Yiqian He, Qianqian Yang, Tianhao Mu, Qiushi Guo, Yingqiang Li, Tian Tong, Shifu Chen, and Richard D.Ye. 2023. βScRNAPip: A Systematic and Dynamic Pipeline for SingleβCell RNA Sequencing Analysis.β iMeta e132. https://doi.org/10.1002/imt2.132
[1] Griffiths, Jonathan A., Antonio Scialdone, John C. Marioni. 2018. βUsing single-cell genomics to understand developmental processes and cell fate decisions.β Molecular Systems Biology 14: e8046. https://doi.org/https://doi.org/10.15252/msb.20178046
[2] scRNA-tools. https://www.scrna -tools.org/.
[3] R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
[4] Chen, Shifu, Yanqing Zhou, Yaru Chen, Jia Gu. 2018. βfastp: an ultra-fast all-in-one FASTQ preprocessor.β Bioinformatics 34: i884-i890. https://doi.org/10.1093/bioinformatics/bty560
[5] Chen,Shifu.2023.βUltrafastone-passFASTQdatapreprocessing,qualitycontrol,and deduplicationusingfastp.βImeta2:e107.https://doi.org/https://doi.org/10.1002/imt2.107
[6] Zheng, Grace X. Y., Jessica M. Terry, Phillip Belgrader, Paul Ryvkin, Zachary W. Bent, Ryan Wilson, Solongo B. Ziraldo, et al. 2017. βMassively parallel digital transcriptional profiling of single cells.β Nature Communications 8: 14049. https://doi.org/10.1038/ncomms14049
[7] Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M. Mauck, III, Shiwei Zheng, Andrew Butler, Maddie J. Lee, et al. 2021. βIntegrated analysis of multimodal single-cell data.β Cell 184: 3573-3587.e3529. https://doi.org/10.1016/j.cell.2021.04.048
[8] Qiu, Xiaojie, Andrew Hill, Jonathan Packer, Dejun Lin, Yi-An Ma, Cole Trapnell. 2017. βSingle-cell mRNA quantification and differential analysis with Census.β Nature Methods 14: 309-315. https://doi.org/10.1038/nmeth.4150
[9] Dobin, Alexander, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, Thomas R. Gingeras. 2013. βSTAR: ultrafast universal RNA-seq aligner.β Bioinformatics 29: 15-21. https://doi.org/10.1093/bioinformatics/bts635
[10] McGinnis, Christopher S., Lyndsay M. Murrow, Zev J. Gartner. 2019. βDoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors.β Cell Systems 8: 329-337.e324. https://doi.org/10.1016/j.cels.2019.03.003
[11] Anders, Simon, Wolfgang Huber. 2010. βDifferential expression analysis for sequence count data.β Genome Biology 11: R106. https://doi.org/10.1186/gb-2010-11-10-r106
[12] Love, Michael I., Wolfgang Huber, Simon Anders. 2014. βModerated estimation of fold change and dispersion for RNA-seq data with DESeq2.β Genome Biology 15: 550. https://doi.org/10.1186/s13059-014-0550-8
[13] Korsunsky, Ilya, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael Brenner, Po-ru Loh, Soumya Raychaudhuri. 2019. βFast, sensitive and accurate integration of single-cell data with Harmony.β Nature Methods 16: 1289-1296. https://doi.org/10.1038/s41592-019-0619-0
[14] Haghverdi, Laleh, Aaron T. L. Lun, Michael D. Morgan, John C. Marioni. 2018. βBatch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors.β Nature Biotechnology 36: 421-427. https://doi.org/10.1038/nbt.4091
[15] Tran, Hoa Thi Nhu, Kok Siong Ang, Marion Chevrier, Xiaomeng Zhang, Nicole Yee Shin Lee, Michelle Goh, Jinmiao Chen. 2020. βA benchmark of batch-effect correction methods for single-cell RNA sequencing data.β Genome Biology 21: 12. https://doi.org/10.1186/s13059-019-1850-9
[16] Abdi, HervΓ©, Lynne J. Williams. 2010. βPrincipal component analysis.β WIREs Computational Statistics 2: 433-459. https://doi.org/https://doi.org/10.1002/wics.101
[17] McInnes, Leland, John Healy. 2018. βUMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.β The Journal of Open Source Software 3: 861. https://doi.org/10.48550/arXiv.1802.03426
[18] Becht, Etienne, Leland McInnes, John Healy, Charles-Antoine Dutertre, Immanuel W. H. Kwok, Lai Guan Ng, Florent Ginhoux, Evan W. Newell. 2019. βDimensionality reduction for visualizing single-cell data using UMAP.β Nature Biotechnology 37: 38-44. https://doi.org/10.1038/nbt.4314
[19] Laurens, Van Der Maaten, Geoffrey Hinton. 2008. βVisualizing Data using t-SNE.β Journal of Machine Learning Research 9: 2579-2605.
[20] Blondel, Vincent D., Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre. 2008. βFast unfolding of communities in large networks.β Journal of Statistical Mechanics: Theory and Experiment 2008: P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008
[21] Aran, Dvir, Agnieszka P. Looney, Leqian Liu, Esther Wu, Valerie Fong, Austin Hsu, Suzanna Chak, et al. 2019. βReference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage.β Nature Immunology 20: 163-172. https://doi.org/10.1038/s41590-018-0276-y
[22] Hillje, Roman, Pier Giuseppe Pelicci, Lucilla Luzi. 2020. βCerebro: interactive visualization of scRNA-seq data.β Bioinformatics 36: 2311-2313. https://doi.org/10.1093/bioinformatics/btz877
[23] Gu, Zuguang, Lei Gu, Roland Eils, Matthias Schlesner, Benedikt Brors. 2014. βcirclize implements and enhances circular visualization in R.β Bioinformatics 30: 2811-2812. https://doi.org/10.1093/bioinformatics/btu393
[24] Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. https://doi.org/10.1007/978-0-387-98141-3
[25] Gao, Ruli, Shanshan Bai, Ying C. Henderson, Yiyun Lin, Aislyn Schalck, Yun Yan, Tapsi Kumar, et al. 2021. βDelineating copy number and clonal substructure in human tumors from single-cell transcriptomes.β Nature Biotechnology 39: 599-608. https://doi.org/10.1038/s41587-020-00795-2
[26] Gulati, Gunsagar S., Shaheen S. Sikandar, Daniel J. Wesche, Anoop Manjunath, Anjan Bharadwaj, Mark J. Berger, Francisco Ilagan, et al. 2020. βSingle-cell transcriptional diversity is a hallmark of developmental potential.β Science 367: 405-411. https://doi.org/10.1126/science.aax0249
[27] Jin, Suoqin, Christian F. Guerrero-Juarez, Lihua Zhang, Ivan Chang, Raul Ramos, Chen-Hsiang Kuan, Peggy Myung, Maksim V. Plikus, Qing Nie. 2021. βInference and analysis of cell-cell communication using CellChat.β Nature Communications 12: 1088. https://doi.org/10.1038/s41467-021-21246-9
[28] Yu, Guangchuang, Li-Gen Wang, Yanyan Han, Qing-Yu He. 2012. βclusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters.β OMICS: A Journal of Integrative Biology 16: 284-287. https://doi.org/10.1089/omi.2011.0118
[29] Zhao, Juanjuan, Shuye Zhang, Yang Liu, Xiaomeng He, Mengmeng Qu, Gang Xu, Hongbo Wang, et al. 2020. βSingle-cell RNA sequencing reveals the heterogeneity of liver-resident immune cells in human.β Cell Discovery 6: 22. https://doi.org/10.1038/s41421-020-0157-z