Analysis of scRNA-seq data comparing non-small-cell lung cancer (NSCLC) cells to normal cells

If you see the message "Unable to render code block" when loading the Jupyter notebook (seurat_R_nsclc_10x_900k.ipynb) or its PDF (seurat_R_nsclc_10x_900k.pdf), you may need to refresh the page a couple of times.

I wrote these R scripts to analyze a large single-cell RNA sequencing (scRNA-seq) dataset (~900,000 cells) using Seurat and various other CRAN and Bioconductor packages.

Overall Functionality:

The scripts accomplish the following tasks:

Load necessary R packages for analysis (Seurat, DESeq2, pheatmap, ggplot2, etc.).
Read scRNA-seq data from an h5 file and convert it into a Seurat object.
Conduct quality control (QC) analysis, filtering out low-quality cells based on gene counts, mitochondrial content, etc.
Generate QC plots to visualize the distribution of various QC metrics.
Perform normalization, variable gene selection, scaling, dimensionality reduction (UMAP), clustering, and visualization using Seurat's workflows.
Use Azimuth for cell annotation based on the Human Lung Cell Atlas.
Conduct pseudobulk analysis, differential expression analysis (DESeq2), and visualization of significant genes.
Print session information, clear the workspace, and manage parallelization using the future package.
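
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository's scripts: the function names, QC thresholds, number of dimensions, clustering resolution, the `condition` column, and the `"lungref"` Azimuth reference name are all assumptions chosen for the example. The script name scrnaseq_sctransform_to_umap_231127.r suggests that SCTransform may be used in place of the three normalization calls shown here. Package functions are namespace-qualified so the file can be sourced before the packages are attached; only the toy pseudobulk example at the end actually runs.

```r
# Hypothetical helpers sketching the QC -> clustering -> annotation ->
# pseudobulk -> DESeq2 steps. All thresholds are illustrative assumptions.

# Configure the future package (worker count and memory limit are assumptions).
setup_parallel <- function(workers = 4, max_gb = 8) {
  options(future.globals.maxSize = max_gb * 1024^3)
  future::plan("multisession", workers = workers)
}

# Add percent.mt and keep cells inside the (assumed) QC thresholds.
run_qc <- function(obj, min_features = 200, max_features = 7500, max_mt = 10) {
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  keep <- obj$nFeature_RNA > min_features &
    obj$nFeature_RNA < max_features &
    obj$percent.mt < max_mt
  subset(obj, cells = colnames(obj)[keep])
}

# Standard Seurat workflow: normalize, select variable genes, scale,
# reduce dimensions, cluster, and compute a UMAP embedding.
run_clustering <- function(obj, dims = 1:30, resolution = 0.5) {
  obj <- Seurat::NormalizeData(obj)
  obj <- Seurat::FindVariableFeatures(obj)
  obj <- Seurat::ScaleData(obj)
  obj <- Seurat::RunPCA(obj)
  obj <- Seurat::FindNeighbors(obj, dims = dims)
  obj <- Seurat::FindClusters(obj, resolution = resolution)
  Seurat::RunUMAP(obj, dims = dims)
}

# Annotate cells with Azimuth; "lungref" is assumed to be the name of the
# Human Lung Cell Atlas reference.
annotate_cells <- function(obj, reference = "lungref") {
  Azimuth::RunAzimuth(obj, reference = reference)
}

# Pseudobulk: sum raw counts over all cells that share a sample label
# (genes x cells -> genes x samples).
pseudobulk_counts <- function(counts, sample_ids) {
  t(rowsum(t(counts), group = sample_ids))
}

# Differential expression on the pseudobulk matrix; col_data is assumed to
# have one row per sample and a 'condition' column.
run_deseq <- function(pb, col_data) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = pb, colData = col_data,
                                        design = ~ condition)
  DESeq2::results(DESeq2::DESeq(dds))
}

# Toy demonstration of the pseudobulk step using base R only.
toy <- matrix(1:8, nrow = 2,
              dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4)))
pb <- pseudobulk_counts(toy, c("tumor", "tumor", "normal", "normal"))
print(pb)
```

In the toy example, the four cells collapse into two columns (`normal` and `tumor`), each holding the per-gene sum of counts for that group. A matrix of this shape, built from raw counts, is what DESeq2 expects as input.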

Input Files:

Single-cell RNA sequencing data in an h5 file (16plex_900k_32_NSCLC_multiplex_count_filtered_feature_bc_matrix.h5).
Additional CSV files for sample information and annotation.
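
As a rough sketch of how the h5 input might be loaded, the hypothetical helper below reads the file with Seurat and wraps the counts in a Seurat object. The function name, the project label, and the `"Gene Expression"` element name (the usual 10x label when a file holds several feature types) are assumptions; the repository's scripts may instead use an on-disk representation such as BPCells, which would suit a dataset of this size.

```r
# Hypothetical loader: read a 10x h5 file and build a Seurat object.
load_counts <- function(h5_path, project = "NSCLC") {
  counts <- Seurat::Read10X_h5(h5_path)
  # Files with several feature types come back as a named list of matrices;
  # "Gene Expression" is assumed to be the element of interest.
  if (is.list(counts)) {
    counts <- counts[["Gene Expression"]]
  }
  Seurat::CreateSeuratObject(counts = counts, project = project)
}

# Only attempt the load when the file is actually present.
h5_path <- "16plex_900k_32_NSCLC_multiplex_count_filtered_feature_bc_matrix.h5"
if (file.exists(h5_path)) {
  obj <- load_counts(h5_path)
}
```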

Required Packages and Tools:

R Packages: Seurat, SeuratDisk, Azimuth, DESeq2, pheatmap, ggplot2, EnhancedVolcano, BPCells, future, dplyr.
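
One way to check that these packages are available before running the scripts is sketched below. It uses only base R; the variable names are illustrative. Installation sources differ by package (some come from CRAN, some from Bioconductor, and some are typically installed from GitHub), so the snippet only reports what is missing rather than trying to install anything.

```r
# Packages listed in this README.
required <- c("Seurat", "SeuratDisk", "Azimuth", "DESeq2", "pheatmap",
              "ggplot2", "EnhancedVolcano", "BPCells", "future", "dplyr")

# requireNamespace() returns TRUE when a package can be loaded.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```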

Outputs:

Seurat objects at different analysis stages.
QC plots (violin plots, heatmaps).
Intermediate data files for normalization, clustering, and differential expression analysis.
Visualizations (UMAP plots, volcano plots) highlighting various aspects of the data.

The dataset is from 10x Genomics.

Repository Files:

.gitignore
README.md
Rscript_submit_slurm.sh
scrnaseq_run_azimuth_231128.r
scrnaseq_sctransform_to_umap_231127.r
seurat_R_nsclc_10x_900k.ipynb
seurat_R_nsclc_10x_900k.pdf
seurat_R_nsclc_10x_900k.r

felixm3/scRNA-seq