10X Refinement Workflow

The workflow script that runs the tools is workflows/kf_single_cell_10x_refinement.cwl.

SoupX subtracts the ambient RNA background. scDblFinder scores and predicts doublets. The decontaminated outputs are aggregated using the Seurat R package from the Satija Lab at the New York Genome Center. Erin Reichenbee of DBHi contributed heavily to the original workflow design.

Software

  • SoupX 1.6.2
  • scDblFinder 1.12.0
  • Seurat 4.3.0.1
  • SeuratObject 4.1.3
  • tidyverse docker base 4.2.3

Inputs

multi-step

  • output_basename: basename used to name output files
  • sample_name: prefix used to find the fastqs to analyze, e.g. 1k_PBMCs_TotalSeq_B_3p_LT_antibody if the underlying fastqs are named like 1k_PBMCs_TotalSeq_B_3p_LT_antibody_S1_L001_I1_001.fastq.gz; provide one per input fastq, in the same order
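
As an illustration of how sample_name acts as a filename prefix, here is a minimal Python sketch. The helper match_fastqs and the third filename in the list are hypothetical, and the workflow's own matching logic may differ.

```python
def match_fastqs(sample_name, filenames):
    """Return the FASTQ names that start with sample_name followed by an underscore."""
    prefix = sample_name + "_"
    return [f for f in filenames if f.startswith(prefix) and f.endswith(".fastq.gz")]


# Illustrative file list; only the first name follows the pattern quoted in this README.
files = [
    "1k_PBMCs_TotalSeq_B_3p_LT_antibody_S1_L001_I1_001.fastq.gz",
    "1k_PBMCs_TotalSeq_B_3p_LT_antibody_S1_L001_R1_001.fastq.gz",
    "1k_PBMCs_TotalSeq_B_3p_LT_gex_S2_L001_R1_001.fastq.gz",  # hypothetical non-matching file
]

matched = match_fastqs("1k_PBMCs_TotalSeq_B_3p_LT_antibody", files)
for name in matched:
    print(name)
```

Requiring the trailing underscore keeps a short prefix from accidentally matching a longer, different sample name.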

soupX

  • counts_matrix_raw: h5 format raw feature matrix file from Cellranger or equivalent
  • counts_matrix_filtered: h5 format filtered feature matrix file from Cellranger or equivalent
  • counts_cluster: CSV containing cluster information from Cellranger or equivalent if available

scDblFinder

  • align_qc_rds: Align QC file from D3b 10X alignment workflow
  • seurat_raw_rds: Seurat raw rds file from D3b 10X alignment workflow
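
The inputs above could be collected into a CWL job file. The sketch below builds one in Python using only the input names listed in this section. It assumes that the two multi-step inputs are plain strings and that the rest are single File inputs. All paths and the output_basename value are placeholders, and the workflow may define further inputs or different types, so check the CWL before relying on it.

```python
import json


def cwl_file(path):
    """Wrap a path in the standard CWL File object form."""
    return {"class": "File", "path": path}


# Input names come from this README; values are placeholder assumptions.
job = {
    "output_basename": "example_run",
    "sample_name": "1k_PBMCs_TotalSeq_B_3p_LT_antibody",
    "counts_matrix_raw": cwl_file("raw_feature_bc_matrix.h5"),
    "counts_matrix_filtered": cwl_file("filtered_feature_bc_matrix.h5"),
    "counts_cluster": cwl_file("clusters.csv"),
    "align_qc_rds": cwl_file("example.align_qc.rds"),
    "seurat_raw_rds": cwl_file("example.seurat.raw.rds"),
}

job_json = json.dumps(job, indent=2)
print(job_json)
```

Saved to a file such as job.json, this could be passed to a CWL runner together with workflows/kf_single_cell_10x_refinement.cwl.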

Outputs

  • soupx_rplots: PDF R plot made by SoupX
  • soupx_rds: R object with SoupX results
  • scdblfinder_plot: PDF cluster plots generated by scDblFinder
  • scdblfinder_doublets: TSV containing scoring matrix with doublets marked by scDblFinder
  • scdblfinder_summary: Summary stats of number and percent of doublets in library
  • seurat_filtered_rds: RDS file containing 10X data filtered based on Seurat QC (align workflow), SoupX, and scDblFinder results
  • seurat_filtered_summary: TSV file summarizing number of cells removed at each step, if relevant
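
To make the relationship between scdblfinder_doublets and scdblfinder_summary concrete, the following Python sketch counts doublets in a small made-up TSV and reports the number and percent. The column names (barcode, score, class), the label doublet, and the toy rows are assumptions for illustration; the actual files produced by the workflow may be laid out differently.

```python
import csv
import io


def summarize_doublets(tsv_text, class_column="class"):
    """Count rows labelled 'doublet' and express them as a percent of all rows."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    total = len(rows)
    doublets = sum(1 for row in rows if row[class_column] == "doublet")
    percent = 100.0 * doublets / total if total else 0.0
    return {"cells": total, "doublets": doublets, "percent_doublets": percent}


# Toy data with hypothetical column names; not real workflow output.
example_tsv = (
    "barcode\tscore\tclass\n"
    "AAAC-1\t0.02\tsinglet\n"
    "AAAG-1\t0.91\tdoublet\n"
    "AACC-1\t0.10\tsinglet\n"
    "AACG-1\t0.05\tsinglet\n"
)

summary = summarize_doublets(example_tsv)
print(summary)
```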