Add spatial decomposition task #902

Open
25 tasks
danielStrobl opened this issue Aug 30, 2023 · 4 comments
Labels
task Add a new task

Comments

@danielStrobl
Collaborator

Task motivation

Spatial decomposition (also often referred to as spatial deconvolution) is applicable to spatial transcriptomics data where the transcription profile of each capture location (spot, voxel, bead, etc.) does not share a bijective relationship with the cells in the tissue, i.e., multiple cells may contribute to the same capture location. The task of spatial decomposition then refers to estimating the composition of cell types/states present at each capture location.

Task description

No response

Proposed ground-truth in datasets

  • DestVI [3]: scRNA-seq data is generated from NB parameters learned in the destVI manuscript, leveraging sparsePCA. The number of cells and the cell types present in each spatial spot are computed via a combination of a kernel-based parametrization of a categorical distribution and the NB model.

  • Pancreas (alpha=0.5) [11]: Human pancreas cells aggregated from single-cell data (Dirichlet alpha=0.5).

  • Pancreas (alpha=1) [11]: Human pancreas cells aggregated from single-cell data (Dirichlet alpha=1).

  • Pancreas (alpha=5) [11]: Human pancreas cells aggregated from single-cell data (Dirichlet alpha=5).

  • Tabula muris senis (alpha=0.5) [12]: Mouse lung cells aggregated from single-cell data (Dirichlet alpha=0.5).

  • Tabula muris senis (alpha=1) [12]: Mouse lung cells aggregated from single-cell data (Dirichlet alpha=1).

  • Tabula muris senis (alpha=5) [12]: Mouse lung cells aggregated from single-cell data (Dirichlet alpha=5).
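
As a rough illustration of how the Dirichlet-aggregated datasets above could be built, here is a minimal sketch. It is not the actual dataset loader: the function name `simulate_spots`, the `cells_per_spot` parameter, and the choice to define the ground truth as the realised fraction of sampled cells per type are all assumptions made for this example.

```python
import numpy as np

def simulate_spots(counts, labels, n_spots, cells_per_spot=10, alpha=1.0, seed=0):
    """Aggregate single cells into pseudo-spots (hypothetical helper).

    counts: (n_cells, n_genes) expression matrix
    labels: (n_cells,) cell-type label per cell
    alpha:  symmetric Dirichlet concentration; small values give spots
            dominated by few cell types, large values give even mixtures
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    cell_types = np.unique(labels)
    idx_by_type = [np.flatnonzero(labels == ct) for ct in cell_types]

    spots = np.zeros((n_spots, counts.shape[1]))
    proportions = np.zeros((n_spots, len(cell_types)))
    for s in range(n_spots):
        # Draw target proportions, then a concrete number of cells per type.
        p = rng.dirichlet(np.full(len(cell_types), alpha))
        n_per_type = rng.multinomial(cells_per_spot, p)
        for k, n in enumerate(n_per_type):
            if n > 0:
                chosen = rng.choice(idx_by_type[k], size=n, replace=True)
                spots[s] += counts[chosen].sum(axis=0)
        # Assumption: ground truth is the realised cell fraction per type.
        proportions[s] = n_per_type / cells_per_spot
    return spots, proportions, cell_types
```

With alpha=0.5 most spots end up dominated by one or two cell types, while alpha=5 gives more evenly mixed spots, which is presumably why three values are proposed.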

Initial set of methods to implement

  • Cell2location (alpha=20, amortised, hard-coded) [2]: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

  • Cell2location (alpha=1, reference hard-coded) [2]: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

  • Cell2location (alpha=20, reference hard-coded) [2]: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

  • Cell2location (alpha=200, reference hard-coded) [2]: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

  • Cell2location (alpha=20, NB reference) [2]: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

  • DestVI [3]: destVI is a decomposition method that leverages a conditional generative model of spatial transcriptomics down to the sub-cell-type variation level, which is then used to decompose the cell-type proportions determining the spatial organization of a tissue. Links: Docs.

  • Non-Negative Matrix Factorization (NMF) [7]: NMF is a decomposition method based on Non-negative Matrix Factorization (NMF) that reconstructs the expression of each spatial location as a weighted combination of cell-type signatures defined by scRNA-seq. It is a simpler baseline than NMFreg, as it only performs the NMF step based on mean expression signatures of cell types and returns the NMF weight loadings as (normalized) cell-type proportions, without the regression step. Links: Docs.

  • NMF-reg [8]: NMFreg is a decomposition method based on Non-negative Matrix Factorization Regression (NMFreg) that reconstructs the expression of each spatial location as a weighted combination of cell-type signatures defined by scRNA-seq. It was originally developed for Slide-seq data. Links: Docs.

  • Non-Negative Least Squares [4]: NNLS [13] is a decomposition method based on Non-Negative Least Squares regression (NNLS). It was originally introduced by the method AutoGenes. Links: Docs.

  • Random Proportions [13]: Random assignment of predicted cell-type proportions from a Dirichlet distribution. Links: Docs.

  • RCTD [5]: RCTD (Robust Cell Type Decomposition) is a decomposition method that uses signatures learnt from single-cell data to decompose spatial expression of tissues. It includes a platform effect normalization step, which normalizes the scRNA-seq cell-type profiles to match the platform effects of the spatial transcriptomics dataset. Links: Docs.

  • SeuratV3 [9]: SeuratV3 is a decomposition method that is based on Canonical Correlation Analysis (CCA). Links: Docs.

  • Stereoscope [6]: Stereoscope is a decomposition method based on Negative Binomial regression. It is similar in scope and implementation to cell2location, but less flexible in incorporating additional covariates such as batch effects and other types of experimental design annotations. Links: Docs.

  • Tangram [10]: Tangram is a method to map gene expression signatures from scRNA-seq data to spatial data. It performs the cell-type mapping by learning a similarity matrix between single-cell and spatial locations based on gene expression profiles. Links: Docs.

  • True Proportions [13]: Perfect assignment of predicted cell-type proportions from the ground truth. Links: Docs.
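
To make the simplest regression-style method in the list above concrete, here is a minimal NNLS sketch. It assumes a precomputed cell-type signature matrix (for example, mean expression per cell type from the reference) and uses `scipy.optimize.nnls`; the function name `nnls_proportions` and the uniform fallback for all-zero weights are assumptions of this example, not a description of the AutoGenes-based implementation.

```python
import numpy as np
from scipy.optimize import nnls

def nnls_proportions(spots, signatures):
    """Estimate cell-type proportions per spot (hypothetical helper).

    spots:      (n_spots, n_genes) spatial expression
    signatures: (n_types, n_genes) reference expression per cell type
    """
    spots = np.asarray(spots, dtype=float)
    signatures = np.asarray(signatures, dtype=float)
    # Solve min ||signatures.T @ w - spot||_2 subject to w >= 0, per spot.
    weights = np.array([nnls(signatures.T, spot)[0] for spot in spots])
    totals = weights.sum(axis=1, keepdims=True)
    # Assumption: if every weight is zero, fall back to uniform proportions.
    uniform = np.full_like(weights, 1.0 / weights.shape[1])
    return np.divide(weights, totals, out=uniform, where=totals > 0)
```

Normalizing the non-negative weights so that each row sums to one turns them into proportions that can be compared directly against the ground truth.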

Proposed control methods

  • Random Proportions: Random assignment of predicted cell-type proportions from a Dirichlet distribution.
  • True Proportions: Perfect assignment of predicted cell-type proportions from the ground truth.
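
The two controls could be sketched as follows. The function names, the symmetric concentration `alpha=1.0`, and the fixed seed are assumptions made for illustration only.

```python
import numpy as np

def random_proportions(n_spots, n_types, alpha=1.0, seed=0):
    """Negative control: each spot gets an independent Dirichlet draw."""
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.full(n_types, alpha), size=n_spots)

def true_proportions(ground_truth):
    """Positive control: return a copy of the ground-truth proportions."""
    return np.array(ground_truth, dtype=float, copy=True)
```

The random control gives a floor that any real method should beat, and the true-proportions control should reach the metric's upper bound.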

Proposed Metrics

  • r2: R2, or the “coefficient of determination”, reports the fraction of the true proportion values’ variance that can be explained by the predicted proportion values. The best score, and upper bound, is 1.0. There is no fixed lower bound for the metric. The uniform/non-weighted average across all cell types/states is used to summarize performance.
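
A minimal NumPy sketch of the metric as described above: R2 is computed per cell type across spots and then averaged with uniform weights. The function name and the handling of constant ground-truth columns (score 1.0 if predicted exactly, otherwise 0.0) are assumptions of this example.

```python
import numpy as np

def r2_uniform_average(true, pred):
    """Mean per-cell-type R2 between (n_spots, n_types) proportion matrices."""
    true = np.asarray(true, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_res = ((true - pred) ** 2).sum(axis=0)
    ss_tot = ((true - true.mean(axis=0)) ** 2).sum(axis=0)

    r2 = np.ones(true.shape[1])
    varying = ss_tot > 0
    r2[varying] = 1.0 - ss_res[varying] / ss_tot[varying]
    # Assumption: a constant true column scores 0.0 unless predicted exactly.
    r2[~varying & (ss_res > 0)] = 0.0
    return float(r2.mean())
```

Predicting the per-type mean for every spot gives a score of 0, and predictions worse than that give negative scores, which is why the metric has no fixed lower bound.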
@danielStrobl danielStrobl added the task Add a new task label Aug 30, 2023
@mumichae
Contributor

mumichae commented Nov 29, 2023

After looking at the task, it seems to me that this workflow goes beyond the linear paradigm of dataset prep -> method -> metric, since you require a reference for the signature matrix as well as the query "spatial" matrix. So defining the reference/spatial split is a type of simulation, and it would require a workflow similar to that of the label projection task.

@mumichae mumichae reopened this Nov 29, 2023
@mumichae
Contributor

I added some suggestions on a workflow for spatial decomposition here: openproblems-bio/openproblems-v2#292

@rcannood rcannood transferred this issue from openproblems-bio/openproblems-v2 Sep 8, 2024
@rcannood
Member

rcannood commented Sep 8, 2024

This task has been moved to task_spatial_decomposition.

@rcannood
Member

rcannood commented Sep 8, 2024

@danielStrobl This task is almost ready for a release. Would you like to set up a meeting to discuss next steps?

3 participants