Integration of dorothea and progeny #1724
Comments
Hi @PauBadiaM, I have always viewed Dorothea and Progeny as methods to aid in the interpretation of my data. Hence, I would assume this might be most useful as a targeted approach to plot the activity of a particular TF or pathway. I would probably find this most useful as a function where I can either ask for the activity of a single TF/pathway or get the activity score that explains the most variation or correlates with a particular PC. Hence I would err on the side of storing the activities in
I think making access to entries in Am I correct in understanding that being able to do things like:
adata.obsm["pathways"] = pathway_dataframe_func(adata)
sc.pl.heatmap(adata, groupby="leiden", obsm="pathways")
sc.pl.umap(adata, color=["pathways/pathway-1", "leiden"])
would solve most of the barriers you're facing?
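The "obsm_key/column" addressing in the snippet above is a proposal, not something this thread confirms scanpy supports. As a minimal sketch of how such a key could be resolved, assuming activities are kept as a pandas DataFrame and using a plain dict as a stand-in for `adata.obsm` (the `resolve_key` helper and all data below are hypothetical):

```python
import pandas as pd


def resolve_key(obs: pd.DataFrame, obsm: dict, key: str) -> pd.Series:
    """Return per-cell values for `key`.

    Hypothetical lookup order: a column of `obs` first, then an
    "obsm_key/column" path into a DataFrame stored in `obsm`.
    """
    if key in obs.columns:
        return obs[key]
    obsm_key, sep, column = key.partition("/")
    if sep and obsm_key in obsm and column in obsm[obsm_key].columns:
        return obsm[obsm_key][column]
    raise KeyError(key)


# Toy data standing in for adata.obs and adata.obsm (illustrative only).
cells = ["c1", "c2", "c3"]
obs = pd.DataFrame({"leiden": ["0", "1", "0"]}, index=cells)
obsm = {
    "pathways": pd.DataFrame(
        {"pathway-1": [0.2, -1.0, 0.7], "pathway-2": [1.5, 0.1, -0.3]},
        index=cells,
    )
}

values = resolve_key(obs, obsm, "pathways/pathway-1")
clusters = resolve_key(obs, obsm, "leiden")
```

Checking `obs` first keeps existing keys such as "leiden" working unchanged, while the slash-separated form avoids name clashes between genes and TFs or pathways.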
Thanks for the quick responses @LuckyMD and @ivirshup.
On the same line, we wrote a very simple see for instance a usage example here: https://squidpy.readthedocs.io/en/latest/auto_examples/image/compute_texture_features.html#sphx-glr-auto-examples-image-compute-texture-features-py I think what you guys are working on in scverse/anndata#342 has a much broader scope and is in general more useful for multimodal data etc., but if you think
I was thinking a quick thing to do would be to add an
Thank you all for the feedback! In the end, the best solution has been to store activities in Now that both tools are AnnData compatible, should I open a pull request to add them into the Ecosystem?
Please do!
Hi everyone,
Seeing how many new single-cell and spatial tools are being developed in Python, and how we are increasingly using it in general and scanpy in particular, at saezlab we decided to re-implement our tools for estimating pathway and transcription factor (TF) activity (Dorothea and Progeny) in it. Here's a first draft in Python of our tools:
https://github.com/saezlab/dorothea-py
https://github.com/saezlab/progeny-py
Our tools take gene expression as input and generate matrices of TF and pathway activities. They can be understood either as a dimensionality reduction embedding (obsm) or as a new matrix of features (X).
Because of this duality, the integration of our tools into scanpy is not straightforward. If we store the activities in obsm, they can be used as a dimensionality reduction embedding, but then we lose access to all the fantastic plotting functions based on X. If we instead add our activities to X, they have a very different distribution from gene expression, and there would be an overlap of names between genes and TFs. A solution to this would be to have a separate layer to store these matrices, but layers must have the same dimensions as X. Another workaround would be to store them in .raw, but then we force the user to remove its previous contents, and .raw is used by default in some methods, which could cause problems.
What would be a smart solution to integrate our tools in your universe?