[bat_int] cell cycle genes not in adata for cellxgene_census dataset #331

Open
KaiWaldrant opened this issue Jan 8, 2024 · 3 comments
Labels
batch_integration (relates to task batch_integration), bug (Something isn't working)

Comments

@KaiWaldrant
Member

Describe the bug
The cell_cycle_conservation metric fails on datasets from the CELLxGENE Census:

ValueError: cell cycle genes not in adata
 organism: human
 varnames: ['ENSG00000105792', 'ENSG00000128253', 'ENSG00000015413', 'ENSG00000164402', 'ENSG00000246375', 'ENSG00000176402', 'ENSG00000022976', 'ENSG00000123191', 'ENSG00000198283', 'ENSG00000092020']

To Reproduce
https://tower.nf/orgs/openproblems-bio/workspaces/openproblems-bio/watch/ma2LsRoQarR8Z


@mumichae
Collaborator

Hi, this issue arises because the cell cycle genes used by the metric are provided as gene names, not Ensembl IDs, whereas CxG uses Ensembl IDs as the var names. I would suggest overwriting the var_names with adata.var["feature_name"] during processing, if that column exists. Does that sound reasonable?
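
A minimal sketch of that suggestion, for illustration only. The helper name `overwrite_var_names_with_symbols` is hypothetical and not part of the openproblems code base; it assumes `adata` is an AnnData-like object whose `var` table may carry a `feature_name` column with gene symbols.

```python
def overwrite_var_names_with_symbols(adata, column="feature_name"):
    """Overwrite adata.var_names in place with gene symbols, if available.

    Hypothetical helper: `adata` is assumed to be an AnnData-like object.
    Returns True if the var names were replaced, False otherwise.
    """
    if column not in adata.var:
        # No symbol column: leave the existing var names untouched.
        return False
    # Cast to plain strings so categorical columns are handled as well.
    adata.var_names = [str(symbol) for symbol in adata.var[column]]
    return True
```

Note that this replaces the Ensembl IDs in the index, so they would need to be kept in a separate `var` column if later steps still rely on them.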

@rcannood
Member

I tend to prefer setting the var names to Ensembl IDs rather than gene names, because gene names can result in duplicate var names. WDYT?
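
One way to check whether this concern applies to a given dataset is to list the gene names that occur more than once. This is only a sketch: `duplicated_names` is a hypothetical helper, and it accepts any iterable of names, for example `adata.var["feature_name"]`.

```python
from collections import Counter


def duplicated_names(names):
    """Return the sorted list of names that appear more than once."""
    counts = Counter(str(name) for name in names)
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty result would mean the symbols could serve as var names without collisions in that dataset.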

@mumichae
Collaborator

In general that makes sense, but for the cell cycle metric we would still need gene symbols. Would you prefer to rename the var_names only for the metric instead?
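
A rough sketch of what renaming only for the metric could look like, assuming an AnnData-like input. The function name `var_names_as_symbols` is hypothetical; `copy()` and `var_names_make_unique()` are existing AnnData methods. The stored dataset keeps its Ensembl IDs, and only the copy passed to the metric is indexed by gene symbols.

```python
def var_names_as_symbols(adata, column="feature_name"):
    """Return a copy of `adata` indexed by gene symbols; the input is untouched.

    Hypothetical helper, intended to be called inside the metric only.
    """
    if column not in adata.var:
        # Nothing to rename: assume var_names already hold gene symbols.
        return adata
    renamed = adata.copy()
    renamed.var_names = [str(symbol) for symbol in renamed.var[column]]
    # AnnData's own helper appends suffixes to repeated names, so duplicate
    # symbols do not produce duplicate var names.
    renamed.var_names_make_unique()
    return renamed
```

The copy costs extra memory, which is the trade-off for leaving the Ensembl IDs in place everywhere else.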
