Crash during Probe Evaluation #233

Closed
pakiessling opened this issue Sep 5, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@pakiessling

Hi,

I tried to select probes and then evaluate them, but spapros crashes during the "Probeset specific pre computations" step of the evaluation.

Code

import spapros as sp
import scanpy as sc

adata = sc.read_h5ad("input/all_prepped.h5ad")
adata.uns['log1p']["base"] = None
selector = sp.selection.ProbesetSelector(adata, n=128, celltype_key="cell_type", verbosity=1, save_dir="/home/mz637064/hpcwork/spapros/testrun4")
selector.select_probeset()
selected_probeset = selector.probeset.index[selector.probeset["selection"]].to_list()

evaluator = sp.ev.ProbesetEvaluator(adata, verbosity=2, results_dir="/home/mz637064/hpcwork/spapros/spapros_evaluate2",celltype_key="cell_type")
evaluator.evaluate_probeset(adata, selected_probeset)

Log

Searching for previous results in /home/mz637064/hpcwork/spapros/testrun4
SPAPROS PROBESET SELECTION:
Select pca genes.......................................... ━━━━━━━ 100% 0:00:15
Train baseline forest based on DE genes................... ━━━━━━━ 4/4 0:36:53
Select DE genes......................................... ━━━━━━━ 11/11 0:00:00
Train prior forest for DE_baseline forest............... ━━━━━━━ 3/3 0:05:26
Iteratively add DE genes to DE_baseline forest.......... ━━━━━━━ 3/3 0:24:07
Train final baseline forest on all celltypes............ ━━━━━━━ 3/3 0:04:27
Train final forests....................................... ━━━━━━━ 3/3 0:18:31
Train forest on pre/prior/pca selected genes............ ━━━━━━━ 3/3 0:05:44
Iteratively add genes from DE_baseline_forest........... ━━━━━━━ 12/12 0:07:47
Train final forest on all celltypes..................... ━━━━━━━ 3/3 0:04:59
Compile probeset list..................................... ━━━━━━━ 100% 0:00:00
FINISHED

SPAPROS PROBESET EVALUATION:
Shared metric computations................................ ━━━━━━━ 3/3 0:29:46
Computing shared compuations for knn_overlap............ ━━━━━━━ 6/6 0:26:59
Computing shared compuations for gene_corr.............. ━━━━━━━ 100% 0:00:47
Probeset specific pre computations........................ 0/3 0:00:00
Traceback (most recent call last):
File "/rwthfs/rz/cluster/hpcwork/mz637064/spapros/evaluate.py", line 11, in <module>
evaluator.evaluate_probeset(adata, selected_probeset)
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/spapros/evaluation/evaluation.py", line 465, in evaluate_probeset
raise error
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/spapros/evaluation/evaluation.py", line 377, in evaluate_probeset
self.pre_results[metric][set_id] = metric_pre_computations(
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/spapros/evaluation/metrics.py", line 186, in metric_pre_computations
return knns(
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/spapros/evaluation/metrics.py", line 700, in knns
a = adata[:, genes].copy()
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/anndata/_core/anndata.py", line 1113, in __getitem__
oidx, vidx = self._normalize_indices(index)
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/anndata/_core/anndata.py", line 1094, in _normalize_indices
return _normalize_indices(index, self.obs_names, self.var_names)
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/anndata/_core/index.py", line 36, in _normalize_indices
ax1 = _normalize_index(ax1, names1)
File "/rwthfs/rz/cluster/home/mz637064/mambaforge/envs/spapros2/lib/python3.9/site-packages/anndata/_core/index.py", line 107, in _normalize_index
raise IndexError(f"Unknown indexer {indexer!r} of type {type(indexer)}")
IndexError: Unknown indexer AnnData object with n_obs × n_vars = 191795 × 8000
obs: 'sample', 'n_counts', 'n_genes', 'percent_mito', 'doublet_score', 'dissociation_score', 'cell_type_original', 'patient_region_id', 'patient', 'patient_group', 'major_labl', 'final_cluster', 'assay_ontology_term_id', 'development_stage_ontology_term_id', 'disease_ontology_term_id', 'ethnicity_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'sex_ontology_term_id', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'ethnicity', 'development_stage'
var: 'feature_biotype', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'X_approximate_distribution', 'X_normalization', 'batch_condition', 'cell_type_ontology_term_id_colors', 'cell_type_original_colors', 'default_embedding', 'hvg', 'log1p', 'schema_version', 'title'
obsm: 'X_harmony', 'X_pca', 'X_umap' of type <class 'anndata._core.anndata.AnnData'>

Setup

Python 3.9.13

Spapros 0.1.0

Scanpy 1.9.1

Anndata 0.8.0

Any ideas?

@pakiessling pakiessling added the bug Something isn't working label Sep 5, 2022
@LouisK92
Collaborator

LouisK92 commented Sep 5, 2022

Hey,

Did you try the same code with another adata? For example:

adata = sp.ut.get_processed_pbmc_data()

(with celltype_key="celltype")

That would show whether it's a dataset-specific problem or a general bug.

@pakiessling
Author

Yeah, I get the same index error. Very weird.

@LouisK92
Collaborator

LouisK92 commented Sep 5, 2022

Unfortunately I couldn't reproduce the issue yet, since I had some really weird local dependency issues today that I couldn't fix.

I'll try to get to it asap, but it'll be a busy week.

If you want to investigate further without waiting, I'd recommend installing the spapros dev version (new env, git clone ..., cd to spapros, pip install -e .) and printing genes just before the call a = adata[:, genes].copy() at line 700 of spapros/evaluation/metrics.py. When you then run the evaluation, the first call should print the full adata.var_names (from the shared computations step, so no error there), and something unexpected must happen in the second call.
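
The check described above could also be done with a small helper instead of a bare print. This is only a sketch: describe_gene_indexer is a hypothetical name, not part of spapros or anndata, and it assumes gene names are plain strings. It reports whether the object about to be used as genes is an iterable of known gene names, and otherwise names the offending type.

```python
from collections.abc import Iterable


def describe_gene_indexer(genes, var_names):
    """Return a short diagnosis of an object about to be used as adata[:, genes].

    Hypothetical debugging helper, not part of spapros or anndata.
    """
    # A single string is iterable too, so rule it out before the generic check.
    if isinstance(genes, str) or not isinstance(genes, Iterable):
        return f"unexpected type: {type(genes).__name__}"
    genes = list(genes)
    if not all(isinstance(g, str) for g in genes):
        return "unexpected element type: gene names should be strings"
    known = set(var_names)
    missing = [g for g in genes if g not in known]
    if missing:
        # Show at most five names to keep the message readable.
        return f"{len(missing)} gene(s) not in var_names: {missing[:5]}"
    return f"ok: {len(genes)} genes"
```

Calling print(describe_gene_indexer(genes, adata.var_names)) right before the indexing line would then show at a glance whether the second call receives a gene list or something else.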

@solvi808

I am also getting a similar issue with evaluate_probeset when running the tutorial pipeline on the pbmc3k dataset.

I cloned the git project, created a new environment with Python 3.9, and installed required.txt using pip install -e.

This is the error I am getting:

Code

# Probeset (selected with Spapros, see basic selection tutorial)
probeset = [
    'PF4', 'HLA-DPB1', 'FCGR3A', 'GZMB', 'CCL5', 'S100A8', 'IL32', 'HLA-DQA1', 'NKG7', 'AIF1', 'CD79A', 'LTB', 'TYROBP',
    'HLA-DMA', 'GZMK', 'HLA-DRB1', 'FCN1', 'S100A11', 'GNLY', 'GZMH'
]

# Reference probesets
reference_sets = sp.se.select_reference_probesets(adata, n=20)

evaluator = sp.ev.ProbesetEvaluator(adata, verbosity=2, results_dir=None)

evaluator.evaluate_probeset(probeset, set_id="Spapros")

Output

File "/home/user/miniconda3/envs/spapros_env/lib/python3.9/site-packages/xgboost/sklearn.py", line 801, in _duplicated
raise ValueError(

ValueError: 2 different eval_metric are provided. Use the one in constructor or set_params instead.

Setup

scanpy==1.9.1 anndata==0.8.0 umap==0.5.3 numpy==1.23.3 scipy==1.9.1 pandas==1.4.4 scikit-learn==1.1.2 statsmodels==0.13.2 python-igraph==0.9.11 pynndescent==0.5.7

@LouisK92
Collaborator

Update to @pakiessling 's issue:
Sorry that I didn't catch this earlier,
evaluator.evaluate_probeset(adata, selected_probeset) must be evaluator.evaluate_probeset(selected_probeset).
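
For reference, a minimal sketch of the corrected call order, using the same constructor arguments as the original report. The wrapper name evaluate_selected and its type check are illustrative assumptions, not part of spapros. The adata object goes to the ProbesetEvaluator constructor only, and evaluate_probeset receives just the gene list.

```python
def evaluate_selected(adata, selected_probeset, results_dir, celltype_key="cell_type"):
    """Evaluate a selected gene list with the corrected argument order.

    Hypothetical wrapper for illustration; only the two spapros calls inside
    are taken from this thread.
    """
    # Fail early with a readable message if something other than a gene list
    # (for example the AnnData object itself) is passed in the probeset slot.
    if not isinstance(selected_probeset, (list, tuple)):
        raise TypeError(
            f"expected a list of gene names, got {type(selected_probeset).__name__}"
        )

    import spapros as sp  # imported lazily; requires spapros to be installed

    evaluator = sp.ev.ProbesetEvaluator(
        adata, verbosity=2, results_dir=results_dir, celltype_key=celltype_key
    )
    # adata was already given to the constructor; pass only the gene list here.
    evaluator.evaluate_probeset(selected_probeset)
    return evaluator
```

With this guard, accidentally passing adata as the first argument would raise a TypeError up front rather than the IndexError from deep inside anndata.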

I'm also running into @solvi808 's error now (thanks for raising!), will check that one next.

@LouisK92 LouisK92 mentioned this issue Sep 13, 2022
@LouisK92
Collaborator

Resolved the issue:

Please reinstall spapros to get the latest version: pip install spapros, or pip install spapros==0.1.1.
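
To confirm that the environment actually picked up the fixed release, something like the following sketch could be used. The helper names release_tuple and has_fixed_spapros are hypothetical, and the version parsing is an assumption that only handles plain dotted integers such as 0.1.1 (no dev or rc suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Turn '0.1.1' into (0, 1, 1); only plain dotted integer versions are handled."""
    return tuple(int(part) for part in version_string.split("."))


def has_fixed_spapros(minimum="0.1.1"):
    """Return True if the installed spapros is at least `minimum`, else False."""
    try:
        installed = version("spapros")
    except PackageNotFoundError:
        # spapros is not installed in this environment at all.
        return False
    return release_tuple(installed) >= release_tuple(minimum)
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "0.10.0" sorting before "0.9.9".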
