bm.plot_results_table() fails when all batch correction metrics are set to False #157

soerenab · 2024-03-20T11:23:12Z

Report

I only want to run the bio conservation metrics, so I have initialized the Benchmarker as follows:

batch_correction = BatchCorrection(
    silhouette_batch=False,
    ilisi_knn=False,
    kbet_per_label=False,
    graph_connectivity=False,
    pcr_comparison=False
)

bio_conservation = BioConservation(
    isolated_labels=True,
    nmi_ari_cluster_labels_leiden=False,
    nmi_ari_cluster_labels_kmeans=True,
    silhouette_label=True,
    clisi_knn=True
)

bm = Benchmarker(
    adata,
    batch_key="dummy_batch",
    label_key="cell_type",
    embedding_obsm_keys=embedding_obsm_keys,
    batch_correction_metrics=batch_correction,
    bio_conservation_metrics=bio_conservation,
    n_jobs=6,
)

bm.benchmark() then runs fine. However, when I try to plot the results with bm.plot_results_table(), I get the error below:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/pandas/core/indexes/base.py:3802, in Index.get_loc(self, key, method, tolerance)
   3801 try:
-> 3802     return self._engine.get_loc(casted_key)
   3803 except KeyError as err:

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Batch correction'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[11], line 1
----> 1 bm.plot_results_table()

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/scib_metrics/benchmark/_core.py:287, in Benchmarker.plot_results_table(self, min_max_scale, show, save_dir)
    285 num_embeds = len(self._embedding_obsm_keys)
    286 cmap_fn = lambda col_data: normed_cmap(col_data, cmap=matplotlib.cm.PRGn, num_stds=2.5)
--> 287 df = self.get_results(min_max_scale=min_max_scale)
    288 # Do not want to plot what kind of metric it is
    289 plot_df = df.drop(_METRIC_TYPE, axis=0)

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/scib_metrics/benchmark/_core.py:266, in Benchmarker.get_results(self, min_max_scale, clean_names)
    264 per_class_score = df.groupby(_METRIC_TYPE).mean().transpose()
    265 # This is the default scIB weighting from the manuscript
--> 266 per_class_score["Total"] = 0.4 * per_class_score["Batch correction"] + 0.6 * per_class_score["Bio conservation"]
    267 df = pd.concat([df.transpose(), per_class_score], axis=1)
    268 df.loc[_METRIC_TYPE, per_class_score.columns] = _AGGREGATE_SCORE

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/pandas/core/frame.py:3807, in DataFrame.__getitem__(self, key)
   3805 if self.columns.nlevels > 1:
   3806     return self._getitem_multilevel(key)
-> 3807 indexer = self.columns.get_loc(key)
   3808 if is_integer(indexer):
   3809     indexer = [indexer]

File ~/miniconda3/envs/contrastive-transformer/lib/python3.10/site-packages/pandas/core/indexes/base.py:3804, in Index.get_loc(self, key, method, tolerance)
   3802     return self._engine.get_loc(casted_key)
   3803 except KeyError as err:
-> 3804     raise KeyError(key) from err
   3805 except TypeError:
   3806     # If we have a listlike key, _check_indexing_error will raise
   3807     #  InvalidIndexError. Otherwise we fall through and re-raise
   3808     #  the TypeError.
   3809     self._check_indexing_error(key)

KeyError: 'Batch correction'

My guess is that the following line fails because I did not run any batch integration metrics:
per_class_score["Total"] = 0.4 * per_class_score["Batch correction"] + 0.6 * per_class_score["Bio conservation"]
so per_class_score probably has no "Batch correction" column.

It would be great to be able to plot the results even if I did not run any batch integration metrics. A potential solution in this case could be to simply not compute (and plot) the total score, as it probably does not make sense anyway, or to set per_class_score["Batch correction"] to per_class_score["Bio conservation"] so that the final score is simply the bio conservation score.
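The first option could look roughly like the sketch below. It is only an illustration of the idea, not the scib_metrics implementation: add_total and WEIGHTS are hypothetical names, and the toy scores are made up. The weights are the 0.4 and 0.6 from the failing line.

```python
import pandas as pd

# Hypothetical helper, not part of scib_metrics.
# Weights are taken from the failing line in get_results.
WEIGHTS = {"Batch correction": 0.4, "Bio conservation": 0.6}


def add_total(per_class_score: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a "Total" column, added only if both classes exist."""
    out = per_class_score.copy()
    if all(name in out.columns for name in WEIGHTS):
        out["Total"] = sum(w * out[name] for name, w in WEIGHTS.items())
    return out


# Toy per-class scores (made-up numbers), one row per embedding.
bio_only = pd.DataFrame({"Bio conservation": [0.7, 0.5]}, index=["X_pca", "X_scvi"])
both = bio_only.assign(**{"Batch correction": [0.2, 0.9]})

print(add_total(bio_only).columns.tolist())  # no "Total" column is added
print(add_total(both)["Total"].tolist())     # weighted total per embedding
```

With this approach the table would simply lack the Total column when one metric class is disabled, so the plotting code would also need to handle a missing column.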

Version information

anndata 0.10.5
numpy 1.26.4
pandas 1.5.3
scanpy 1.9.8
scib 1.1.4
scib_metrics 0.5.1
session_info 1.0.0

PIL 10.2.0
absl NA
anyio NA
arrow 1.3.0
asttokens NA
attr 23.2.0
attrs 23.2.0
babel 2.14.0
brotli 1.1.0
certifi 2024.02.02
cffi 1.16.0
charset_normalizer 3.3.2
chex 0.1.85
cloudpickle 3.0.0
colorama 0.4.6
comm 0.2.1
cycler 0.12.1
cython_runtime NA
cytoolz 0.12.3
dask 2024.2.0
dateutil 2.8.2
debugpy 1.8.1
decorator 5.1.1
defusedxml 0.7.1
deprecated 1.2.14
exceptiongroup 1.2.0
executing 2.0.1
fastjsonschema NA
fqdn NA
google NA
h5py 3.10.0
idna 3.6
igraph 0.11.4
importlib_metadata NA
ipykernel 6.29.2
ipywidgets 8.1.2
isoduration NA
jax 0.4.25
jaxlib 0.4.25
jedi 0.19.1
jinja2 3.1.3
joblib 1.3.2
json5 NA
jsonpointer 2.4
jsonschema 4.21.1
jsonschema_specifications NA
jupyter_events 0.9.0
jupyter_server 2.12.5
jupyterlab_server 2.25.3
kiwisolver 1.4.5
leidenalg 0.10.2
llvmlite 0.42.0
lz4 4.3.3
markupsafe 2.1.5
matplotlib 3.8.3
ml_dtypes 0.3.2
mpl_toolkits NA
natsort 8.4.0
nbformat 5.9.2
numba 0.59.0
opt_einsum v3.3.0
overrides NA
packaging 23.2
parso 0.8.3
patsy 0.5.6
pickleshare 0.7.5
platformdirs 4.2.0
plottable 0.1.5
prometheus_client NA
prompt_toolkit 3.0.42
psutil 5.9.8
pure_eval 0.2.2
pyarrow 15.0.0
pycparser 2.21
pydev_ipython NA
pydevconsole NA
pydevd 2.9.5
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pydot 2.0.0
pygments 2.17.2
pynndescent 0.5.11
pyparsing 3.1.1
pythonjsonlogger NA
pytz 2024.1
referencing NA
requests 2.31.0
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
rich NA
rpds NA
scipy 1.12.0
seaborn 0.13.2
send2trash NA
six 1.16.0
sklearn 1.4.1.post1
sniffio 1.3.0
socks 1.7.1
stack_data 0.6.2
statsmodels 0.14.1
tblib 3.0.0
texttable 1.7.0
threadpoolctl 3.3.0
tlz 0.12.3
toolz 0.12.1
torch 2.2.1
torchgen NA
tornado 6.4
tqdm 4.66.2
traitlets 5.14.1
typing_extensions NA
umap 0.5.5
uri_template NA
urllib3 2.2.1
wcwidth 0.2.13
webcolors 1.13
websocket 1.7.0
wrapt 1.16.0
yaml 6.0.1
zipp NA
zmq 25.1.2
zoneinfo NA

IPython 8.22.0
jupyter_client 8.6.0
jupyter_core 5.7.1
jupyterlab 4.1.2
notebook 7.1.0

Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
Linux-4.18.0-513.11.1.el8_9.x86_64-x86_64-with-glibc2.28

Session information updated at 2024-03-20 12:13


martinkim0 · 2024-03-20T16:02:54Z

Thanks for bringing this up! I'll take a look at this

LArnoldt · 2024-08-18T16:09:34Z

Hey @martinkim0 - Thank you for this great package!

The issue described above also applies to the function get_results in _core.py, since the function can't calculate:

per_class_score["Total"] = 0.4 * per_class_score["Batch correction"] + 0.6 * per_class_score["Bio conservation"]

because either "Batch correction" or "Bio conservation" is missing from the per_class_score DataFrame when all metrics of that type are disabled.
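The missing column can be reproduced with plain pandas. This is a toy stand-in for the results table, not the real one: the "Metric Type" column name, the metric names, and the scores are assumptions for illustration.

```python
import pandas as pd

# Toy results table: one row per metric, one column per embedding,
# plus a label column saying which class each metric belongs to.
# Only bio conservation metrics are present, as in the report above.
df = pd.DataFrame(
    {
        "X_pca": [0.6, 0.8],
        "X_scvi": [0.7, 0.9],
        "Metric Type": ["Bio conservation", "Bio conservation"],
    },
    index=["Isolated labels", "Silhouette label"],
)

# Same shape of computation as in the traceback: average per class, then transpose.
per_class_score = df.groupby("Metric Type").mean().transpose()
print(per_class_score.columns.tolist())

try:
    per_class_score["Batch correction"]
except KeyError as err:
    # groupby only creates columns for classes that actually occur in the data
    print("KeyError:", err)
```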

adamgayoso · 2024-08-21T20:59:55Z

Hi @LArnoldt -- we are happy to accept a pull request to fix this.

Perhaps we can add a helper function to the batch correction and bio conservation dataclasses that counts how many metrics are active. This can then be used to control the plotting code and the total score.
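A minimal sketch of such a helper, assuming the metric flags are plain booleans. The dataclass below is a stand-in that only mirrors the flag names from the report, not the actual scib_metrics class, and n_active is a hypothetical name.

```python
from dataclasses import dataclass, fields


# Stand-in dataclass mirroring the flag names used in this issue;
# it is not the scib_metrics BatchCorrection class.
@dataclass(frozen=True)
class BatchCorrection:
    silhouette_batch: bool = True
    ilisi_knn: bool = True
    kbet_per_label: bool = True
    graph_connectivity: bool = True
    pcr_comparison: bool = True


def n_active(metrics) -> int:
    """Count the dataclass fields whose value is truthy (hypothetical helper)."""
    return sum(bool(getattr(metrics, f.name)) for f in fields(metrics))


all_on = BatchCorrection()
all_off = BatchCorrection(
    silhouette_batch=False,
    ilisi_knn=False,
    kbet_per_label=False,
    graph_connectivity=False,
    pcr_comparison=False,
)

print(n_active(all_on), n_active(all_off))
```

get_results and plot_results_table could then skip a metric class, and the total score, whenever its count is zero.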

SidSouthekal-Lilly · 2024-09-06T17:45:46Z

Same issue with get_results() and plot_results_table() if either Bio conservation or Batch correction is set to False. Any fix yet?

LArnoldt · 2024-09-06T21:18:54Z

Hi @adamgayoso @SidSouthekal-Lilly - please see PR #179.

soerenab added the "bug" (Something isn't working) label Mar 20, 2024

martinkim0 self-assigned this Mar 20, 2024

LArnoldt mentioned this issue Sep 6, 2024

Make get_results work, if either no BatchCorrection or BioConservation metrics. #179

Open

adamgayoso mentioned this issue Nov 3, 2024

Change behavior of None default metrics and allow results/plotting to work with bio/batch separately. #181

Merged

adamgayoso closed this as completed in #181 Nov 4, 2024
