Rank genes groups wilcoxon test fails for more than 10 million cells #3377

Open
3 tasks done
abs51295 opened this issue Nov 19, 2024 · 0 comments
Labels
Bug 🐛 Triage 🩺 This issue needs to be triaged by a maintainer

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

Running rank_genes_groups with method='wilcoxon' on a dataset with more than 10 million cells fails because CONST_MAX_SIZE is hardcoded:

CONST_MAX_SIZE = 10000000

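I tried it. With more than 10 million cells, floor(CONST_MAX_SIZE / n_cells) evaluates to 0, and that zero is passed to range() as the step. Here's the error: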

     72 # Calculate chunk frames
     73 max_chunk = floor(CONST_MAX_SIZE / n_cells)
---> 75 for left in range(0, n_genes, max_chunk):
     76     right = min(left + max_chunk, n_genes)
     78     df = pd.DataFrame(data=get_chunk(X, left, right))

ValueError: range() arg 3 must not be zero
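
For illustration, here is a minimal sketch of the arithmetic behind the failure, together with one possible guard. It is not scanpy's actual code or a maintainer-approved fix: chunk_bounds is a hypothetical helper, the cell and gene counts are made up, and clamping the chunk size to at least one gene is only an assumption about how the zero step could be avoided.

```python
from math import floor

CONST_MAX_SIZE = 10000000  # value quoted in the issue


def chunk_bounds(n_cells, n_genes, max_size=CONST_MAX_SIZE):
    """Hypothetical helper: return (left, right) gene-column bounds per chunk."""
    # Clamp to at least one gene per chunk so range() never receives a zero step.
    max_chunk = max(1, floor(max_size / n_cells))
    return [
        (left, min(left + max_chunk, n_genes))
        for left in range(0, n_genes, max_chunk)
    ]


# Unguarded arithmetic from the traceback: above 10 million cells the step is 0.
n_cells = 12_000_000  # illustrative cell count above the limit
assert floor(CONST_MAX_SIZE / n_cells) == 0

# With the clamp, chunking falls back to one gene at a time.
print(chunk_bounds(n_cells, n_genes=3))
```

Note that one gene per chunk still materializes n_cells values at a time, so this guard only avoids the crash; it does not keep each chunk under CONST_MAX_SIZE elements.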

Minimal code sample

sc.tl.rank_genes_groups(adata=adata, groupby='louvain', method='wilcoxon', pts=True, use_raw=False)

Error output

Versions

-----
anndata     0.11.1
scanpy      1.10.4
-----
PIL                         11.0.0
anyio                       NA
arrow                       1.3.0
asttokens                   NA
attr                        24.2.0
attrs                       24.2.0
babel                       2.16.0
backports                   NA
brotli                      1.1.0
cachetools                  5.5.0
certifi                     2024.08.30
cffi                        1.17.1
charset_normalizer          3.4.0
click                       8.1.7
cloudpickle                 3.1.0
colorama                    0.4.6
comm                        0.2.2
cuda                        0+untagged.302.g4a12ae2.dirty
cudf                        24.10.01
cugraph                     24.10.00
cuml                        24.10.00
cupy                        13.3.0
cupy_backends               NA
cupyx                       NA
cycler                      0.12.1
cython_runtime              NA
cytoolz                     1.0.0
dask                        2024.9.0
dask_cuda                   24.10.00
dask_cudf                   24.10.01
dask_expr                   1.1.14
dateutil                    2.9.0.post0
debugpy                     1.8.8
decorator                   5.1.1
defusedxml                  0.7.1
distributed                 2024.9.0
executing                   2.1.0
fastjsonschema              NA
fastrlock                   0.8.2
fqdn                        NA
fsspec                      2024.10.0
google                      NA
h5py                        3.12.1
idna                        3.10
igraph                      0.11.6
ipykernel                   6.29.5
isoduration                 NA
jaraco                      NA
jedi                        0.19.2
jinja2                      3.1.4
joblib                      1.4.2
json5                       0.9.28
jsonpointer                 3.0.0
jsonschema                  4.23.0
jsonschema_specifications   NA
jupyter_events              0.10.0
jupyter_server              2.14.2
jupyterlab_server           2.27.3
kiwisolver                  1.4.7
legacy_api_wrap             NA
leidenalg                   0.10.2
llvmlite                    0.43.0
locket                      NA
louvain                     0.8.2
lz4                         4.3.3
markupsafe                  3.0.2
matplotlib                  3.9.2
matplotlib_inline           0.1.7
more_itertools              10.5.0
mpl_toolkits                NA
msgpack                     1.1.0
natsort                     8.4.0
nbformat                    5.10.4
networkx                    3.4.2
numba                       0.60.0
numpy                       1.26.4
nvtx                        NA
overrides                   NA
packaging                   24.2
pandas                      2.2.2
parso                       0.8.4
patsy                       1.0.1
pickleshare                 0.7.5
pkg_resources               NA
platformdirs                4.3.6
prometheus_client           NA
prompt_toolkit              3.0.48
psutil                      6.1.0
pure_eval                   0.2.3
pyarrow                     17.0.0
pycparser                   2.22
pydev_ipython               NA
pydevconsole                NA
pydevd                      3.2.2
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.18.0
pylibcudf                   NA
pylibcugraph                24.10.00
pylibraft                   24.10.00
pynndescent                 0.5.13
pynvjitlink                 0.4.0
pynvml                      11.4.1
pyparsing                   3.2.0
pythonjsonlogger            NA
pytz                        2024.2
raft_dask                   24.10.00
rapids_dask_dependency      NA
rapids_singlecell           0.10.11
referencing                 NA
requests                    2.32.3
rfc3339_validator           0.1.4
rfc3986_validator           0.1.1
rmm                         24.10.00
rpds                        NA
scipy                       1.14.1
send2trash                  NA
session_info                1.0.0
six                         1.16.0
sklearn                     1.5.2
sniffio                     1.3.1
socks                       1.7.1
sortedcontainers            2.4.0
sparse                      0.15.4
stack_data                  0.6.2
statsmodels                 0.14.4
tblib                       3.0.0
texttable                   1.7.0
threadpoolctl               3.5.0
tlz                         1.0.0
toolz                       1.0.0
torch                       2.4.1.post300
torchgen                    NA
tornado                     6.4.1
tqdm                        4.67.0
traitlets                   5.14.3
treelite                    4.3.0
typing_extensions           NA
umap                        0.5.7
uri_template                NA
urllib3                     2.2.3
wcwidth                     0.2.13
webcolors                   24.8.0
websocket                   1.8.0
yaml                        6.0.2
zict                        3.0.0
zipp                        NA
zmq                         26.2.0
zoneinfo                    NA
zstandard                   0.23.0
-----
IPython             8.29.0
jupyter_client      8.6.3
jupyter_core        5.7.2
jupyterlab          4.2.6
notebook            7.2.2
-----
Python 3.11.10 | packaged by conda-forge | (main, Oct 16 2024, 01:27:36) [GCC 13.3.0]
Linux-4.18.0-348.el8.x86_64-x86_64-with-glibc2.28
-----
Session information updated at 2024-11-19 14:22
abs51295 added the Bug 🐛 and Triage 🩺 (This issue needs to be triaged by a maintainer) labels on Nov 19, 2024