ValueError: No objects to concatenate #52

Closed
nick-youngblut opened this issue Mar 21, 2021 · 4 comments

@nick-youngblut

Running instrain on 272 metagenomes:

Scaffold to bin was made using .stb file
***************************************************
    ..:: inStrain compare Step 1. Load data ::..
***************************************************

Loading Profiles into RAM: 100%|██████████| 272/272 [05:25<00:00,  1.20s/it]
36 of 36 scaffolds are in at least 2 samples
***************************************************
..:: inStrain compare Step 2. Run comparisons ::..
***************************************************

Running group 1 of 1
Comparing scaffolds: 100%|██████████| 36/36 [94:23:25<00:00, 9439.05s/it]
***************************************************
..:: inStrain compare Step 3. Auxiliary processing ::..
***************************************************

Cannot cluster genome AZ482__metabat2__Low_004.fna; 102 of 35511 comaprisons involve no genomic overlap at all: see log for more
Could not cluster genomes; heres a traceback:
Traceback (most recent call last):
  File "/ebio/abt3_projects/Anxiety_Twins_Metagenomes/bin/llmgps/.snakemake/conda/d20af029/lib/python3.8/site-packages/inStrain/compare_controller.py", line 356, in run_genome_clustering
    Cdb = inStrain.compare_utils.cluster_genome_strains(Mdb, kwargs)
  File "/ebio/abt3_projects/Anxiety_Twins_Metagenomes/bin/llmgps/.snakemake/conda/d20af029/lib/python3.8/site-packages/inStrain/compare_utils.py", line 234, in cluster_genome_strains
    return pd.concat(cdbs).reset_index(drop=True)
  File "/ebio/abt3_projects/Anxiety_Twins_Metagenomes/bin/llmgps/.snakemake/conda/d20af029/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 285, in concat
    op = _Concatenator(
  File "/ebio/abt3_projects/Anxiety_Twins_Metagenomes/bin/llmgps/.snakemake/conda/d20af029/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 342, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
***************************************************
..:: inStrain compare Step 4. Store results ::..
***************************************************

My conda env:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
asteval                   0.9.22             pyhd8ed1ab_0    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
biopython                 1.74             py38h516909a_0    conda-forge
boost                     1.70.0           py38h9de70de_1    conda-forge
boost-cpp                 1.70.0               h7b93d67_3    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h36c2ea0_0    conda-forge
ca-certificates           2021.1.19            h06a4308_0
cached-property           1.5.2                      py_0
capnproto                 0.6.1                hfc679d8_1    conda-forge
certifi                   2020.12.5        py38h578d9bd_1    conda-forge
cycler                    0.10.0                     py_2    conda-forge
decorator                 4.4.2                      py_0    conda-forge
drep                      3.0.1                      py_0    bioconda
fastani                   1.32                 he1c1bb9_0    bioconda
freetype                  2.10.4               h0708190_1    conda-forge
future                    0.18.2           py38h578d9bd_3    conda-forge
gsl                       2.6                  he838d99_2    conda-forge
h5py                      3.1.0           nompi_py38hafa665b_100    conda-forge
hdf5                      1.10.6          nompi_h6a2412b_1114    conda-forge
htslib                    1.11                 hd3b49d5_1    bioconda
icu                       67.1                 he1b5a44_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
instrain                  1.5.1                      py_0    bioconda
joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h516909a_0    conda-forge
kiwisolver                1.3.1            py38h1fd1430_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcurl                   7.71.1               hcdd3856_8    conda-forge
libdeflate                1.6                  h516909a_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgfortran-ng            9.3.0               hff62375_18    conda-forge
libgfortran5              9.3.0               hff62375_18    conda-forge
libgomp                   9.3.0               h2828fa1_18    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libllvm10                 10.0.1               he513fc3_3    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               hed695b0_2    conda-forge
libssh2                   1.9.0                hab1572f_5    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libwebp-base              1.2.0                h7f98852_0    conda-forge
llvmlite                  0.35.0           py38h4630a5e_1    conda-forge
lmfit                     1.0.2              pyhd8ed1ab_0    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
mash                      2.2.2                ha61e061_2    bioconda
matplotlib-base           3.3.4            py38h0efea84_0    conda-forge
more-itertools            8.7.0              pyhd8ed1ab_0    conda-forge
mummer4                   4.0.0rc1        pl526he1b5a44_0    bioconda
ncurses                   6.2                  h58526e2_4    conda-forge
networkx                  2.5                        py_0    conda-forge
numba                     0.52.0           py38h51da96c_0    conda-forge
numpy                     1.20.1           py38h18fd61f_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openssl                   1.1.1j               h7f98852_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.2.2            py38h51da96c_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
perl                      5.26.2            h36c2ea0_1008    conda-forge
pillow                    8.1.0            py38ha0e1e83_2    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
pluggy                    0.13.1           py38h578d9bd_4    conda-forge
prodigal                  2.6.3                h516909a_2    bioconda
psutil                    5.8.0            py38h497a2fe_1    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pysam                     0.16.0.1         py38hbdc2ae9_1    bioconda
pytest                    6.2.2            py38h578d9bd_0    conda-forge
python                    3.8.6           hffdb5ce_5_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.8                      1_cp38    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
readline                  8.1                  h27cfd23_0
samtools                  1.11                 h6270b1f_0    bioconda
scikit-learn              0.24.1           py38h658cfdd_0    conda-forge
scipy                     1.6.0            py38hb2138dd_0    conda-forge
seaborn                   0.11.1               ha770c72_0    conda-forge
seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
setuptools                52.0.0           py38h06a4308_0
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h74cdb3f_0    conda-forge
statsmodels               0.12.2           py38h5c078b8_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
tk                        8.6.10               hed695b0_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tornado                   6.1              py38h497a2fe_1    conda-forge
tqdm                      4.56.2             pyhd8ed1ab_0    conda-forge
uncertainties             3.1.5              pyhd8ed1ab_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.8                ha95c52a_1    conda-forge
@MrOlm
Owner

MrOlm commented Mar 22, 2021

Hi Nick,

Thanks for the detailed bug report. What's the command that was run that made this error? If you're using an .stb file and comparing lots of genomes, using --database_mode with inStrain compare will make sure that only genomes that are present in a sample are included in the clustering.

-Matt

@nick-youngblut
Author

What's the command that was run that made this error?

inStrain compare --min_cov 5 --min_freq 0.05 --ani_threshold 0.99999 -p 8 -s $GENOME_stb --genome $GENOME_fasta -o $OUTDIR -i $INDIRS

will make sure that only genomes that are present in a sample are included in the clustering

That would require more work than just asking for forgiveness when doing it wrong. Moreover, what constitutes "present", given the possibility of false positives/negatives when assessing whether a MAG is present in a sample via coverage?


What about editing the code so that cluster_genome_strains() returns None if pd.concat() raises a ValueError? That is, change:

return pd.concat(cdbs).reset_index(drop=True)

to:

try:
    df = pd.concat(cdbs).reset_index(drop=True)
except ValueError:
    df = None
return df

So if cluster_genome_strains() returns None, inStrain compare can generate either no output or blank output.
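
As a rough, standalone illustration of the suggestion above (not the actual inStrain code), here is a minimal sketch. The helper names `concat_or_none` and `as_blank_if_none` and the `genome` column are hypothetical and used only for this example; the only real behavior relied on is that `pd.concat` raises `ValueError` when given an empty list.

```python
import pandas as pd


def concat_or_none(cdbs):
    """Hypothetical helper: concatenate per-genome cluster tables,
    or return None when there is nothing to concatenate."""
    try:
        return pd.concat(cdbs).reset_index(drop=True)
    except ValueError:
        # pd.concat([]) raises ValueError("No objects to concatenate")
        return None


def as_blank_if_none(cdb):
    """Hypothetical caller-side handling: fall back to a blank table."""
    return pd.DataFrame() if cdb is None else cdb


# No genome could be clustered, so the list of tables is empty
empty_result = concat_or_none([])

# At least one genome was clustered ('genome' is an illustrative column name)
full_result = concat_or_none(
    [pd.DataFrame({"genome": ["a"]}), pd.DataFrame({"genome": ["b"]})]
)
```

The caller can then choose between skipping the output entirely (when the result is None) and writing a blank table via something like `as_blank_if_none`.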

@MrOlm
Owner

MrOlm commented Mar 22, 2021

Sure, I'll implement this, make sure it passes tests, and let you know when it's up and running.

@MrOlm
Owner

MrOlm commented Mar 26, 2021

Implemented in v1.5.3

@MrOlm MrOlm closed this as completed Mar 26, 2021