uns value with zero length value raises exception in from_anndata #71

Closed
Tracked by #113
bkmartinjr opened this issue May 11, 2022 · 4 comments
Labels
blocked bug Something isn't working

Comments

@bkmartinjr
Member

Calling from_anndata raises a TileDBError when uns contains a zero-length string value.

In [5]: ad
Out[5]: 
AnnData object with n_obs × n_vars = 16245 × 32812
    obs: 'nCount_RNA', 'nFeature_RNA', 'nCount_SCT', 'nFeature_SCT', 'tissue_ontology_term_id', 'assay_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'author_cell_type', 'development_stage_ontology_term_id', 'ethnicity_ontology_term_id', 'sex_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'ethnicity', 'development_stage'
    var: 'feature_biotype', 'feature_is_filtered', 'feature_name', 'feature_reference'
    uns: 'X_normalization', 'genome_annotation_version', 'layer_descriptions', 'preprint_doi', 'publication_doi', 'schema_version', 'title'
    obsm: 'X_pca', 'X_tsne'

In [6]: ad.uns['preprint_doi']
Out[6]: ''

And the ingest log, ending in the stack trace:

AnnData loaded: AnnData object with n_obs × n_vars = 16245 × 32812
    obs: 'nCount_RNA', 'nFeature_RNA', 'nCount_SCT', 'nFeature_SCT', 'tissue_ontology_term_id', 'assay_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'author_cell_type', 'development_stage_ontology_term_id', 'ethnicity_ontology_term_id', 'sex_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'ethnicity', 'development_stage'
    var: 'feature_biotype', 'feature_is_filtered', 'feature_name', 'feature_reference'
    uns: 'X_normalization', 'genome_annotation_version', 'layer_descriptions', 'preprint_doi', 'publication_doi', 'schema_version', 'title'
    obsm: 'X_pca', 'X_tsne'
START  SOMA.from_ann
  START  DECATEGORICALIZING
  FINISH DECATEGORICALIZING TIME 0.053
  START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/X/data from <class 'scipy.sparse._csr.csr_matrix'>
      START  __ingest_coo_data_string_dims_rows_chunked
        START  chunk rows 0..9382 of 16245, obs_ids Cell_1..Cell_3821, nnz=10000976,  57.753%
        FINISH chunk TIME 12.674
        START  chunk rows 9383..16245 of 16245, obs_ids Cell_3822..Cell_9999, nnz=7756069, 100.000%
        FINISH chunk TIME 9.944
      FINISH __ingest_coo_data_string_dims_rows_chunked TIME 23.527
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/X/data TIME 23.570
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obs
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obs TIME 0.160
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/var
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/var TIME 0.075
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obsm/X_pca
    Annotation matrix obsm/X_pca has shape (16245, 50)
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obsm/X_pca TIME 0.325
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obsm/X_tsne
    Annotation matrix obsm/X_tsne has shape (16245, 2)
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/obsm/X_tsne TIME 0.177
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw/X/data from <class 'scipy.sparse._csr.csr_matrix'>
      START  __ingest_coo_data_string_dims_rows_chunked
        START  chunk rows 0..9382 of 16245, obs_ids Cell_1..Cell_3821, nnz=10000976,  57.753%
        FINISH chunk TIME 12.246
        START  chunk rows 9383..16245 of 16245, obs_ids Cell_3822..Cell_9999, nnz=7756069, 100.000%
        FINISH chunk TIME 9.953
      FINISH __ingest_coo_data_string_dims_rows_chunked TIME 23.113
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw/X/data TIME 23.151
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw/var
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw/var TIME 0.074
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/raw TIME 23.232
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns
      START  WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/X_normalization
      FINISH WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/X_normalization TIME 0.021
      START  WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/genome_annotation_version
      FINISH WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/genome_annotation_version TIME 0.022
    START  WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions
      START  WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions/X
      FINISH WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions/X TIME 0.020
      START  WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions/raw.X
      FINISH WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions/raw.X TIME 0.020
    FINISH WRITING /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/layer_descriptions TIME 0.044
      START  WRITING NUMPY /home/bruce/corpora-data-prod/45f3ac43-6466-4aae-886f-388d9a9014b1/local.soma/uns/preprint_doi
Traceback (most recent call last):
  File "/home/bruce/TileDB-SingleCell/apis/python/temp/ingest_sweep.py", line 118, in <module>
    sys.exit(main())
  File "/home/bruce/TileDB-SingleCell/apis/python/temp/ingest_sweep.py", line 43, in main
    sc.from_anndata(ad)
  File "/home/bruce/TileDB-SingleCell/apis/python/src/tiledbsc/soma.py", line 86, in from_anndata
    self.write_tiledb_group(anndata)
  File "/home/bruce/TileDB-SingleCell/apis/python/src/tiledbsc/soma.py", line 326, in write_tiledb_group
    self.write_uns_group(uns_group_uri, anndata.uns)
  File "/home/bruce/TileDB-SingleCell/apis/python/src/tiledbsc/soma.py", line 526, in write_uns_group
    util.numpyable_object_to_tiledb_array(value, component_uri, self.ctx)
  File "/home/bruce/TileDB-SingleCell/apis/python/src/tiledbsc/util.py", line 184, in numpyable_object_to_tiledb_array
    _write_numpy_ndarray_to_tiledb_array(arr, uri, ctx)
  File "/home/bruce/TileDB-SingleCell/apis/python/src/tiledbsc/util.py", line 197, in _write_numpy_ndarray_to_tiledb_array
    tiledb.from_numpy(uri=uri, array=arr, ctx=ctx)
  File "/home/bruce/TileDB-SingleCell/apis/python/venv/lib/python3.9/site-packages/tiledb/highlevel.py", line 91, in from_numpy
    return tiledb.DenseArray.from_numpy(uri, array, ctx=_get_ctx(ctx, config), **kwargs)
  File "tiledb/libtiledb.pyx", line 4157, in tiledb.libtiledb.DenseArrayImpl.from_numpy
  File "tiledb/libtiledb.pyx", line 4160, in tiledb.libtiledb.DenseArrayImpl.from_numpy
  File "tiledb/libtiledb.pyx", line 4478, in tiledb.libtiledb.DenseArrayImpl.__setitem__
  File "tiledb/libtiledb.pyx", line 4558, in tiledb.libtiledb.DenseArrayImpl._setitem_impl
  File "tiledb/libtiledb.pyx", line 483, in tiledb.libtiledb._write_array
  File "tiledb/libtiledb.pyx", line 526, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 511, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::Filter] Error: FilterBuffer error; cannot init buffer: nullptr given.
@johnkerl johnkerl self-assigned this May 11, 2022
@johnkerl johnkerl added bug Something isn't working active blocked labels May 11, 2022
@johnkerl
Member

This is a TileDB-Py bug, triggered when every element of the array is the empty string:

>>> tiledb.from_numpy('test1', numpy.asarray(['a','b','c'], dtype='O'))
DenseArray(uri='test1', mode=r, ndim=1)

>>> tiledb.from_numpy('test2', numpy.asarray(['a','b',''], dtype='O'))
DenseArray(uri='test2', mode=r, ndim=1)

>>> tiledb.from_numpy('test3', numpy.asarray(['a'], dtype='O'))
DenseArray(uri='test3', mode=r, ndim=1)

>>> tiledb.from_numpy('test4', numpy.asarray([''], dtype='O'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tiledb/highlevel.py", line 91, in from_numpy
    return tiledb.DenseArray.from_numpy(uri, array, ctx=_get_ctx(ctx, config), **kwargs)
  File "tiledb/libtiledb.pyx", line 4174, in tiledb.libtiledb.DenseArrayImpl.from_numpy
  File "tiledb/libtiledb.pyx", line 4177, in tiledb.libtiledb.DenseArrayImpl.from_numpy
  File "tiledb/libtiledb.pyx", line 4495, in tiledb.libtiledb.DenseArrayImpl.__setitem__
  File "tiledb/libtiledb.pyx", line 4575, in tiledb.libtiledb.DenseArrayImpl._setitem_impl
  File "tiledb/libtiledb.pyx", line 483, in tiledb.libtiledb._write_array
  File "tiledb/libtiledb.pyx", line 526, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 511, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::Filter] Error: FilterBuffer error; cannot init buffer: nullptr given.
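
The four cases above suggest the write fails only when every element is the empty string. As a rough sketch of a guard a caller could run before handing data to tiledb.from_numpy, the helper below detects that condition. The name `is_all_empty_strings` is hypothetical and not part of tiledbsc or TileDB-Py, and the behavior for a zero-element input is an assumption, since that case was not tried in this thread.

```python
def is_all_empty_strings(values):
    """Return True if `values` is non-empty and every element is the empty string.

    Hypothetical helper for illustration; not part of tiledbsc or TileDB-Py.
    """
    # A bare string such as uns['preprint_doi'] == '' is treated as a
    # one-element list, so it is not iterated character by character.
    items = [values] if isinstance(values, str) else list(values)
    # Assumption: a zero-element input is reported as False, because that
    # case was not exercised in the reproduction above.
    return len(items) > 0 and all(
        isinstance(item, str) and item == "" for item in items
    )


# The same four inputs as the reproduction above.
print(is_all_empty_strings(["a", "b", "c"]))  # False
print(is_all_empty_strings(["a", "b", ""]))   # False
print(is_all_empty_strings(["a"]))            # False
print(is_all_empty_strings([""]))             # True
```

Only the last input, the one that raised, is flagged. What the caller does once the guard fires (skip the key, store it some other way) is left open here.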

@johnkerl
Member

This also calls for a new unit-test case in tiledbsc-py.
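
One possible shape for such a test case, as a sketch only: it replays the four inputs from the reproduction through tiledb.from_numpy and expects none of them to raise. The names `STRING_CASES`, `write_string_array`, and `check_from_numpy_string_cases` are made up for illustration, and the actual tiledbsc test layout may differ.

```python
import os
import tempfile

# Inputs taken from the reproduction in this thread; the last one is the
# case that raised when the issue was filed.
STRING_CASES = [
    ["a", "b", "c"],
    ["a", "b", ""],
    ["a"],
    [""],
]


def write_string_array(values, uri):
    """Write `values` as an object-dtype array, mirroring the reproduction."""
    # Imported lazily so the case table can be loaded without TileDB installed.
    import numpy
    import tiledb

    arr = tiledb.from_numpy(uri, numpy.asarray(values, dtype="O"))
    arr.close()


def check_from_numpy_string_cases():
    """Every case should write without raising an exception."""
    for index, values in enumerate(STRING_CASES):
        with tempfile.TemporaryDirectory() as tmpdir:
            write_string_array(values, os.path.join(tmpdir, f"case{index}"))
```

Calling `check_from_numpy_string_cases()` with TileDB-Py installed would be expected to raise on the last case for as long as the underlying bug is present.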

@johnkerl
Member

@johnkerl johnkerl mentioned this issue May 24, 2022
61 tasks
@johnkerl
Member

johnkerl commented Jul 5, 2022

Resolved now that scalars are saved as uns metadata.
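
To illustrate the idea behind that resolution, here is a minimal sketch of separating scalar uns values from everything else, so that scalars can be stored as metadata and never reach tiledb.from_numpy. This is not the tiledbsc implementation: the function name `split_uns_values` and the set of types treated as scalar are assumptions, and the example values other than the empty `preprint_doi` are placeholders.

```python
def split_uns_values(uns):
    """Split a uns-like dict into (scalars, others).

    Hypothetical helper for illustration; not the tiledbsc implementation.
    """
    # Assumption: plain Python str/int/float/bool count as scalars.
    scalar_types = (str, int, float, bool)
    scalars, others = {}, {}
    for key, value in uns.items():
        if isinstance(value, scalar_types):
            # Would be stored as metadata, so a zero-length string is fine.
            scalars[key] = value
        else:
            # Nested dicts, arrays, etc. would still be written as before.
            others[key] = value
    return scalars, others


# 'preprint_doi' is the empty string from this issue; the nested
# 'layer_descriptions' value is a placeholder.
scalars, others = split_uns_values(
    {"preprint_doi": "", "layer_descriptions": {"X": "placeholder"}}
)
print(scalars)  # {'preprint_doi': ''}
print(others)   # {'layer_descriptions': {'X': 'placeholder'}}
```

With this split, the zero-length `preprint_doi` lands in the scalar bucket and never goes through the array-writing path that raised.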

@johnkerl johnkerl closed this as completed Jul 5, 2022