[python] Support default-coord reads of corner-written dense ND arrays #1810

Merged
3 commits merged into main from kerl/dnda-expandable
Oct 20, 2023

Conversation

@johnkerl johnkerl commented Oct 19, 2023

Issue and/or context: In support of PR #1768 with issue #1769.

Needles to thread for update_uns:

The existing tiledbsoma.io.update_obs, tiledbsoma.io.update_var, and tiledbsoma.io.add_X_layer functions all take as their first argument a tiledbsoma.Experiment that is already open for writing. Goal: do the same for tiledbsoma.io.update_uns.

Common use-case:

  • Ingest data from AnnData to SOMA
  • Outgest to AnnData; run ScanPy, curation, etc
  • Write the updates back to SOMA

Technical factors:

  • Whereas X, obsm, varm, obsp, and varp are all SparseNDArray, uns can contain DenseNDArray objects, which are ingested from numpy.ndarray
  • If the after-uns has a numpy.ndarray that the before-uns didn't (new data on update), there is no problem
  • If the after-uns has a numpy.ndarray of the same shape as the one in the before-uns, there is no problem
  • Schema evolution does not support modification of dense domains, so a numpy.ndarray whose shape has changed cannot be updated in place
  • We can try delete-and-re-add:
    • tiledb.Array.delete_array of the array accompanied by tiledb-group operations to delete the key/URI pair and re-add the key/URI pair
    • Note that for local-disk and S3, the new URI will be the same as the old URI; for cloud the new and old URIs will always be different URIs
    • Problem with that: we cannot delete-and-re-add at the same timestamp
    • If we do del collection[key] followed by collection[key] = new_dense_nd_array, we get tiledbsoma._exception.SOMAError: replacing key 'a' is unsupported. That check must remain (see links below)
  • Also we cannot do: del collection[key], close the collection, and re-open at the same timestamp because of
  • We could have update_uns do two opens-for-write: one to remove any shape-changing dense arrays, and one shifted by a millisecond to do the rest of the work
    • However this would mean tiledbsoma.io.update_uns would take a URI as first argument, different from all the other updaters in tiledbsoma.io which take an Experiment opened for write as first argument
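To make the timestamp constraint above concrete, here is a toy model in plain Python. It is not tiledbsoma code: ToyCollection, ToyError, and replace_in_two_sessions are hypothetical names, and the refusal rule is only an approximation of the behavior described above (a key deleted in a write session cannot be re-added in that same session).

```python
class ToyError(Exception):
    """Hypothetical stand-in for tiledbsoma's SOMAError."""


class ToyCollection:
    """Toy key/value store whose writes are grouped into timestamped sessions."""

    def __init__(self):
        self._committed = {}              # state visible after close()
        self._pending_set = {}            # adds made in the current session
        self._pending_del = set()         # deletes made in the current session
        self._timestamp = None            # timestamp of the current session
        self.last_commit_timestamp = None

    def open_write(self, timestamp):
        self._timestamp = timestamp
        self._pending_set = {}
        self._pending_del = set()
        return self

    def __getitem__(self, key):
        return self._committed[key]

    def __delitem__(self, key):
        if key not in self._committed:
            raise KeyError(key)
        self._pending_del.add(key)

    def __setitem__(self, key, value):
        # A delete and an add of the same key cannot be ordered within one
        # timestamp, so the toy refuses the replacement outright.
        if key in self._pending_del or key in self._committed:
            raise ToyError(f"replacing key {key!r} is unsupported")
        self._pending_set[key] = value

    def close(self):
        for key in self._pending_del:
            self._committed.pop(key, None)
        self._committed.update(self._pending_set)
        self.last_commit_timestamp = self._timestamp


def replace_in_two_sessions(coll, key, value, timestamp):
    """Delete in one write session, re-add in a second one a millisecond later."""
    coll.open_write(timestamp)
    del coll[key]
    coll.close()
    coll.open_write(timestamp + 1)
    coll[key] = value
    coll.close()
```

The helper has to do its own opening and closing, which is why an update_uns built this way would need a URI rather than an already-open Experiment.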

Good feedback from @isaiah and @seth: simply have tiledbsoma.io's uns-ingestion logic create dense ND arrays with a bigger-than-necessary domain and write the data into the corner -- as we already do for sparse.

This is compliant with the SOMA specification.
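A rough sketch of the idea, in plain Python rather than tiledbsoma: the array is created with an oversized fixed domain, data is written at the (0, 0) corner, and a read with default coordinates returns only the written corner. ToyDenseArray, write_corner, and used_shape are hypothetical names for illustration only. How tiledbsoma tracks the written extent is not shown here.

```python
class ToyDenseArray:
    """Toy 2-D dense array with a fixed, oversized domain."""

    def __init__(self, capacity, fill=0):
        self.capacity = capacity          # (rows, cols); fixed at creation
        self._cells = [[fill] * capacity[1] for _ in range(capacity[0])]
        self.used_shape = (0, 0)          # extent of the most recent corner write

    def write_corner(self, rows):
        """Write a rectangular list of lists starting at coordinate (0, 0)."""
        nrows = len(rows)
        ncols = len(rows[0]) if rows else 0
        if nrows > self.capacity[0] or ncols > self.capacity[1]:
            raise ValueError("data exceeds the array's fixed domain")
        for i, row in enumerate(rows):
            if len(row) != ncols:
                raise ValueError("rows must all have the same length")
            self._cells[i][:ncols] = row
        self.used_shape = (nrows, ncols)

    def read(self, coords=None):
        """Read a (row_slice, col_slice) region; default is the written corner."""
        if coords is None:
            coords = (slice(0, self.used_shape[0]), slice(0, self.used_shape[1]))
        row_sel, col_sel = coords
        return [row[col_sel] for row in self._cells[row_sel]]


arr = ToyDenseArray(capacity=(8, 8))
arr.write_corner([[1, 2], [3, 4]])                     # first ingest: 2 x 2
arr.write_corner([[1, 2, 3], [4, 5, 6], [7, 8, 9]])    # update: 3 x 3, same array
corner = arr.read()                                    # default coords
```

Because the domain never changes, a shape-changing update is just another write, so no delete-and-re-add and no second timestamp are needed.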

Changes: As above.

Notes for Reviewer:

@johnkerl johnkerl requested review from nguyenv and ihnorton October 19, 2023 21:35
@johnkerl johnkerl force-pushed the kerl/dnda-expandable branch from 04ae50f to b7874fa Compare October 19, 2023 21:38
codecov-commenter commented Oct 19, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
apis/python/src/tiledbsoma/_dense_nd_array.py 95.83% <100.00%> (+0.37%) ⬆️
apis/python/src/tiledbsoma/_sparse_nd_array.py 92.25% <ø> (-0.15%) ⬇️
apis/python/src/tiledbsoma/_tdb_handles.py 96.27% <100.00%> (+0.20%) ⬆️
apis/python/src/tiledbsoma/_tiledb_array.py 84.76% <100.00%> (+1.26%) ⬆️

... and 87 files with indirect coverage changes

@johnkerl johnkerl requested a review from nguyenv October 19, 2023 22:05
@nguyenv nguyenv left a comment

Thanks!

@johnkerl johnkerl merged commit bbe7d68 into main Oct 20, 2023
10 checks passed
@johnkerl johnkerl deleted the kerl/dnda-expandable branch October 20, 2023 00:39
github-actions bot pushed a commit that referenced this pull request Oct 20, 2023
#1810)

* [python] Support default-coord reads of corner-written dense ND arrays

* code-review feedback

* remove one open
johnkerl added a commit that referenced this pull request Oct 20, 2023
#1810) (#1812)

* [python] Support default-coord reads of corner-written dense ND arrays

* code-review feedback

* remove one open

Co-authored-by: John Kerl <[email protected]>