[python] Ingestion performance #2434
Codecov Report
@@             Coverage Diff             @@
##             main    #2434       +/-   ##
===========================================
- Coverage   90.61%   79.93%   -10.69%
===========================================
  Files          37       92      +55
  Lines        3900     8626    +4726
===========================================
+ Hits         3534     6895    +3361
- Misses        366     1731    +1365
    soma_arr = DenseNDArray.create(
        arr_uri,
        type=pa_dtype,
        shape=value.shape,
        platform_config=platform_config,
        context=context,
    )
except AlreadyExistsError:
    soma_arr = _factory.open(arr_uri, "w", soma_type=DenseNDArray, context=context)
We can actually just directly call `DenseNDArray.open` here (and same with `DataFrame.open` above).
looks good, just one additional q re: a `XXX` comment
@nguyenv @ryan-williams ready for re-review! :)
Looks good to me. Just have one minor suggestion.
Co-authored-by: nguyenv <[email protected]>
* AlreadyExistsError; use for DataFrame in tiledbsoma.io
* apply in `_create_or_open_collection`
* apply in _create_from_matrix
* apply in _ingest_uns_ndarray
* lint
* Run SO copying workflow on macos-13 to avoid SIP (#2435)
* AlreadyExistsError; use for DataFrame in tiledbsoma.io
* apply in `_create_or_open_collection`
* apply in _create_from_matrix
* apply in _ingest_uns_ndarray
* lint
* neaten
* neaten
* Update raises-notes
* code-review feedback

Co-authored-by: nguyenv <[email protected]>

---------

Co-authored-by: John Blischak <[email protected]>
Co-authored-by: nguyenv <[email protected]>

* AlreadyExistsError; use for DataFrame in tiledbsoma.io
* apply in `_create_or_open_collection`
* apply in _create_from_matrix
* apply in _ingest_uns_ndarray
* lint
* Run SO copying workflow on macos-13 to avoid SIP (#2435)
* AlreadyExistsError; use for DataFrame in tiledbsoma.io
* apply in `_create_or_open_collection`
* apply in _create_from_matrix
* apply in _ingest_uns_ndarray
* lint
* neaten
* neaten
* Update raises-notes
* code-review feedback

---------

Co-authored-by: John Kerl <[email protected]>
Co-authored-by: John Blischak <[email protected]>
Co-authored-by: nguyenv <[email protected]>
Issue and/or context: #2433
Changes: As narrated on #2433. The only additional detail is that, to balance the `DoesNotExistError` for `open`, we need a completely analogous `AlreadyExistsError` for `create`.

Notes for Reviewer: There is a bit more performance improvement to be made, but it needs to be done in core; I will track that work separately. Also, as this PR is for Python, I'll track the R work (both a needs-analysis and any possible dev work, of which there may be none) separately.
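
For readers skimming the thread, here is a minimal sketch of the create-then-open shape that a paired `AlreadyExistsError` / `DoesNotExistError` makes possible. It does not use the tiledbsoma API: `ToyArray`, `create_or_open`, and the two exception classes defined here are hypothetical stand-ins backed by a local directory, chosen only so the example runs on its own. The idea, as I understand it, is to attempt `create` first and fall back to `open` only when the object is already there, rather than paying for a separate existence check before every create.

```python
import os
import tempfile


class AlreadyExistsError(Exception):
    """Stand-in: raised when create() targets a URI that already exists."""


class DoesNotExistError(Exception):
    """Stand-in: raised when open() targets a URI that is missing."""


class ToyArray:
    """Hypothetical directory-backed object; NOT the tiledbsoma API."""

    def __init__(self, uri):
        self.uri = uri

    @classmethod
    def create(cls, uri):
        try:
            # os.mkdir fails if the path exists, so no prior check is needed.
            os.mkdir(uri)
        except FileExistsError as exc:
            raise AlreadyExistsError(uri) from exc
        return cls(uri)

    @classmethod
    def open(cls, uri):
        if not os.path.isdir(uri):
            raise DoesNotExistError(uri)
        return cls(uri)


def create_or_open(uri):
    """Try create first; fall back to open only if the object exists."""
    try:
        return ToyArray.create(uri), "created"
    except AlreadyExistsError:
        return ToyArray.open(uri), "opened"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "arr")
        print(create_or_open(target)[1])  # first call creates
        print(create_or_open(target)[1])  # second call opens
```

On a fresh ingest the common case is that nothing exists yet, so the `except` branch is the rare path; on a resumed ingest the cost is one failed create followed by one open.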