-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOMA-collection-level slice/batch queries [needs revision] #157
Conversation
fb77162
to
fc07e3c
Compare
91a78c9
to
9ebc9a4
Compare
]: | ||
soma_path = os.path.join(soco_path, name) | ||
soma = tiledbsc.SOMA(soma_path) | ||
tiledbsc.io.from_h5ad(soma, h5ad) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that this is a non-pythonic way to call a function with such a name. Generally, I would expect that a from_*
function returns the created object, instead of mutating an argument. I understand that the object is disk-based so it makes sense to pre-create it and pass it as an argument, but then I'd probably rename the function to something that makes the side effect more explicit (like write_from_h5ad
). @bkmartinjr your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it is non-pythonic (I lack expertise to make that determination). But I do find it awkward and likely error-prone as the two step process (create the soma, write the soma) allows for cases that will fail (eg, the soma is modified before from_h5ad
is called).
I personally prefer a single function that both creates the soma and ingests the H5AD, eg, soma.io.from_h5ad(soma_path: str, h5ad_path: str, ...) -> soma
. Ie, this function would call the SOMA constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to know @bkmartinjr -- thanks for the feedback! I'm fine with this; only wish I'd known sooner :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well...I am unclear who is establishing the API and the review process for it. So I'm just giving you my opinions :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bkmartinjr @ebezzi I'll go ahead & make this change but on a separate PR -- this soma.io
business is old & predates this PR, and this change can be neatly factored out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth posting this as a separate issue for discussion/feedback? The R API currently works this way too (ie, ingestion and SOMA instantiation are separate steps) and altering that behavior would require a lot of small updates to docs/tests/etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still working through it but I posted a couple of comments for now.
771d3f5
to
5470ff2
Compare
…SingleCell into kerl/soma-slice
After rebases this PR is completely out of control (> 200 commits including mostly merge commits). Closing, and will resubmit. |
re-opening long enough to squash |
failed squash :( |
#173 is the re-do |
* Re-do of #157 * remove now-duplicate file * bring an unaffected example file up to date with main * add var.feature_name to examples/collection_counts.py * mkmd.sh doc-gen * Allow obs_ids/var_ids and obs_query_string/var_query_string in soco/soma attribute_filter * mkmd.sh doc-gen * attribute_filter -> query * manual testing
…build-python311-0-1_h50996c Rebuild for python311
Overview
[cartographer](https://github.com/single-cell-data/TileDB-SingleCell/blob/kerl/soma- slice/apis/python/examples/cartographer.py), due originally to @bkmartinjr , is a portable/reusable example of how to take a group ofsplit out to Uniformization examples #161.h5ad
files and create aSOMACollection
having uniformobs
schemasplitter is a little tool that was used to create the four smallsplit out to Doc bits for splitting out little test files #167.h5ad
files for unit-test/example-documentation purposesWalkthrough
SOMA
and.h5ad
format, so you can go on to do all manner of analysis using for example Scanpy.X/data
, grouping byobs['cell_type_ontology_term_id']
.Notes
SOMASlice
object currently has a (well-isolated) runtime dependency onAnnData
for itsconcat
-- namely, it converts a list ofSOMASlice
to a list ofAnnData
, leveragesAnnData
'sconcat
, then converts the concatenatedAnnData
object back toSOMASlice
.Update
#173 is the re-do