-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
On disk versioning feedback #149
Comments
Thanks for initiating the discussion. I agree on the versioning of the elements separately and on having the software version in the attributes. I am not familiar with the rest so I won't comment it. |
An update on this:
The only open point is the following:
|
Now also the last point is implemented (in #711, which when merged will close this issue). I currently don't write this for reach element (so I don't have to bump the version number for
|
@giovp
Two main points of feedback on how versioning is done for the on-disk format. One minor one.
Version the elements separately
I think it makes a lot of sense to version the elements separately. For example, I can see wanting to change how points are encoded, but only that. I don't think we need to bump the version number for tables, shapes, images, just because points changed.
Use something other than
@type
I think
@type
is a weird option for reserved keyword since it's not how either anndata or OME specifies the type of an element. How we can do "metadata conventions" in a consistent way has been discussed around zarr quite a few times (e.g. ZEP 4). I'm not completley settled on what should be done, but can suggest some alternatives.I'll try and figure out which option I like the most ASAP.
Use anndata's
encoding_type
,encoding_version
We could just use anndata's
encoding_type
,encoding_version
.Advantages
Main disadvantage
The spatial data table is a superset of an AnnData. So if we're now using the same key, what's the type? If it's
anndata
, the attributes won't be part of the spec. If it'sspatialdata-table
orngff:regions_table
it's overwriting that it's also ananndata
.That said, the anndata library could just register
ngff:regions_table
to be read in as an AnnData. This library could even do that registration with:But that's only if you DON'T want to do something custom at read time, e.g. anything with the other attributes.
Use some separate field
This could be something like
spatialdata-encoding-type
. I personally would like this over@type
, since@type
is a kinda taken convention, which nothing else here isUse some separate name-spaced field
I think long term it would make sense to namespace these things. For example:
Where spatialdata knows that it's a superset of "anndata", and to check for
attrs["encoding"]["anndata"]
.While this would eventually require coordination, one could start doing it here. Would probably require more thought as I think this will actually be an issue for arrow tables.
Minor point: also write the software version into the attributes
Something that mudata does (like this), which could be quite helpful for debugging, is to write the library and version into metadata.
It would also be nice for devs to see that one weird file is weird because it was written by a development version.
The text was updated successfully, but these errors were encountered: