Best practice for storing timing info and pixel size for 5D data? #1

Open
dpshepherd opened this issue Dec 4, 2021 · 2 comments
Labels
question Further information is requested

Comments

@dpshepherd

Thanks for this useful repo! I have a question about best practices.

Our data can be ordered as tczyx. We generate one zyx volume at a time, held in an ndarray, by iterating over all channels and time points in the raw data structure that our custom microscope writes to disk. Writing and reloading this data works, which is very helpful.

As part of the OME-zarr, we'd like to store metadata containing the time interval between each zyx volume and the zyx pixel sizes. It isn't clear to us how to append this metadata at the moment. Is there a best practice for this? I briefly looked through the tests and examples and did not see one. From the example, it is clear how to name the channels and add affine transformations.

Thanks!

@aeisenbarth
Owner

Time and channel axes are different from spatial dimensions because their coordinates are not necessarily equidistant. This is still an open question in the development of the NGFF specification. It's best you check out the current NGFF 0.3 specification, the 0.4 RFC and the discussions. I'm not an expert on this, and we don't use time data ourselves. The scope of this repository was just to make available, for ourselves, the features that we need and that were missing from the spec. This is why there is nothing about time.

You could get involved in the discussions with your requirements. If you need to store time metadata before the discussions result in an official specification, you could make a private extension to the format (similar to the one in this repo) and modify a writer/reader library if necessary. Make sure that what you do won't clash with a future official spec, for example by underscoring JSON keys (I did _transformation instead of transformation) or using a private namespace (my_namespace:my_key). Also make sure that no third parties depend on it, or that they are aware that it is temporary and not an official standard.
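
As a minimal sketch of the private-namespace idea, assuming the group attributes are available as a plain dictionary (as they are when read from a `.zattrs` JSON file): the helper name `add_private_metadata`, the namespace `my_lab`, and the keys `time_interval_s` and `pixel_size_zyx_um` are made up for illustration and are not part of any spec or library.

```python
import json


def add_private_metadata(attrs, namespace, values):
    """Return a copy of attrs with values stored under namespaced keys.

    Keys are written as "<namespace>:<key>" so that they are unlikely to
    collide with keys a future official specification might define.
    """
    result = dict(attrs)
    for key, value in values.items():
        private_key = f"{namespace}:{key}"
        if private_key in result:
            # Refuse to silently overwrite an existing private entry.
            raise KeyError(f"{private_key} already exists")
        result[private_key] = value
    return result


# Hypothetical example values; the existing "multiscales" entry is untouched.
attrs = {"multiscales": [{"version": "0.3", "axes": ["t", "c", "z", "y", "x"]}]}
extended = add_private_metadata(
    attrs,
    namespace="my_lab",
    values={"time_interval_s": 9.0, "pixel_size_zyx_um": [0.4, 0.115, 0.115]},
)
print(json.dumps(extended, indent=2))
```

The function returns a new dictionary rather than mutating its input, so the original attributes can still be written unchanged for readers that should not see the temporary keys.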

Regarding axis coordinate metadata, it is currently a bit fragmented: channel names can be stored in the Omero section, whereas spatial axes have no explicit coordinates (well, that would be an integer list and most of the time not really necessary), and label properties (not an axis) have yet another format. I think a unified representation would fit best into axes, and coordinates could be integers, floats, or strings (categorical, or ISO date):

  "axes": [
      {
          "name": "t", 
          "type": "time", 
+         "coords": ["2021-12-07T09:30:10.001", "2021-12-07T09:30:19.002", "2021-12-07T09:31:58.003"]
      },
      …
  ]

If you have no time stamps but only step sizes, I'd rather sum them up to distances relative to an arbitrary starting point (first time point = 0.0). For further metadata per coordinate, the question is whether to add a separate list for each or have a list of dictionaries. But these are only my thoughts, and an official specification could look very different.
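
Summing step sizes into relative time points could look roughly like the sketch below. It assumes the intervals are given in seconds and reuses the `coords` key from the proposal above, which is only a suggestion and not an official NGFF field; the function name `intervals_to_coords` is hypothetical.

```python
from itertools import accumulate


def intervals_to_coords(intervals, start=0.0):
    """Convert N-1 step sizes between time points into N relative coordinates.

    The first time point is placed at `start`; each following coordinate
    is the running sum of the intervals before it.
    """
    return list(accumulate(intervals, initial=start))


# Hypothetical step sizes in seconds between four consecutive volumes.
intervals_s = [9.0, 9.0, 99.0]
coords = intervals_to_coords(intervals_s)

# "coords" is the unofficial key proposed above, not part of the NGFF spec.
t_axis = {"name": "t", "type": "time", "coords": coords}
print(t_axis)
```

Note that `accumulate(..., initial=...)` requires Python 3.8 or newer.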

@aeisenbarth aeisenbarth added the question Further information is requested label Dec 7, 2021
@dpshepherd
Author

Thanks for the detailed answer! We are in touch with the OME-NGFF folks, but for a different project (spatial transcriptomics).

We'll play around and add some helper functions ourselves. This is actually what we are already doing, but using metadata stored outside the Zarr. We may just keep it that way for now until OME-NGFF settles down.
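
For illustration only, keeping the metadata outside the Zarr could be as simple as a sidecar JSON file next to the store. This is a sketch under assumptions, not the helpers actually used here: the `.meta.json` suffix, the function names, and the example keys are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def sidecar_path(zarr_path):
    """Hypothetical convention: '<store>.meta.json' next to the Zarr store."""
    return Path(str(zarr_path) + ".meta.json")


def write_sidecar(zarr_path, metadata):
    """Write metadata as JSON beside the store and return the file path."""
    path = sidecar_path(zarr_path)
    path.write_text(json.dumps(metadata, indent=2))
    return path


def read_sidecar(zarr_path):
    """Load the metadata previously written for this store."""
    return json.loads(sidecar_path(zarr_path).read_text())


# Round trip in a temporary directory; the store itself is never touched.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "volume.zarr"
    write_sidecar(store, {"time_interval_s": 9.0, "pixel_size_zyx_um": [0.4, 0.115, 0.115]})
    loaded = read_sidecar(store)

print(loaded)
```

Because nothing is written into the store, this cannot clash with a future official spec, at the cost of having to keep the two files together.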

Thanks again and very useful package!


2 participants