A few misc questions #55

Closed · toloudis opened this issue Sep 2, 2021 · 3 comments


toloudis commented Sep 2, 2021

Let me know if this is the wrong forum and I can move this post.
We are considering making a big move to use ome-zarr. I have some miscellaneous questions/issues on the state of things.

  1. Context: We have lots of on-prem storage but need to move all of it to the cloud. We will then need to keep large images accessible to compute that is more distant from the data. A likely scenario (sketched in code after this list) is:
    a. microscope --> proprietary file format
    b. upload and immediate conversion to an open format using aicsimageio
    c. scientists do compute and vis on the chunked, remote, open format using aicsimageio

  2. Is it possible to store multiscale zarr groups on different storage categories? For example can we say we want the full resolution level on cold storage but downsampled levels on cloudfront/a more "hot" service?

  3. Is there an assumption in ome-ngff that multiscale resolutions are necessarily halved in x,y at each level? Or can I write any downsampling I want at each level (I have some calculations that force it to fit in a certain memory footprint, for example)? If so, key question: how do I get the data shape at each level?

  4. The current ome-ngff document here https://ngff.openmicroscopy.org/latest/#omero-md refers me to https://docs.openmicroscopy.org/omero/5.6.1/developers/Web/WebGateway.html#imgdata. Does that mean the spec is really the full omero spec contained at the latter link? That latter spec provides for physical pixel dimensions and shape information in the top-level metadata, but this is not shown in the example on the ngff doc page.

  5. We capture a lot of large "multi-scene" files (the dreaded 6th dimension). Let's assume they are not separate wells. In ome-zarr, are we supposed to put them in separate root-level groups in the same store? Does ome provide some recommendation for this apart from just treating them as "different" images?
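For concreteness, here is a minimal sketch of steps (b) and (c) of the scenario in question 1, assuming aicsimageio for reading the proprietary file and ome-zarr-py (ome_zarr) for writing, with one root-level group per scene as floated in question 5. The file names, scene handling, and whole-scene in-memory read are illustrative assumptions, not a recommended pipeline:

```python
# Hypothetical conversion sketch: proprietary file -> one OME-Zarr group per scene.
# Requires aicsimageio and ome-zarr-py; all paths are placeholders.
import zarr
from aicsimageio import AICSImage
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

img = AICSImage("example.czi")                 # proprietary source file (placeholder)
store = parse_url("converted/example.zarr", mode="w").store
root = zarr.group(store=store)

for scene in img.scenes:                       # one root-level group per "scene"
    img.set_scene(scene)
    data = img.get_image_data("TCZYX")         # pulls the whole scene into memory
    group = root.create_group(scene)
    write_image(image=data, group=group, axes="tczyx")
```

(Uploading the resulting store to object storage, or writing directly to a remote store via fsspec, is left out here.)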

will-moore (Member) commented

I can answer some points...

  1. No, the spec doesn't assume that levels are halved in size. Any downsampling factors are allowed, as long as "The paths MUST be ordered from largest (i.e. highest resolution) to smallest." (See the sketch after this list for reading each level's shape.)
  2. No, the spec is currently just what's on the ngff page. We just reused that omero section for convenience to get up and running quickly, but rendering info is likely to evolve based on community discussions.
  3. In the spec, multiscales is a list, with a name for each entry. Would that work? However, I think most viewers will currently just show multiscales[0]. There is an ongoing discussion on a "Collections" spec (Collections Specification #31) for how to group images. Or, if you want to store affine transformations between "scenes", see Transformation Specification #28.
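On the "how do I get the data shape at each level" part of question 3: each level is an ordinary zarr array listed under a multiscales entry's "datasets", so the shapes come from the array metadata without reading any chunk data. A minimal sketch with zarr-python, where the store path is a placeholder:

```python
# Print the shape and dtype of every resolution level in an OME-Zarr image group.
import zarr

root = zarr.open_group("example.zarr", mode="r")

for entry in root.attrs["multiscales"]:        # a list; most tools only use entry 0
    print("multiscale:", entry.get("name"))
    for dataset in entry["datasets"]:
        level = root[dataset["path"]]          # zarr array for this resolution level
        print(" ", dataset["path"], level.shape, level.dtype)
```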

Sorry, don't know about storing different resolution levels on different storage media, but I guess if you can map different file paths to different storage then this should be possible??

toloudis (Author) commented Sep 2, 2021

  1. Regarding different storage media, we know we can do this type of logical mapping with AWS, but the question would be how much we can guide an "ome-zarr writer" API to do it for us (this level goes here, and that level goes there) as opposed to building something from scratch. Maybe this is more of a low-level zarr question.

  2. We have also discussed storing different projections as multiscales. I.e. we will want to have downsampled volume data, but then might also want to store a "middle slice" thumbnail. So maybe the spec is not general enough for that case. It is also incredibly convenient to know something about the data dimensions and type for each of the multiscales as early as possible (in json metadata), to allow a viewer to decide intelligently how much it should load.

  3. When I tried to implement this in my viewer, it was absolutely necessary to have "physical pixel size" in some form, which is missing from the spec. Additionally, I found that stashing the intensity max and min in window.max and window.min was necessary to avoid an extra traversal of the data in some cases.
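For what it's worth, the window min/max mentioned in point 3 does have a slot in the current "omero" rendering block (the part borrowed from the webgateway imgdata format). A minimal sketch of writing it as group attributes with zarr-python; the channel name, color, and intensity values are made up for illustration:

```python
# Attach "omero" rendering metadata (per-channel window min/max) to an image group.
import zarr

root = zarr.open_group("example.zarr", mode="a")
root.attrs["omero"] = {
    "channels": [
        {
            "label": "DNA",        # illustrative channel name
            "color": "0000FF",
            "active": True,
            "window": {"min": 0, "max": 65535, "start": 100, "end": 14000},
        }
    ]
}
```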

joshmoore (Member) commented

g'morning, @toloudis.

Some additions to @will-moore's thoughts below, but generally 👍 for the questions (and future input on the specs).

2.

> Is it possible to store multiscale zarr groups on different storage categories?

In the current spec, no. See #13 for the work to enable it.

> the question would be how much we can guide an "ome-zarr writer" api to do it for us

After 5 seconds of pondering, I could see having something like a "remote-array" which you pass in: write_pyramid([cold_array, warm_array, hot_array]).

> Maybe this is more of a low level zarr question.

Maybe but I've not seen a proposal or discussion on it to date. (The closest might be fsspec-reference-maker)
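To make the write_pyramid idea above concrete (purely hypothetical, and outside the current spec, which keeps all levels under a single group): each level could be written to its own fsspec-mapped store, with the cold/warm/hot tiering handled by the bucket or CDN configuration rather than by zarr itself. The function name, URLs, and chunking below are invented for illustration:

```python
# Hypothetical: write each pyramid level to a different storage target.
# Not part of the OME-NGFF spec (see #13); function name and URLs are invented.
import fsspec
import zarr

def write_pyramid_to_targets(levels, target_urls, chunks=True):
    """levels: numpy arrays, largest first; target_urls: one URL per level."""
    for data, url in zip(levels, target_urls):
        store = fsspec.get_mapper(url)               # e.g. "s3://cold-bucket/img/0"
        arr = zarr.open_array(store=store, mode="w",
                              shape=data.shape, chunks=chunks, dtype=data.dtype)
        arr[:] = data

# write_pyramid_to_targets(
#     [full_res, half_res, quarter_res],
#     ["s3://cold-bucket/img/0", "s3://warm-bucket/img/1", "s3://hot-bucket/img/2"],
# )
```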

3.

> No, the spec doesn't assume that levels are halved in size.

Additional work will come on this with the translations (likely v0.4).

> might also want to store a "middle slice" thumbnail

hmmm... I wonder if the rendering metadata might not be a place to point to this.

> each of the multiscales as early as possible (in json metadata)

hmmm... a bit hesitant to pull this out of the zarr json metadata and duplicate it in the main block. I wonder if "consolidated_metadata" gets you what you need. (If not, happy to see a proposal.)
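For the "know the shapes as early as possible" use case, zarr's consolidated metadata does roughly this today: it copies every .zgroup/.zarray/.zattrs document into a single .zmetadata key, so one request exposes the shape and dtype of every level. A minimal sketch, with the store path as a placeholder (consolidation matters most for remote stores):

```python
# Consolidate zarr metadata, then list every array's shape with a single metadata read.
import zarr

store = zarr.DirectoryStore("example.zarr")    # placeholder; an fsspec store also works
zarr.consolidate_metadata(store)               # writes .zmetadata at the store root

root = zarr.open_consolidated(store)

def show(name, obj):
    if isinstance(obj, zarr.Array):
        print(name, obj.shape, obj.dtype)

root.visititems(show)
```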

ome locked and limited conversation to collaborators Sep 3, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
