
HCS specification #75

Closed
wants to merge 8 commits into from

@sbesson sbesson commented Oct 28, 2020

Closes #73. Closes https://github.com/ome/omero-ms-zarr/issues/76

This PR captures the successive iterations of the first draft of the HCS specification.

2020-10-29

The first set of changes captures the HCS Zarr layout which was demonstrated as part of the 2020-10-29 community call.

Specification changes match the layout described in https://github.com/ome/omero-ms-zarr/issues/73#issuecomment-717206122 and include:

  • the addition of the standard RFC2119 block
  • the addition of the hierarchy description
  • the addition of the plate metadata

Implementations of the 2020-10-29 version of the HCS specification include:

Datasets created according to the 2020-10-29 version of the HCS specification are available on the https://s3.embassy.ebi.ac.uk endpoint URL:

  • s3://idr/share/community-call-2020-10-29/idr0002-heriche-condensation/plate1_1_013/422.zarr - a 96-well plate with timelapse images
  • s3://idr/share/community-call-2020-10-29/idr0002-heriche-condensation/plate1_1_013/422_no_T.zarr - a derived form of the above containing only the first timepoint of each field of view
  • s3://idr/share/community-call-2020-10-29/idr0033-rohban-pathways/41744_illum_corrected/5966.zarr - a 384-well plate with 9 fields of view per well containing 5-channel images
  • s3://idr/share/community-call-2020-10-29/idr0004-thorpe-rad52/1751.zarr - a sparse plate with 69 wells and multi-Z images
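
To inspect these datasets without an S3 client, the sketch below maps one of the s3:// URIs above to a plain HTTPS URL. It assumes the endpoint serves buckets path-style (endpoint/bucket/key) and allows anonymous reads; the helper name `s3_uri_to_https` is hypothetical and not part of any of the tools mentioned here.

```python
from urllib.parse import urlparse

ENDPOINT = "https://s3.embassy.ebi.ac.uk"  # endpoint URL quoted in the PR description


def s3_uri_to_https(uri, endpoint=ENDPOINT):
    """Map s3://bucket/key to endpoint/bucket/key (path-style addressing is an assumption)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return f"{endpoint.rstrip('/')}/{parsed.netloc}/{parsed.path.lstrip('/')}"


plate_url = s3_uri_to_https(
    "s3://idr/share/community-call-2020-10-29/idr0004-thorpe-rad52/1751.zarr"
)
# In Zarr v2, group-level metadata is stored in a .zattrs file at the group root.
print(plate_url + "/.zattrs")
```

Fetching the printed URL with any HTTP client should return the plate-level metadata, provided the assumptions above hold for this endpoint.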

2020-11-09

Following the 2020-10-29 community call, a few changes were made.

Implementations of the 2020-11-09 version of the HCS specification include:

  • omero-cli-zarr TBD
  • ome-zarr TBD

Datasets created according to the 2020-11-09 version of the HCS specification:

  • TBD

- the fifth group defined all the individual fields of views for a given well.
The fields of views are images, SHOULD implement the "multiscales"
specification, MAY implement the "omero" specification and MAY contain
labels.
Member

I expect we'll need to extend this syntax elsewhere in the spec. For example, the current sentence "Information specific to the channels of an image and how to render it can be found under the "omero" key in the group-level metadata" suggests that an Image MUST have an "omero" specification.
If that were improved, we probably wouldn't need to specify the image metadata requirements both here (under HCS) and under Images, but could refer from here to the Images spec (unless the Image spec requirements are different under HCS, which is probably not the case).

Member

I also noticed that currently the Images spec doesn't define whether the image contains Labels? e.g. in s3 there would be no way to know whether an image has labels.

Member

There would be no way to find the labels that aren't in the labels/ group, but those are discoverable.

Member Author

Yes, I introduced RFC2119 as I suspect it might be useful to start thinking about these concepts both at the level of the individual keys and at the level of the various specifications. If we agree, reviewing the spec to introduce this terminology will certainly take some work, but much of it will be outside the scope of this PR.

While this terminology applies naturally to the definition of the individual keys, I found it harder to use for the scope of each specification. Taking your example, I don't think you MUST have an omero specification, but if you want some rendering metadata, you definitely SHOULD.

> I also noticed that currently the Images spec doesn't define whether the image contains Labels? e.g. in s3 there would be no way to know whether an image has labels.

From the current spec, my understanding is that an image contains labels if the image group contains a sub-group implementing the labels specification?

Member

I guess that the discoverability of labels based on the existence of a sub-directory is different from the existence of e.g. Wells, Columns, Multiscale levels etc which are explicitly defined in the parent containers. To discover labels, you'd have to be aware of the spec to know to check, but for other sub-directories, you could find these by inspecting the parent metadata, without knowing the spec beforehand, which seems like a nice feature.
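
The two discovery styles contrasted in this thread can be sketched with a toy key-value store standing in for an object store that offers no directory listing. The store contents and both function names are hypothetical; the "multiscales" layout follows the datasets/path structure of the image spec as I understand it, and the `.zattrs`/`.zgroup` keys are the Zarr v2 metadata files.

```python
import json

# Toy store: keys map to JSON documents; there is deliberately no way to list keys by prefix.
store = {
    "image/.zattrs": json.dumps(
        {"multiscales": [{"datasets": [{"path": "0"}, {"path": "1"}]}]}
    ),
    "image/labels/.zgroup": json.dumps({"zarr_format": 2}),
}


def declared_children(store, group):
    """Metadata-driven discovery: the parent's attributes name its children."""
    attrs = json.loads(store[f"{group}/.zattrs"])
    return [d["path"] for m in attrs.get("multiscales", []) for d in m["datasets"]]


def has_labels(store, group):
    """Convention-driven discovery: the reader must already know to probe 'labels'."""
    return f"{group}/labels/.zgroup" in store


print(declared_children(store, "image"))  # ['0', '1']
print(has_labels(store, "image"))         # True
```

The first function works for any reader that can parse the parent metadata, while the second only works for a reader that already knows the spec reserves a `labels` sub-group, which is the asymmetry described above.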

Member

@joshmoore joshmoore left a comment

A few initial comments around must/should on the layout.

dataset. There are exactly four levels of hierarchies above the images:

- the top-level group defines a single plate and MUST implement the plate
specification defined below
Member

Don't disagree with this in general, but we might need to work on the wording. You could have a zgroup which contains 2 plates, but then the plate zgroup becomes the "top-level" that we're discussing here.

- the top-level group defines a single plate and MUST implement the plate
specification defined below
- the second group defines all acquisitions performed on the plate. If
only one acquisition was performed, a single group must be used.
Member

See ongoing slack discussion. If this is a "must" then this conflicts with the "should" in the first statement, no?
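
For orientation, the layout under discussion (four group levels above the images, with the plate at the top and the fields of view as the fifth group) can be sketched as a path classifier. Only "plate", "acquisition" and "field" come from the quoted spec text; the names "row" and "column" for the third and fourth levels are an assumption made for illustration, and `describe` is a hypothetical helper.

```python
# Levels below the plate root, in order. "row" and "column" are assumed names.
LEVELS = ("acquisition", "row", "column", "field")


def describe(path):
    """Label each component of a path given relative to the plate group."""
    parts = [p for p in path.split("/") if p]
    if len(parts) > len(LEVELS):
        raise ValueError(f"{path!r} is deeper than the field-of-view level")
    return dict(zip(LEVELS, parts))


print(describe(""))         # {} (the plate group itself)
print(describe("0/A/1/0"))  # acquisition 0, row A, column 1, field 0
```

Under this reading, a plate with a single acquisition still carries one acquisition group, so every field of view sits at the same depth.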

@joshmoore
Member

Migrated to: ome/ngff#5

Note there are a few open conversations here.

@joshmoore joshmoore closed this Nov 25, 2020

Successfully merging this pull request may close these issues.

  • HCS Well metadata
  • HCS group layout