-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify plate and well specifications for sparse plates #24
Conversation
Looks good. 👍 |
The suggested changes also read fine to me and are inline with the decisions made for the first version of the HCS specification. From my side, this commit could also be ported directly to the As discussed recently, as we start applying the OME-Zarr HCS specification to more real-world HCS use cases especially sparse plates, we might need to review and reconsider how we handle these the specification. This can be captured and discussed as a separate issue. |
0c28690 expands on the sparse plate handling to explicitly identify the row and column for each well. glencoesoftware/bioformats2raw#91 is a corresponding proposed implementation. Both are based on discussion with @kkoz and @chris-allan. In the sparse plate example where only C5 and D7 are acquired, a human reading the JSON can clearly see that Happy to split 0c28690 into a separate issue if that's easier to discuss. |
There is a proposal to simplify the specifications of "collections" #31. |
latest/index.bs
Outdated
<dt><strong>version</strong></dt> | ||
<dd>A string defining the version of the specification.</dd> | ||
<dt><strong>wells</strong></dt> | ||
<dd>A list of JSON objects defining the wells of the plate. Each well object | ||
MUST contain a `path` key identifying the path to the well subgroup.</dd> | ||
MUST contain a `path` key identifying the path to the well subgroup. | ||
Each well object MUST contain both a `row_index` key identifying the index into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding #24 (comment), are there cases where it is not possible to recompute these indexes based on the knowledge of the individual wells path
as well as the rows
and names
dictionaries? If recomputing is always possible (but at the cost of the consumer), my primary consideration is whether the recommendation for these new fields should be SHOULD
rather than MUST
.
For real-world examples, I can definitely see how row_index/column_index
makes sense in terms of optimizing some of the queries. In addition to testing this with sparse plates, it will be useful to also generate representative plate with many wells (384 at least) to check there is no performance impact with the extra JSON metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order for these indexes to be forward or reverse computable, path
would need to be much more explicitly defined than it is now:
A list of JSON objects defining the fields of views for a given well. Each object MUST contain a path key identifying the path to the field of view. If multiple acquisitions were performed in the plate, it SHOULD contain an acquisition key identifying the id of the acquisition which must match one of acquisition JSON objects defined in the plate metadata.
Furthermore, the wells
array would need have be null or similar padding in order for those indexes to make sense.
Neither of these things are ideal obviously. I don't think there's a way to not have these things be MUST
if we want to guarantee that lookups can happen based on physical plate characteristics.
5a1ddc7 is based on glencoesoftware/bioformats2raw#119 and discussion with @chris-allan earlier today, in preparation for discussion with @sbesson tomorrow. The proposed changes around well path in particular are still up for debate. |
latest/index.bs
Outdated
additional leading or trailing directories. | ||
Each well object MUST contain both a `row_index` key identifying the index into | ||
the `rows` list and a `column_index` key indentifying the index into | ||
the `columns` list. `row_index` and `column_index` MUST be 0-based.</dd> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise that #70 has been added after this PR was opened, but the decision there means these new attributes should now be named rowIndex
and columnIndex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed in 7c2536a.
Following discussion with @sbesson and @chris-allan, 3c31c14 relaxes the "no empty groups" statement to address #24 (comment). There are also some clarifications to the row and column naming, intended to be consistent with https://www.openmicroscopy.org/Schemas/Documentation/Generated/OME-2016-06/ome_xsd.html#NamingConvention. |
OverviewThe specification changes proposed in this PR closely reflect several choices made in
RFC and Community callThis PR has now passed several rounds of internal review and is reaching the state where community feedback would be useful before integrating formally in an upcoming version of the specification. Given the latest announcement of the next NGFF call, I would propose to set the week of 2022-01-24 as the deadline for public comments. Ideally, we can review the state of this proposal, reach an agreement and decide on the timeline for getting these changes released as part of this community call. Specific commentsEmpty Zarr groupsA former version of the proposal forbade the existence of Zarr groups for wells and well rows containing no images. Following the feedback from #24 (comment), the latest version now reduces this as a recommendation. I can think of rationales backing both specification. Importantly, the biggest decision factor might be at the level of the consumer library:
Rows/columns namesThe new requirements regarding the content of the Constraints have been added to the Wells indicesA major change in this proposal is that each The examples in the specification page as well as the As discussed in #24 (comment), an alternate proposal would be to force a mapping between the path to the Zarr group of the well and the names of the row and column associated with this well. Under such a proposal, it would become superfluous to require both the
Well metadataAt the well group level, for multi-acquisition plates, the Samples and implementationsThe specification includes a few examples of metadata for sparse HCS data that complements the existing examples of dense plates. As for every release, representative real-world HCS examples should be generated covering as many features as possible. In terms of implementation, glencoesoftware/bioformats2raw#119 contains the implementation of these changes for |
Another comment while looking at validation this morning is that the the current specification does not define the level of requirement for the keys under The initial JSON schema introduced in https://github.com/ome/ngff/pull/76/files#diff-2e387106f2f394aca19236f21c170f70b94c7931f60bfc2d8f6549941105e0cfR105 defines For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reviewing the discussions from the 6th OME-NGFF community call, no objection/amendment was made for this proposal. The various HCS aware OME-Zarr implementations have been updated to support the new proposed layout. Merging in preparation of the upcoming 0.4 specification announcement.
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-2022-01-27/60885/11 |
Starting point for discussion. The main scenarios to clarify are plates that are missing entire row(s)/column(s), and wells with a field in some (but not all) of the defined acquisitions.