-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Choose smallest np.dtype that handles labels data #116
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
src/omero_zarr/masks.py
Outdated
for count, shapes in enumerate(masks): | ||
for shape in shapes: | ||
# Using ROI ID allows stitching label from multiple images | ||
# into a Plate and not creating duplicates from different iamges. | ||
# All shapes will be the same value (color) for each ROI | ||
shape_value = shape.roi.id.val | ||
shape_value = shape.roi.id.val - min_roi_id + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation will certainly works for the bulk of the use cases. Depending on how the saving of the ROIs, the main downside is that it might create gaps in the label values. There is no requirement for the label values must be consecutive but since we are reindexing them, this is slightly confusing.
Could we create a sorted list of all ROI IDs and use sorted_roi_ids.index(shape.roi.id.val)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in cff3363
Co-authored-by: Josh Moore <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested image + mask export on a few samples images of different modalities and number of ROIS and uploaded them to a temporary bucket for visualization:
Overall the logic works well and the label images are using a more adapted data type which allowing visualization.
Two outstanding questions:
-
I selected the examples above to have a range of images with different number of ROIs spannign the data type but looking at the metadata, all label images seem to be using
int16
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 6001247.zarr/labels/0/0/.zarray "dtype": "<i2", (conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 4496763.zarr/labels/0/0/.zarray "dtype": "<i2", (conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 10501752.zarr/labels/0/0/.zarray "dtype": "<i2",
Since
6001247.zarr
only has 48 label values, I would have expected<i1
? -
the code is currently saving the label array as
int64
using the combination of masks and label values and then converting into the expecteddtype
. I have not computed the performance impact of the casting but would it not be preferable to instantiate the mask using the final data type using the knowledge of the maximum label value?
Thanks @sbesson. Both those issues should be fixed now.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @will-moore, regenerated the three images + labels above with the latest version of the code
Also generated an example with two labels
- idr0052 (image / Cell label / Chromosomes label)
The dtype
are now int8
or int16
depending on the number of label values as expected.
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 6001247.zarr/labels/0/0/.zarray
"dtype": "|i1",
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 4496763.zarr/labels/0/0/.zarray
"dtype": "<i2",
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 10501752.zarr/labels/0/0/.zarray
"dtype": "<i2",
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 5514375.zarr/labels/Cell/0/.zarray
"dtype": "|i1",
(conversion) [sbesson@pilot-zarr1-dev omero-cli-zarr_116]$ grep dtype 5514375.zarr/labels/Chromosomes/0/.zarray
"dtype": "|i1",
Anything else to review or should we go ahead with an omero-cli-zarr
patch release with the mask dtype fix?
Also should we move the samples above to the idr
bucket and register them in idr.github.io/ome-ngff-samples/? And are there other sample additions/updates we would like to make?
#107 would be nice to get it. Not tested yet, but I just tried it on top of this branch and it's looking good. |
Make complete sense since it also brings improvments to the label image writing. I'll put in on my review list and make use of the samples I generated above as the ground truth |
Fixes #115
This PR
np.dtype
that is required for the maximum mask value.To test: export labels and check dtype (was previously
"<i8"
):Napari should still show masks correctly, with omero IDs shown when mousing over shapes:
$ napari --plugin napari-ome-zarr 6001240.zarr
Mousing over label at bottom-left:
Check shape ID:
https://idr.openmicroscopy.org/iviewer/?shape=1664664