
OME-HCS Compatibility with vizarr #118

Closed · camFoltz opened this issue Sep 9, 2021 · 4 comments · Fixed by #119

camFoltz commented Sep 9, 2021

Hello!

I feel I should properly introduce myself, as I have been active recently on image.sc regarding the OME-Zarr data format.
I am an RA at the Chan Zuckerberg Biohub working on optimizing our computational pipelines, and we are now migrating to the OME-Zarr format for all of our raw/processed data. I would like to help you optimize these viewers (vizarr + ome-zarr-py) as our team starts to use them heavily, so I may start posting issues here more frequently.

One issue I am running into right away, partly due to my lack of insight into how the OME-Zarr HCS metadata is parsed, has to do with inflexible key structures. I'll outline it below.

Consider the OME-Zarr dataset tree below, where the level above Fake_Row is Plate.zarr:

/
 └── Fake_Row
     ├── Fake_Col_0
     │   └── Pos_000
     │       └── array (1, 6, 24, 2048, 2048) uint16
     ├── Fake_Col_1
     │   └── Pos_001
     │       └── array (1, 6, 24, 2048, 2048) uint16
     ├── Fake_Col_2
     │   └── Pos_002
     │       └── array (1, 6, 24, 2048, 2048) uint16
     └── Fake_Col_3
         └── Pos_003
             └── array (1, 6, 24, 2048, 2048) uint16

The metadata at the associated levels is as follows:

"Plate Metadata" at Plate.zarr.attrs

{'plate': {'acquisitions': [{'id': 1,
    'maximumfieldcount': 1,
    'name': 'Dataset',
    'starttime': 0}],
  'columns': [{'name': 'Fake_Col_0'},
   {'name': 'Fake_Col_1'},
   {'name': 'Fake_Col_2'},
   {'name': 'Fake_Col_3'}],
  'field_count': 1,
  'name': 'test',
  'rows': [{'name': 'Fake_Row'}],
  'version': '0.1',
  'wells': [{'path': 'Fake_Row/Fake_Col_0'},
   {'path': 'Fake_Row/Fake_Col_1'},
   {'path': 'Fake_Row/Fake_Col_2'},
   {'path': 'Fake_Row/Fake_Col_3'}]}}

"Well Metadata" at Plate.zarr['Fake_Row']['Fake_Col_{i}'].attrs

{'well': {'images': [{'path': 'Pos_000'}], 'version': '0.1'}} # Fake_Col_0
{'well': {'images': [{'path': 'Pos_001'}], 'version': '0.1'}} # Fake_Col_1
{'well': {'images': [{'path': 'Pos_002'}], 'version': '0.1'}} # Fake_Col_2
{'well': {'images': [{'path': 'Pos_003'}], 'version': '0.1'}} # Fake_Col_3

"omero / multi-scales" metadata at Plate.zarr['Fake_Row']['Fake_Col_0']['Pos_000'].attrs

{'multiscales': [{'datasets': [{'path': 'array'}], 'version': '0.1'}],
 'omero': {'channels': [{'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'State0',
    'window': {'end': 1279, 'max': 65535, 'min': 0, 'start': 663}},
   {'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'State1',
    'window': {'end': 3718, 'max': 65535, 'min': 0, 'start': 1804}},
   {'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'State2',
    'window': {'end': 5128, 'max': 65535, 'min': 0, 'start': 2101}},
   {'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'State3',
    'window': {'end': 2595, 'max': 65535, 'min': 0, 'start': 1117}},
   {'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'Cy5 - 635~730',
    'window': {'end': 594, 'max': 65535, 'min': 0, 'start': 122}},
   {'active': True,
    'coefficient': 1.0,
    'color': '808080',
    'family': 'linear',
    'inverted': False,
    'label': 'FITC - 474~515',
    'window': {'end': 761, 'max': 65535, 'min': 0, 'start': 141}}],
  'rdefs': {'defaultT': 0,
   'defaultZ': 0,
   'model': 'color',
   'projection': 'normal'},
  'version': 0.1}}

This structure makes the most sense for our data, as we like to keep track of the position index (Pos_000, Pos_001, etc.) as we move through rows and columns. However, I am running into an issue with vizarr: if this indexing doesn't start back at 0 (Pos_000) beneath every "well", the viewer raises this error:

[Screenshot of the error raised by vizarr]

I am really not sure why it is searching for a path (Fake_Row/Fake_Col_1/Pos_000) that isn't referenced in the metadata (or maybe it is and I missed it?). However, when I rename all of the "Pos_00{i}" groups to "Pos_000" and update the "well" metadata to match, the viewer works as expected. Can someone show me where this OME metadata is being parsed, or why this might be the case?
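
To illustrate what I am seeing (a rough sketch only, with placeholder names, not vizarr's actual code):

// Image paths my well metadata declares, versus the paths the viewer appears to request.
// It looks as though the first well's image name ("Pos_000") is reused for every well.
const wells = [
  { path: 'Fake_Row/Fake_Col_0', firstImage: 'Pos_000' },
  { path: 'Fake_Row/Fake_Col_1', firstImage: 'Pos_001' },
  { path: 'Fake_Row/Fake_Col_2', firstImage: 'Pos_002' },
  { path: 'Fake_Row/Fake_Col_3', firstImage: 'Pos_003' },
];
const requested = wells.map((w) => `${w.path}/Pos_000`);
// 'Fake_Row/Fake_Col_1/Pos_000' (and the rest) do not exist in the store, hence the error.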

I am happy to help investigate and solve these issues if you are looking for development help. The same goes for ome-zarr-py + napari-ome-zarr (which seems a bit buggier than vizarr), which would be our preferred viewer.

Thanks for your help here.

Best,
Cam

manzt (Member) commented Sep 9, 2021

Hello there - thank you for your interest in using and improving vizarr! I do not work on ome-zarr-py or napari-ome-zarr directly (pinging: @joshmoore, @will-moore), but I can certainly help investigate this issue.

My guess is that vizarr makes some assumptions about the metadata structure that reflect the OME-NGFF datasets published by the IDR (since those have been the examples we work with primarily), and our HCS metadata traversal can be improved. If you have example data to share, that would help get to the bottom of this quickly.

Here is the part of the code where metadata traversal begins. The node passed to the viewer is inspected to see whether its attrs contain plate, well, or omero/multiscales metadata:

vizarr/src/io.ts, lines 111 to 124 in 1585a03:

if (node instanceof ZarrGroup) {
  const attrs = (await node.attrs.asObject()) as Ome.Attrs;
  if ('plate' in attrs) {
    return loadPlate(config, node, attrs.plate);
  }
  if ('well' in attrs) {
    return loadWell(config, node, attrs.well);
  }
  if ('omero' in attrs) {
    return loadOmeroMultiscales(config, node, attrs);
  }

will-moore (Collaborator) commented:

Hi Cam,
So I guess the loadPlate() code:

export async function loadPlate(config: ImageLayerConfig, grp: ZarrGroup, plateAttrs: Ome.Plate): Promise<SourceData> {

is making some assumptions in order to reduce the number of calls it has to make.

We only load the first Well to get the path from a Well to an Image (instead of loading all the Wells in the plate).

const wellAttrs = (await grp.getItem(wellPaths[0]).then((g) => g.attrs.asObject())) as Ome.Attrs;

That 'imgPath' is then applied to all the other wellPaths that we have for the plate, so that we can load an Image for each Well.

wellPaths.map((p) => [p, join(p, imgPath, resolution)]),

For this to work for your data, we'd need to get an imgPath for each Well.
This is just one more JSON request for each Well, which is probably not going to be a killer for a small-to-medium-sized plate.
Unfortunately, we don't have any way to know whether a plate uses the same image path for every well without loading them all, so we'd have to do this for every plate in vizarr, and I worry that the sheer number of calls for a 384-well plate is going to slow things down. That might depend on the back-end, @joshmoore? A rough sketch of what I mean is below.
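
Something like this untested sketch (placeholder names, error handling omitted):

// Untested sketch: fetch each Well's attrs to get its own first-image path,
// instead of reusing the first Well's image path for every Well in the plate.
const wellImageSources = await Promise.all(
  wellPaths.map(async (p): Promise<[string, string]> => {
    const wellGroup = await grp.getItem(p);
    const attrs = (await wellGroup.attrs.asObject()) as Ome.Attrs;
    // 'well' is assumed to be present here, since these paths come from plateAttrs.wells
    const imgPath = attrs.well.images[0].path;
    return [p, join(p, imgPath, resolution)];
  })
);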

Do you want to try coding this up?
Running vizarr locally is simply a case of $ npm install and $ npm start.
Or I could open a PR and let you test the build from that?

Will.

will-moore (Collaborator) commented:

Part of the problem is the complexity of the HCS spec, which others have also noted.
We are currently discussing the evolution of this into a more generic "Collections" spec at ome/ngff#31
For example, see the outline at ome/ngff#31 (comment)
This should help address the issues we're having here, because all the paths to images and other info you need would be in a single .zattrs blob, instead of being distributed across many individual blobs, one per Well.

If you're able to contribute to that discussion it would be great to find improvements that work for as many use-cases as possible.
Thanks,
Will.

camFoltz (Author) commented:

Thanks @will-moore, I see the PR, but if you still need more help, let me know! I will look over the discussion and contribute our team's input.
