Validate ome.zarr file with the python library? #400

Open
constantinpape opened this issue Aug 30, 2024 · 7 comments
Comments

@constantinpape
Contributor

Is there a way to validate an ome.zarr file with the Python library to check whether it follows the spec?

I checked the documentation, but could not find a dedicated function for it.

The closest I found was ome_zarr info, but for the file I want to validate it does not yield any output:

$ ome_zarr info ngff-v2/Platynereis-H2B-TL.ome.zarr

I have made the file available for testing here: https://drive.google.com/file/d/1WSHQWkOXUfSBahOJdVzrKbj91mczKhZP/view?usp=sharing.

Any other way to validate the file would also be fine with me.

@d-v-b

d-v-b commented Aug 30, 2024

Try this (I definitely need to add a CLI like this to pydantic-ome-ngff):

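# /// script
# requires-python = ">=3.9"
# dependencies = [
#   "pydantic-ome-ngff==0.6.1",
#   "zarr < 3.0.0"
# ]
# ///
import sys

import zarr
from pydantic_ome_ngff.v04 import MultiscaleGroup

# Path to a single Zarr group, passed as the first command-line argument.
fname = sys.argv[1]
# Open read-only so that validation cannot modify the data.
group = zarr.open_group(fname, mode='r')
print(f'validating {fname}')
try:
    # Build the v0.4 multiscale model from the group; this raises if the metadata is invalid.
    MultiscaleGroup.from_zarr(group)
    print(f'validation of {fname} succeeded.')
except ValueError as e:
    print(f'validation failed with the following message: {e}')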

Invoke it with hatch, uv, or any other tool that understands Python inline script metadata (PEP 723):

bennettd@dvb-desktop-0 ➜  pydantic-ome-ngff git:(main) ✗ hatch run validate.py ~/Downloads/Platynereis-H2B-TL.ome.zarr/c0-t0
validating /home/bennettd/Downloads/Platynereis-H2B-TL.ome.zarr/c0-t0
validation of /home/bennettd/Downloads/Platynereis-H2B-TL.ome.zarr/c0-t0 succeeded.

According to my tool, c0-t0 is a valid multiscale group, but the root group is not, because the root group does not contain the multiscales metadata (as expected, I think).
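
To see at a glance which groups in a fileset carry the multiscales metadata, a minimal sketch along these lines may help. It assumes a Zarr v2 directory store on local disk (groups marked by a `.zgroup` file, attributes in `.zattrs`) and only checks for the presence of the `multiscales` key; it is not a spec validation. The function name `find_multiscale_groups` is made up for this illustration.

```python
import json
from pathlib import Path


def find_multiscale_groups(root):
    """Map each Zarr v2 group under `root` to whether its .zattrs has a 'multiscales' key."""
    root = Path(root)
    result = {}
    # In a Zarr v2 directory store, every group directory contains a `.zgroup` file.
    for zgroup in sorted(root.rglob(".zgroup")):
        group_dir = zgroup.parent
        attrs_file = group_dir / ".zattrs"
        # A group without a .zattrs file simply has no attributes.
        attrs = json.loads(attrs_file.read_text()) if attrs_file.exists() else {}
        # Key by path relative to the root; the root itself is reported as ".".
        result[group_dir.relative_to(root).as_posix()] = "multiscales" in attrs
    return result
```

For a layout like the one described above (a root group without multiscales and a child group c0-t0 with it), this reports False for "." and True for "c0-t0".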

@d-v-b

d-v-b commented Aug 30, 2024

When I look at your metadata, the only thing that stands out (besides the lack of a translation transformation for each scale level) is the unit. It is non-standard, but the spec is not normative about units, so that shouldn't be a validation error.
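
Observations like these can be surfaced as notes rather than errors. The sketch below is one way to do that for a single entry of the `multiscales` list, assuming the v0.4 layout with `axes` and `datasets[*].coordinateTransformations`. The function name `lint_multiscale` is hypothetical, and the unit set is only an illustrative subset, not the full list recommended by the spec.

```python
# Illustrative subset only; the OME-NGFF spec lists the full set of recommended units.
KNOWN_UNITS_SUBSET = {"nanometer", "micrometer", "millimeter", "second", "millisecond"}


def lint_multiscale(multiscale, known_units=KNOWN_UNITS_SUBSET):
    """Return human-readable notes (not validation errors) for one `multiscales` entry."""
    notes = []
    # Flag axis units outside the known set; units are a recommendation, so this is only a note.
    for axis in multiscale.get("axes", []):
        unit = axis.get("unit")
        if unit is not None and unit not in known_units:
            notes.append(f"axis {axis.get('name')!r}: unit {unit!r} is not in the known-units set")
    # Flag scale levels that define no translation transformation.
    for dataset in multiscale.get("datasets", []):
        kinds = [t.get("type") for t in dataset.get("coordinateTransformations", [])]
        if "translation" not in kinds:
            notes.append(f"dataset {dataset.get('path')!r}: no translation transformation")
    return notes
```

Keeping these as notes matches the point above: neither a missing translation nor an unusual unit should make the file invalid.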

@joshmoore
Member

ome_zarr view ...fileset... will use the JavaScript-based validator locally.

@constantinpape
Contributor Author

Thanks for the feedback:

  • regarding the metadata in the file: indeed, the problem seems to be that there is no multiscales metadata in the root group.
  • pydantic_ome_ngff: good to know about this :)
  • ome_zarr view: nice! although for the specific use case I would need something that works without a browser (to validate directly on a cluster)

The answers take care of my immediate issue, but I think it would be nice to have a straightforward CLI for validation (ideally here in the library, but via pydantic_ome_ngff would also be a good solution). I will leave the issue open, but feel free to close it if it is not relevant.
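
For use in batch jobs on a cluster, the main thing such a CLI needs beyond the check itself is a meaningful exit code. Here is a minimal sketch of that plumbing using only argparse. It is not an existing command in ome-zarr-py or pydantic-ome-ngff; `validate_paths` and `main` are hypothetical names, and `validate` stands for any callable that raises ValueError on an invalid group, as the snippet earlier in this thread does.

```python
import argparse
import sys


def validate_paths(paths, validate):
    """Run `validate` on each path and collect (path, message) pairs for the failures."""
    failures = []
    for path in paths:
        try:
            validate(path)
        except ValueError as err:
            failures.append((path, str(err)))
    return failures


def main(argv, validate):
    """Parse group paths from `argv`, validate them, and return a process exit code."""
    parser = argparse.ArgumentParser(description="Validate OME-Zarr groups (sketch).")
    parser.add_argument("paths", nargs="+", help="paths of the Zarr groups to validate")
    args = parser.parse_args(argv)
    failures = validate_paths(args.paths, validate)
    for path, message in failures:
        # Report failures on stderr so that stdout stays clean for scripting.
        print(f"{path}: {message}", file=sys.stderr)
    # Exit code 0 means every path validated; 1 means at least one failed.
    return 1 if failures else 0
```

A script would wire this up with something like `sys.exit(main(sys.argv[1:], my_validator))`, where `my_validator` is whichever check is chosen.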

@d-v-b

d-v-b commented Aug 30, 2024

but I think it would be nice to have a straightforward CLI for validation

Agreed. For that effort, it would help to know exactly what you want to validate. There are a few scenarios:

  • validate that a single Zarr group is a valid multiscale group (what my code snippet checks)
  • validate that all of the immediate sub-groups of a Zarr group are multiscale groups (I think this is what you want)
  • validate that all groups in a hierarchy are either multiscale groups, or contain multiscale groups (a generalization of your situation)

These correspond to rather different data models.
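
The difference between the three scenarios can be sketched as three small functions over a Zarr v2 directory store. This is only one possible reading: all names are hypothetical, the default `check` is a stand-in that merely looks for a `multiscales` key in `.zattrs` (a real validator could be passed instead), and the third function treats a group with no multiscale metadata and no sub-groups as a failure.

```python
import json
from pathlib import Path


def has_multiscales(path):
    """Stand-in check: True if the group's .zattrs carries a 'multiscales' key."""
    attrs = path / ".zattrs"
    return attrs.is_file() and "multiscales" in json.loads(attrs.read_text())


def subgroups(path):
    """Immediate sub-groups: child directories that contain a `.zgroup` file."""
    return [p for p in sorted(path.iterdir()) if p.is_dir() and (p / ".zgroup").is_file()]


def single_group_ok(path, check=has_multiscales):
    """Scenario 1: the group itself is a multiscale group."""
    return check(path)


def children_ok(path, check=has_multiscales):
    """Scenario 2: every immediate sub-group is a multiscale group."""
    children = subgroups(path)
    return bool(children) and all(check(child) for child in children)


def hierarchy_ok(path, check=has_multiscales):
    """Scenario 3: each group is a multiscale group or contains only groups that satisfy this."""
    if check(path):
        return True  # do not descend into the contents of a multiscale group
    children = subgroups(path)
    return bool(children) and all(hierarchy_ok(child, check) for child in children)
```

A root group whose children are all multiscale groups passes the second and third functions but not the first, which is why the choice of scenario matters for the data model.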

@constantinpape
Contributor Author

@d-v-b: for my use case, I would want to validate that all groups in a hierarchy are either multiscale groups or contain multiscale groups.

For some context: this issue arose while converting light-sheet data to ome.zarr for the ome-ngff-challenge. For this data we currently use the BDV.N5 data model and somehow need to map this to ome.zarr. See ome/ome2024-ngff-challenge#45 for details.

@joshmoore
Member

  • although for the specific use case I would need something that works without a browser (to validate directly on a cluster)

👍 for that in general.

3 participants