-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bioformats2raw export: unify Zarr layout and add OMERO metadata #76
Conversation
- pass --series argument using image.series - pass --no-root-group and --no-ome-meta - rename exported image folder as image_id.zarr
Pass -profile black to isort pre-commit to solve incompatibility when committing
abs_path = Path("/") / Path(p) | ||
else: | ||
if self.client is None: | ||
raise Exception("This cannot happen") # mypy is confused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This happens if the field is ever assigned with a None
like in
omero-cli-zarr/src/omero_zarr/cli.py
Line 70 in 0835e4f
self.client = None # type: ignore |
I don't know if changing that to del self.client
keeps mypy and everyone else happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is still confusing to me. I don't understand why mypy
does not throw the same error with self.gateway
which has the same behavior as self.client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's certainly an element of magic involved, at least partially related to finally
s and other if
checks.
Initial set of sample files generated with and without --bf and uploaded to a temporary public bucket for comparison From the performance perspective
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor thoughts but no real worries.
@@ -294,7 +308,21 @@ def _bf_export(self, abs_path: Path, args: argparse.Namespace) -> None: | |||
if stderr: | |||
self.ctx.err(stderr.decode("utf-8")) | |||
if process.returncode == 0: | |||
self.ctx.out(f"Image exported to {target.resolve()}") | |||
image_source = target / "0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's always 0
despite the series index? Interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes --series x,y
is removing the series from the root metadata object and hence reindexing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunate, it means that you can't then make use of OME-XML if enabled.
) | ||
root = open_group(store) | ||
add_omero_metadata(root, image) | ||
add_toplevel_metadata(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this just sets _creator
. Is it almost not more appropriately bf2raw, or a mix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely a mix. There is some Bio-Formats metadata under multiscales/metadata
but more could be captured.
Happy to look into capturing that but I am a bit worried you'll ask me to solve #48 ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to, ... 😉
This worked once I'd updated to the One difference I noticed was in the number of pyramid levels exported is different. Known difference: |
Thanks.
Yes that's one of the implementation differences which comes from different defaults in the maximal size for the smallest resolution: 96 for omero-cli-zarr
Yes that's captured as glencoesoftware/bioformats2raw#113 |
) | ||
export.add_argument( | ||
"--max_workers", default=None, help="For use with bioformats2raw" | ||
"--max_workers", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not specific to this PR, but I could imagine having a method for setting arbitrary bioformats2raw
properties:
--bf:max_workers=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting. Could even go as far as --bf2raw-config=<config_file>
with an INI style of YAML list of key/value pairs to be passed to bioformats2raw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, if you're going to go that far, I'd take a config file for all of the properties here, and separate into [bioformats2raw]
and [default]
(or [omero]
) ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we turn this into an issue?
Another large scale example of usage of this PR. Trying to convert the >1TB lightsheet dataset from McDole et al (https://doi.org/10.17867/10000116):
So after 2.5 days of processing + S3 upload, the data can be viewed from https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/omero-cli-zarr_76/bf/4007801.zarr |
From the outstanding points of this discussion, the command-line overhaul has been turned into an issue. And glencoesoftware/bioformats2raw#114 should allow to align the number of resolutions generated with/without Any objections to getting this merged @joshmoore @will-moore ? I would propose a release of the plugin with bioformat2raw 0.3.0 support and start capturing the next items to review as issues. |
SGTM 👍 |
Follow-up of #75, this PR:
omero export Image:<id>
. The image series is passed to bioformats2raw and the unique image group (<fileset>/0
) is renamed as<id>.zarr
raw_pixels
The
add_omero_metadata
andadd_toplevel_metadata
are arguably outside the scope of theraw_pixels
module and could be moved elsewhere (utils
? new module?)