Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Name option #147

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
9 changes: 9 additions & 0 deletions src/omero_zarr/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -278,6 +278,15 @@ def _configure(self, parser: Parser) -> None:
default=None,
help="Maximum number of workers (only for use with bioformats2raw)",
)
export.add_argument(
"--name_by",
default="id",
choices=["id", "name"],
help=(
"How to name the Image or Plate zarr. Default 'id' is [ID].zarr. "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That raises the question about the extension.
In both cases, the library is used to generate ome.zarr so in one case using .zarr and in the other case .ome.zarr is confusing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think we've stated elsewhere:

  • strongly SHOULD (bordering on MUST): .zarr
  • SHOULD: .ome.zarr

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshmoore You're suggesting we update the default behaviour to be [ID].ome.zarr instead of [ID].zarr?
Always using .ome.zarr (for name.ome.zarr or ID.ome.zarr) is more consistent, so I'm happy to do that, but that is more of a breaking change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking in the sense that you think current processes won't detect that something has already been exported and will re-export?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just if you've got a workflow like:

$ omero zarr export Image:1
$ mv 1.zarr renamed_image.zarr

that workflow will need to change because 1.zarr will now be created as 1.ome.zarr

But that's not a major concern, so happy to make that change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'm personally a bit up in the air. A benefit of .zarr filesets are that other stuff can be stored in them which is why I don't think we should ever require .ome.zarr. On the flip side, what we do now will likely be the model that other submitters to BIA tend to follow, so it does make sense to do what we think is most strategic (and update any existing workflows as necessary). All that being said, 💯 for doing whatever consistently.

"'name' is [NAME].ome.zarr"
),
)
export.add_argument(
"object",
type=ProxyStringType("Image"),
Expand Down
24 changes: 22 additions & 2 deletions src/omero_zarr/raw_pixels.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,12 +41,23 @@
from .util import marshal_axes, marshal_transformations, open_store, print_status


def sanitize_name(zarr_name: str) -> str:
# Avoids re.compile errors when writing Zarr data with the named root
# https://github.com/ome/omero-cli-zarr/pull/147#issuecomment-1669075660
return zarr_name.replace("[", "(").replace("]", ")")


def image_to_zarr(image: omero.gateway.ImageWrapper, args: argparse.Namespace) -> None:
target_dir = args.output
tile_width = args.tile_width
tile_height = args.tile_height
name_by = args.name_by

name = os.path.join(target_dir, "%s.zarr" % image.id)
if name_by == "name":
img_name = sanitize_name(image.name)
name = os.path.join(target_dir, "%s.ome.zarr" % img_name)
else:
name = os.path.join(target_dir, "%s.zarr" % image.id)
print(f"Exporting to {name} ({VERSION})")
store = open_store(name)
root = open_group(store)
Expand Down Expand Up @@ -245,7 +256,14 @@ def plate_to_zarr(plate: omero.gateway._PlateWrapper, args: argparse.Namespace)
total = n_rows * n_cols * (n_fields[1] - n_fields[0] + 1)

target_dir = args.output
name = os.path.join(target_dir, "%s.zarr" % plate.id)
name_by = args.name_by

if name_by == "name":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two quick notes:

  • this code is duplicated and could be made a method for possible sub-classing
  • I can live with this split in naming schemes, but it could be a surprise

plate_name = sanitize_name(plate.name)
name = os.path.join(target_dir, "%s.ome.zarr" % plate_name)
else:
name = os.path.join(target_dir, "%s.zarr" % plate.id)

store = open_store(name)
print(f"Exporting to {name} ({VERSION})")
root = open_group(store)
Expand Down Expand Up @@ -296,6 +314,8 @@ def plate_to_zarr(plate: omero.gateway._PlateWrapper, args: argparse.Namespace)
print_status(int(t0), int(time.time()), count, total)

# Update plate_metadata after each Well
if len(well_paths) == 0:
continue
write_plate_metadata(
root,
row_names,
Expand Down
2 changes: 1 addition & 1 deletion src/omero_zarr/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ def print_status(t0: int, t: int, count: int, total: int) -> None:
"""
percent_done = float(count) * 100 / total
dt = t - t0
if dt > 0:
if dt > 0 and count > 0:
rate = float(count) / (t - t0)
eta_f = float(total - count) / rate
eta = time.strftime("%H:%M:%S", time.gmtime(eta_f))
Expand Down