Pythonic and parallelizable I/O for N-dimensional imaging data with OME metadata

iohub

N-dimensional bioimaging produces data and metadata in various formats, and iohub aims to become a unified Python interface to the most common formats used at the Biohub and in the broader imaging community.

Supported formats

Read

  • OME-Zarr (OME-NGFF v0.4)
  • Micro-Manager TIFF sequence, OME-TIFF (MMStack), and NDTiff datasets
  • Custom data formats generated by Biohub microscopes
    • Supported: Falcon (PTI), Dorado (ClearControl), Dragonfly (OpenCell OME-TIFF), Mantis (NDTiff)
    • WIP: DaXi

Write

  • OME-Zarr
  • Multi-page TIFF stacks organized in a directory hierarchy that mimics OME-NGFF (WIP)

Quick start

Installation

Install a pre-release of iohub with pip:

pip install iohub

Or install the latest Git version:

git clone https://github.com/czbiohub-sf/iohub.git
pip install /path/to/iohub

For more details about installation, see the related section in the contribution guide.

Command-line interface

To check if iohub works for a dataset:

iohub info /path/to/data/

The CLI can show a summary of the dataset, point to relevant Python calls, and convert other data formats to the latest OME-Zarr version. To see the full CLI help message, type iohub or iohub [command] --help in the terminal.
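
When the same check needs to run from a script rather than a terminal, the command can be assembled and launched from Python. The following is a minimal sketch and not part of iohub: the helper names `info_command` and `run_info` are hypothetical, and it assumes the `iohub` executable installed by pip is on the PATH.

```python
import shutil
import subprocess


def info_command(dataset_path: str) -> list[str]:
    """Build the argument list for `iohub info <path>` (hypothetical helper)."""
    return ["iohub", "info", dataset_path]


def run_info(dataset_path: str) -> str:
    """Run `iohub info` and return its standard output.

    Assumes the `iohub` executable is on PATH. Raises FileNotFoundError if it
    is missing and subprocess.CalledProcessError on a non-zero exit status.
    """
    if shutil.which("iohub") is None:
        raise FileNotFoundError("iohub CLI not found; is the package installed?")
    result = subprocess.run(
        info_command(dataset_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_info("/path/to/data/")` would then return the same summary text that the terminal command prints.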

Working with OME-Zarr

Load and modify an example OME-Zarr dataset:

import numpy as np
from iohub import open_ome_zarr

with open_ome_zarr(
    "20200812-CardiomyocyteDifferentiation14-Cycle1.zarr",
    mode="r",
    layout="auto",
) as dataset:
    dataset.print_tree()  # prints the hierarchy of the zarr store
    channel_names = dataset.channel_names
    print(channel_names)
    img_array = dataset[
        "B/03/0/0"
    ]  # lazy Zarr array for the raw image in the first position
    raw_data = img_array.numpy()  # loads a CZYX 4D array into RAM
    print(raw_data.mean())  # does some analysis

with open_ome_zarr(
    "max_intensity_projection.zarr",
    mode="w-",
    layout="hcs",
    channel_names=channel_names,
) as dataset:
    new_fov = dataset.create_position(
        "B", "03", "0"
    )  # creates fov with the same path
    new_fov["0"] = raw_data.max(axis=1).reshape(
        (1, 1, 1, *raw_data.shape[2:])
    )  # max projection along Z axis and prepend dims to 5D
    dataset.print_tree()  # checks that new data has been written
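
The shape arithmetic of the projection step can be checked in isolation with plain NumPy. This sketch is not taken from the iohub documentation: it uses a small random array as a stand-in for `raw_data` and assumes, as the comment in the example states, that `raw_data` is a 4D CZYX array. Under that assumption, a reshape to `(1, 1, 1, Y, X)` only succeeds when there is a single channel, whereas inserting singleton T and Z axes keeps the channel axis intact for any number of channels.

```python
import numpy as np

# Stand-in for `raw_data`: 2 channels, 5 Z slices, 4 x 3 pixels (CZYX).
rng = np.random.default_rng(0)
raw_data = rng.random((2, 5, 4, 3))

# Max-intensity projection along Z (axis 1 of CZYX) gives a CYX array.
projection = raw_data.max(axis=1)

# Insert singleton T and Z axes to obtain a 5D TCZYX array.
projection_5d = projection[np.newaxis, :, np.newaxis, :, :]

print(projection.shape)     # (2, 4, 3)
print(projection_5d.shape)  # (1, 2, 1, 4, 3)
```

With a single channel, the result has the same shape and values as the reshape used in the example.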

For more about API usage, refer to the documentation and the example scripts.
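
In the example, the string "B/03/0/0" and the arguments of `create_position("B", "03", "0")` follow the same row/column/field-of-view layout, with the last path component naming the image array inside the position. A small sketch of that correspondence, using a hypothetical helper `split_hcs_path` that is not part of iohub:

```python
from typing import NamedTuple


class HCSArrayPath(NamedTuple):
    """Components of an array path in an HCS-layout store (illustrative only)."""

    row: str
    column: str
    fov: str
    array: str


def split_hcs_path(path: str) -> HCSArrayPath:
    """Split 'row/column/fov/array' into its four named components."""
    parts = path.strip("/").split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 'row/column/fov/array', got {path!r}")
    return HCSArrayPath(*parts)


parsed = split_hcs_path("B/03/0/0")
print(parsed.row, parsed.column, parsed.fov, parsed.array)  # B 03 0 0
```

The first three components are what the example passes to `create_position`, and the fourth is the key used in `new_fov["0"]`.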

Reading Micro-Manager TIFF data

Read a directory containing a TIFF dataset:

from iohub import read_micromanager

reader = read_micromanager("/path/to/data/")
print(reader.shape)

Why iohub?

This project is inspired by the existing Python libraries for bioimaging data I/O, including ome-zarr-py, tifffile and aicsimageio. They support some of the most widely adopted and/or promising formats in microscopy, such as OME-Zarr and OME-TIFF.

iohub bridges the gaps among them with the following features:

  • Efficient reading of data in various TIFF-based formats produced by the Micro-Manager/Pycro-Manager acquisition stack.
  • Efficient and customizable conversion of data and metadata from TIFF to OME-Zarr.
  • Pythonic and atomic access to OME-Zarr data with parallelized analysis in mind.
  • Automatic construction and updating of OME-Zarr metadata when writing, and verification against the specification when reading.
  • Adherence to the latest OME-NGFF specification (v0.4) whenever possible.
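
To illustrate what "parallelized analysis" over independent positions can look like, the sketch below maps a per-position function over several positions with a thread pool from the standard library. It only outlines the pattern: the `positions` dictionary is a hypothetical in-memory stand-in for fields of view in an OME-Zarr store, and `mean_intensity` is a placeholder analysis. Neither comes from iohub.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean

# Hypothetical stand-in: position path -> flattened pixel values.
positions = {
    "B/03/0": [1.0, 2.0, 3.0],
    "B/03/1": [4.0, 5.0, 6.0],
    "B/04/0": [7.0, 8.0, 9.0],
}


def mean_intensity(item):
    """Placeholder analysis: return (position name, mean pixel value)."""
    name, pixels = item
    return name, fmean(pixels)


# Each position is processed independently, so the work can be spread
# across workers without any coordination between tasks.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(mean_intensity, positions.items()))

print(results)
```

Because every task touches only its own position, the same structure carries over to process pools or cluster schedulers.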
