Support arrays of mixed-dimensionality #150
Very good overview @tcompa
Actually, arrays can be n-dimensional. We always expect YX to be there; anything else is optional. There will often be Z (though not always; we'll need to make the 2D-only case work as well, see #124). There will often be multiple channels (those can typically just be looped over), and there may be time information (sometimes to be looped over, i.e. processing timepoint by timepoint, e.g. for segmentation; at other times we'll need to process the whole time series at once, e.g. for tracking).
That seems good to me. We can be somewhat conservative in adding dimensions. Let's make sure 2D-only data (#124) and time data (#169) can be parsed, but hold off on more complex logic.
=> Sounds good to me. Let's add complexity where needed for the two issues above. I'll work on small test sets. The 2D one is ready; the time one I still need to look into.
This seems like a very good approach to make sure we're stable when users start introducing different dimensions, while we only have specific ones.

Let's:
Good point. But our current approach should scale for quite a while, I hope. Let's re-assess this if it becomes necessary.
Adding to this issue: the work in https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/557/files introduces some new functions. In the future, these new functions will also need to be made more flexible (that is, they should not always require the Z pixel size).
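(As a rough, hedged sketch of what "not always requiring the Z pixel size" could look like in practice; the function name, signature, and index ordering below are hypothetical and not taken from that PR:)

```python
from typing import Optional

def roi_to_indices(
    roi,                                    # (x, y, z, len_x, len_y, len_z), in micrometers
    pixel_size_x: float,
    pixel_size_y: float,
    pixel_size_z: Optional[float] = None,   # optional, so 2D datasets need no Z size
) -> list[int]:
    """Convert a physical-units ROI into array indices (hypothetical helper).

    When pixel_size_z is None, the Z indices are simply omitted and only
    (start_y, end_y, start_x, end_x) are returned.
    """
    x, y, z, len_x, len_y, len_z = roi
    indices = [
        round(y / pixel_size_y), round((y + len_y) / pixel_size_y),
        round(x / pixel_size_x), round((x + len_x) / pixel_size_x),
    ]
    if pixel_size_z is not None:
        indices = [round(z / pixel_size_z), round((z + len_z) / pixel_size_z)] + indices
    return indices
```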
At the moment all our image arrays are 4D (CZYX) and each one of our label arrays is 3D (ZYX). This property is visible in the `.zarray` files, and in the folder structure. When the dimension along Z is dummy (a single Z plane), we still use the 4D/3D structure, with shapes like `(num_channels, 1, num_y, num_x)` or `(1, num_y, num_x)`. Also ROIs are defined in the same way: they are always 3D shapes (defined by 6 numbers), and in some cases the Z part is dummy (starting at 0 and ending at `pixel_size_z`, corresponding to a single pixel).
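(As a concrete illustration, a minimal sketch of this convention; the shapes, pixel size, and ROI field names below are hypothetical examples, not taken from an actual dataset:)

```python
import numpy as np

# Image data are 4D CZYX, even when Z is a dummy axis with a single plane
num_channels, num_y, num_x = 2, 2160, 2560
image = np.zeros((num_channels, 1, num_y, num_x), dtype=np.uint16)  # (C, Z, Y, X)

# Label data are 3D ZYX, again with a dummy single Z plane
label = np.zeros((1, num_y, num_x), dtype=np.uint32)  # (Z, Y, X)

# A ROI is always a 3D box defined by 6 numbers (origin and length per axis,
# in physical units); here the Z extent spans a single pixel of size pixel_size_z
pixel_size_z = 1.0
roi = {
    "x_micrometer": 0.0,
    "y_micrometer": 0.0,
    "z_micrometer": 0.0,
    "len_x_micrometer": 416.0,
    "len_y_micrometer": 351.0,
    "len_z_micrometer": pixel_size_z,  # dummy Z: starts at 0, ends at pixel_size_z
}
```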
The perspective is that we will handle arrays with mixed dimensions, which can be up to 5D (TCZYX) but may also lack some of the intermediate axes (like TCYX), see #149 (comment):
Broadly speaking, a possible (preliminary!) plan to support this general case would be to:

1. Generalize the zarr-creation tasks (`create_zarr_structure` and `yokogawa_to_zarr`) so that they build arrays with the appropriate dimensionality.
2. Describe the axes explicitly in the OME-NGFF metadata (named axes), instead of assuming a fixed CZYX/ZYX layout.
3. Make the downstream task functions able to process arrays of whatever dimensionality they receive.
Re: point 1

This means that `create_zarr_structure` and `yokogawa_to_zarr` would include more logic, to choose the right structure of the target zarr array. This may include something like explicit user-provided parameters on the structure one should expect, or inference from the metadata if that's sufficiently robust. As always, the simplest is to have a couple of small test folders with different cases (e.g. CZYX, TCZYX, TCYX, and YX?).
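(A minimal sketch of what such logic could look like; `choose_axes` and its arguments are hypothetical, not functions that exist in the current tasks:)

```python
from typing import Optional

def choose_axes(
    has_time: bool,
    num_channels: int,
    num_z_planes: int,
    override: Optional[list[str]] = None,
) -> list[str]:
    """Decide the axis names of the target zarr array.

    `override` stands in for an explicit user-provided parameter; the other
    arguments stand in for values inferred from the metadata.
    """
    if override is not None:
        return override
    axes = ["y", "x"]
    if num_z_planes > 1:
        axes.insert(0, "z")
    if num_channels > 1:
        axes.insert(0, "c")
    if has_time:
        axes.insert(0, "t")
    return axes

# choose_axes(False, 3, 10)  -> ["c", "z", "y", "x"]   (CZYX)
# choose_axes(True, 3, 1)    -> ["t", "c", "y", "x"]   (TCYX)
# choose_axes(False, 1, 1)   -> ["y", "x"]             (YX)
```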
Re: point 2

This may be a bit complex, but the nice advantage is that we would be moving even closer to the OME-NGFF specs.
Note that sometimes we already have to specify named axes in the OME-NGFF metadata, e.g. in `fractal-tasks-core/fractal_tasks_core/napari_workflows_wrapper.py`, lines 204 to 215 at commit f85f880.
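(For reference, named axes in an OME-NGFF v0.4 multiscales block look roughly like this, written as the Python object that gets serialized into `.zattrs`; the axis list is an illustrative example, not copied from the file above:)

```python
# Illustrative "axes" entry of an OME-NGFF (v0.4) multiscales block, as the
# Python object that would be dumped to .zattrs. Just an example of named axes.
axes = [
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
```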
Re: point 3
It should not be too challenging for functions with numpy arrays as inputs/outputs (thanks to broadcasting rules). It could be a bit trickier with dask arrays, but my feeling is that we are currently moving in a direction where dask is mostly used to lazily load arrays and to organize the processing of several small parts (note that this could change, e.g. if we push towards in-task ROI parallelization, in which case we may need to depend more heavily on dask arrays; to be assessed).
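(A toy illustration of why numpy broadcasting helps here; `rescale_intensity` is a hypothetical function, not an existing task:)

```python
import numpy as np

def rescale_intensity(img: np.ndarray) -> np.ndarray:
    """Rescale values to [0, 1].

    Works unchanged for YX, ZYX, CZYX, TCZYX, ... inputs, because the
    elementwise operations broadcast over whatever leading axes are present.
    """
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

# rescale_intensity(np.random.rand(64, 64))          # YX
# rescale_intensity(np.random.rand(2, 10, 64, 64))   # CZYX
```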