-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand scope of write_elem
wrapper
#500
Comments
Here are the typical ways we create tables. Let's see if/what we can abstract out of them. Create-ome-zarr task # Create tables zarr group for ROI tables
group_tables = group_image.create_group("tables/") # noqa: F841
well_id = row + column
# Prepare AnnData tables for FOV/well ROIs
FOV_ROIs_table = prepare_FOV_ROI_table(site_metadata.loc[well_id])
well_ROIs_table = prepare_well_ROI_table(
site_metadata.loc[well_id]
)
# Write AnnData tables in the tables zarr group
write_elem(group_tables, "FOV_ROI_table", FOV_ROIs_table)
write_elem(group_tables, "well_ROI_table", well_ROIs_table)
group_tables.attrs["tables"] = ["FOV_ROI_table", "well_ROI_table"] Copy-ome-zarr task # Copy the tables in ROI_table_names
for ROI_table_name in ROI_table_names:
logger.info(
f"I will now read {ROI_table_name} from "
f"{zarrurl_old=}, convert it to 2D, and "
"write it back to the new zarr file."
)
ROI_table = ad.read_zarr(
f"{zarrurl_old}/{well_path}/{image_path}/"
f"tables/{ROI_table_name}"
)
# Convert 3D FOVs to 2D
if project_to_2D:
new_ROI_table = convert_ROIs_from_3D_to_2D(
ROI_table, pxl_size_z
)
# Write new table
write_elem(
new_tables_group, ROI_table_name, new_ROI_table
) Cellpose task if output_ROI_table:
# Concatenate all ROI dataframes
df_well = pd.concat(bbox_dataframe_list, axis=0, ignore_index=True)
df_well.index = df_well.index.astype(str)
# Extract labels and drop them from df_well
labels = pd.DataFrame(df_well["label"].astype(str))
df_well.drop(labels=["label"], axis=1, inplace=True)
# Convert all to float (warning: some would be int, in principle)
bbox_dtype = np.float32
df_well = df_well.astype(bbox_dtype)
# Convert to anndata
bbox_table = ad.AnnData(df_well, dtype=bbox_dtype)
bbox_table.obs = labels
# Write to zarr group
group_tables = zarr.group(f"{in_path}/{component}/tables/")
write_elem(group_tables, output_ROI_table, bbox_table)
logger.info(
"Bounding box ROI table written to "
f"{in_path}/{component}/tables/{output_ROI_table}"
) Napari-workflows task # Output handling: "dataframe" type (for each output, concatenate ROI
# dataframes, clean up, and store in a AnnData table on-disk)
for (name, out_params) in dataframe_outputs:
table_name = out_params.table_name
# Concatenate all FOV dataframes
list_dfs = output_dataframe_lists[name]
if len(list_dfs) == 0:
measurement_table = ad.AnnData()
else:
df_well = pd.concat(list_dfs, axis=0, ignore_index=True)
# Extract labels and drop them from df_well
labels = pd.DataFrame(df_well["label"].astype(str))
df_well.drop(labels=["label"], axis=1, inplace=True)
# Convert all to float (warning: some would be int, in principle)
measurement_dtype = np.float32
df_well = df_well.astype(measurement_dtype)
df_well.index = df_well.index.map(str)
# Convert to anndata
measurement_table = ad.AnnData(df_well, dtype=measurement_dtype)
measurement_table.obs = labels
# Write to zarr group
group_tables = zarr.group(f"{in_path}/{component}/tables/")
write_elem(group_tables, table_name, measurement_table)
# Update OME-NGFF metadata
current_tables = group_tables.attrs.asdict().get("tables") or []
if table_name in current_tables:
# FIXME: move this check to an earlier stage of the task
raise ValueError(
f"{in_path}/{component}/tables/ already includes "
f"{table_name=} in {current_tables=}"
)
new_tables = current_tables + [table_name]
group_tables.attrs["tables"] = new_tables
# FIXME: also include table metadata, see issue #333 |
Something that clearly can be moved into This means that the first version of this function could look like def write_table(image_group, table_name, table, overwrite=False):
# Create tables group, if needed
...
# Check whether subgroup already exists, and proceed accordingly
...
# If it's all OK, proceed and write the table
...
# Update the `tables` metadata of the image group
... Let's see if we can include more general features in this function. |
The first prototype is available through f4f127e, comments on the scope are welcome. |
I like the approach to |
Looks good to me! We can then work more on this wrapper when we formalize ROIs more (e.g. what metadata should our default ROI tables have?). It's already a great first step towards it :) |
On the
write_elem
wrapper: can we take this opportunity to make this a broader wrapper that handles more of the table writing?write_elem
seems like a low level wrapper function. Would be great if tasks can just call awrite_table
or something like that with some parameters. Currently, many task replicate similar handling of the table metadata, updating the list of tables, checking for existing tables etc.This would be a good opportunity to generalize this and have a useful wrapper that sets the correct metadata for those tables, but that someone building a new task then doesn’t have to think about.
Originally posted by @jluethi in #485 (comment)
The text was updated successfully, but these errors were encountered: