-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check on c_coords
should be one the output data in map_raster
#597
Comments
CC @ArneDefauw |
I've added PR for this issue, see #599 I wonder if support should also be added to spatialdata for relabeling across chunks to avoid label collisions if going from image->labels |
Thanks! That's an interesting API! It could be implemented making use of the new |
yes, indeed. Would you prefer adding it as a parameter to Wouldn't be too difficult to implement, I can prepare a PR if you would be interested. |
that sounds really useful @ArneDefauw ! I think it depends a bit, if it's just a single argument, then maybe can be done in map_raster? as in, we have already a very extensive list of functions/methods in the repo, but I also understand on striking a balance of composability and fleixiblity |
Good point! If using |
Developing more on this, I guess that signature of the separate function would be relabel(labels: DataArray | DataTree, relabel: dict[str: str]) while the usage with
So pretty much equivalent. I propose to:
|
I agree, I think we can first try to wrap it in map_raster. @LucaMarconato , I don't think I understand the type of relabel (
...
|
I think it sounds great @ArneDefauw ! I agree on the type of |
I believe we are thinking of different relabeling use cases. my interpretationWhat I was thinking is of having the relabeling operation as a function that takes as input a mapping between the old labels values and the new label values, with the extra assumption that the mapping is injective. The type would be The idea is that we could have the unique values The relabeling function is straightforward to implement, and I guess the most common use case would be to give a set of indices without gaps, like in the example above (where the gap between 1, 3 and 3, 5, disappears). your interpretationI haven't understood the use case for your code, could you please explain it? What are the properties of the new relabeling?
final comment
I agree, if the relabeling is performed we could also take care of ensuring the annotating table still match. |
The relabelling is needed because the segmentation masks ids are unique within each block, but they are not connected in the global mask, hence you need to relabel based on the connected segmentation masks across blocks, you can see how it is done in squidpy here: https://github.com/scverse/squidpy/blob/7c5c60c47316f3f75f2d73aa684d13bccfbc2662/src/squidpy/im/_segment.py#L105-L135 I am not sure there is a realistic use case when the user wants to pass a set of segmentation masks ids used to relabel the existing ids. In that case however, it could simply be a lambda function + |
Now I understand the confusion @LucaMarconato , we are indeed thinking of different use cases :). I think the following use cases can be isolated.
a) Segmentation. If your callable converts image layers to labels layers, you are basically doing segmentation, and you need a way to handle the predictions done by e.g. cellpose on each block. Cellpose, or any segmentation algorithm will predict labels ranging from 1 to say 100 on each block. So different cells in different blocks can be assigned the same label, so hence you need a way to make the labels unique across blocks. 'Does your code guarantee that the relabeling is injective? I could imagine that if block_num is something like 0xffffffff, then all the labels would collapse to that number.''
And if you then would have labels ranging up to max_uint16+1 in a block, then they would be mapped to 0 if you do max_uint16<<16. As @giovp mentioned, after this relabeling, you still need a way to connect them in the global mask. I found that squidpy's way of connecting them, using dask-image functions like label_adjancency_graph, leads to issues for dense cell regions . So either we choose another method, e.g. as in https://github.com/gletort/NeubiasPasteur2023_AdvancedCellPose/blob/main/distributed_segmentation.py (this method worked better); or we let the user connect the predicted mask by the algorithm of choice. I.e., we let the user solve the chunking artifacts on the border of the chunks. b) Merging labels layers, i.e. [ label_layer_1, label_layer_2 ] -> merged_label_layer is currently not supported by map_raster because we do not allow passing more than one DataArray. But you would want to handle relabeling in same way as in a. |
Ok thank you for the explanation, now it's clear.
I would default for the sequential labeling then. We are thinking of having a package/some code that deals with image feature extraction. This function could live in such a package and be implemented, as Giovanni suggested, by passing the correct lambda function to |
You are right, it would be ok. |
If I am getting the problem right, you are trying to have a way to detect cells that overlapped to a block border, and make sure that their label value is the same globally. In such a case, in SOPA, the external segmentation algorithms return polygons/multipolygons (or when they return raster, the segmentation is converted to polygons/multipolygons), and then a conflict resolution strategy is applied to the borders. IIRC basically with the same short code shown here (or with some equivalent code):
Practically speaking, I would not worry about connecting adjacent pixels from different blocks here. The user can always do this manually or rely on strategies such as the one implemented in SOPA, to achieve that. Please let me know what you think about this. |
mmh I don't think this requires a separate package, since this processing step might be required right after a |
@giovp I meant the relabeling of raster types that I described under "my intepretation" here #597 (comment) (sorry for the confusion). Instead the fix described under "your interpretation" should indeed be implemented here, and I am now convinced that the byte shifting approach would be good. In conclusion, @ArneDefauw if you could implement the bit shifting approach that would be amazing! And to regard to the "relabeling of segmentation masks as of my interpretation", I'd skip the implementation for now. |
Ok, I will implement the bit shifting approach! Should not be too difficult to implement, i can recycle some code here and there :) Re connecting adjacent pixels from different blocks. I have not tested the method implemented by SOPA, but in my opinion having to convert to a shapes layer to fix these chunking artifacts is an unnecessary conversion step; these chunking arifacts can be fixed working on raster data only, and prevents having to go back and forth between a shapes layer and a labels layer. Also, I wonder if a z dimension is supported if you convert to a shapes layer to fix these artefacts. So if I understand correctly, for now we do not implement the optional sequential relabeling? |
Thanks a lot!
Nice, let's keep this open for a follow up PR. I'd go for the name
Maybe let's have this just as internal function but I would not expose this in a separate API now. |
You are right, the check on
c_coords
should be one the output data. I merged already because I need to merge #594, which was opened against the current PR. Please can you make a patch for fixing that behavior?Originally posted by @LucaMarconato in #588 (comment)
The text was updated successfully, but these errors were encountered: