SpatialData lacks Observation Window Metadata #458
Comments
Hello @MobiTobi! Thank you for writing out this issue. I have a few clarifying questions:
I'm not sure I understand these questions. Can you please clarify?
I think that there should be standardized observation window metadata for elements.
A shapely Polygon or MultiPolygon would be enough to model
Sure :) I see how I could have phrased it better. With the questions I wanted to point out the ambiguities that come with storing windows on the same level as the observation data.
It tells which subset of the plane is the domain where the data was observed and is valid to include in subsequent analyses. The information needs to come from the experimentalists. It is basically an outline of the measurement area with holes for invalid regions. For many spatial omics datasets the observation window is obvious. In a perfect world everyone uses spatialdata :^) and a mistake like that will be impossible, because the metadata makes it obvious that it is outside of the regular osmFISH measurements.
Hi @MobiTobi, thanks for reporting this and for the explanation. Unfortunately, the data from many commercial technologies doesn't come with clear information on the observation window, so I would not add this as a standardized field. Instead, I think the information on the observation window should live at a different layer, one more focused on metadata and QC than the storage format used by SpatialData. @melonora is working in this direction, so this is something that can be considered in the future in that context.
A few more comments.
As you observed, the I'd also consider discussing this in the context of the NGFF data specification. Please see these two related discussions (even if not exactly on this topic): ome/ngff#31, ome/ngff#133.
Hi @MobiTobi, as @LucaMarconato mentioned, I am indeed working in this direction. In particular, I am working on schemas that would allow this information to be extracted and stored in, for example, an SQL database. There are a couple of things that are required to make this work in a truly FAIR manner. If you would like, I would be happy to set up a call to discuss.
We currently use shapes elements for the regions of measurement (or regions of interest), annotated with IDs in the table, which other elements then reference. Raster images inherently have bounds, but there, too, the relevant measurement is often only within a region of interest.
Thank you for the discussion. I will close the issue as developing a specification for the observation window is not in scope of
tl;dr:
Observation windows are a crucial piece of spatial metadata.
It should be possible to associate `SpatialData` objects and elements with observation windows.
Polygon elements can describe observation windows, but violate FAIR principles.
Before I provide two examples of why observation window metadata is crucial for the analysis of spatial data, I want to point out that other communities learned this lesson a long time ago [1].
Let's avoid the pitfalls they stumbled into a long time ago and steal their wisdom :^)
No Data != No Measurement
The observation window allows us to distinguish between the absence of measured data points and the absence of measurement.
For example, let's try to describe the density of a group of cells.
Estimates based on the area of the data bounding box and on the area of the microscopy slide will differ substantially:
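To make the difference concrete, here is a minimal sketch in plain Python. The cell coordinates and the 10 x 10 window are made-up values for illustration only, and the helper functions are hypothetical, not part of `spatialdata`.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


def bounding_box_area(points):
    """Area of the axis-aligned bounding box around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical cell centroids, clustered in a small patch of the slide.
cells = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.5, 2.5)]

# Hypothetical observation window: the full imaged area of 10 x 10 units.
window = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Same cells, two different denominators.
density_bbox = len(cells) / bounding_box_area(cells)
density_window = len(cells) / polygon_area(window)

print(density_bbox, density_window)
```

With these made-up numbers the two estimates differ by a factor of 100, although the cells are identical. Only the window-based estimate accounts for the area that was measured but found empty.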
Bounding Boxes estimated from data are ill defined
Without an observation window, users can fall back on `spatialdata.get_extent` to estimate a bounding box from the data.
Unfortunately, the results depend on the coordinate system:
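The following sketch illustrates one way such a dependence can arise, assuming the bounding box is axis-aligned in whichever coordinate system it is computed in. It uses plain Python with hypothetical helper functions and does not call `spatialdata` itself.

```python
import math


def bounding_box_area(points):
    """Area of the axis-aligned bounding box around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def rotate(points, degrees):
    """Rotate points around the origin, mimicking a rotated coordinate system."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Hypothetical data: four points on the corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

area_original = bounding_box_area(points)
area_rotated = bounding_box_area(rotate(points, 45))

print(area_original, area_rotated)
```

The points and their mutual distances are unchanged by the rotation, yet the bounding box area doubles. An explicitly stored observation window would be transformed together with the data and keep its area under a rotation.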
Workarounds
With the current implementation I see two possible workarounds, both suffering from the same major drawbacks:
In the first workaround, users can store observation windows as a polygon element.
Does the polygon apply to the complete `spatialdata` object?
Or just to one or multiple elements? And if so, to which elements?
Alternatively, users can store anything in the `.uns` attribute of the element table, resulting in a clear association between observation window and element but a bad user interface.
What can we do about it?
The obvious solution would be to add a `window` or `observation_window` attribute to `spatialdata` objects and their elements.
This would be easy to use and obey the "findable" requirement.
For now we should give users a heads-up about observation windows in the user manual.
The `spatialdata` target audience are biologists without formal training in spatial analysis.
For me it would not have been obvious at all to explicitly mark the observation window before sharing data with others.
What do you think about making observation windows first class metadata in `spatialdata`?
Do you have a good idea of how we could get there?
Footnotes
[1] Baddeley, Rubak, Turner. Spatial Point Patterns: Methodology and Applications with R. CRC Press, 2015.