Suggestions for ShapelySTRTreeIndex #3
Comments
"Exact alignment" is confusing, it would either mean:
Does shapely provide ways to do 1 and/or 2 efficiently? Is it possible to create and use a "classic" |
Thanks for this!
+1 on that
We can potentially move the tree creation to
I think it is worth exploring, but I'll need to spend a bit more time with this to wrap my mind around the idea and implementation details. It is certainly not a top priority right now.
Geopandas takes the bounds passed to
On this I am not sure, especially because I don't have a mental model of how xarray's join works under the hood.
A perfect match can be tested using pandas Index can be created and in the example notebook it is a pandas index what I did before you created the custom one. It just has a bit limited functionality. One other thing I would add is to support query-based selection for arrays (already have it on a local branch, will push later). |
That would be already useful. Here is a broader proposal: instead of a
Assuming that it is reasonably cheap to create both a shapely tree (lazy) and a pandas index from an array of geometries…
Ah interesting! I'm curious how the resulting DataArray / Dataset looks.
That sounds good! I assume it will be relatively cheap to create both. I guess that if we subclass
see #4 |
@martinfleis I'm happy to give it a shot, unless you are already on it? |
Go for it! |
Here are a few suggestions for `ShapelySTRTreeIndex`:

Automatically create a new index from selected geometries
Currently, calling `.sel()` or `.isel()` with a coordinate or dimension backed by a `ShapelySTRTreeIndex` won't add a new index in the resulting Dataset / DataArray. Users have to manually call `.set_xindex()` again if they want to further narrow the selection.

We should probably create a new index instance automatically for convenience. This is supported by the Xarray `Index` API.

Lazy index
Building the R-Tree may be expensive. If we allow automatic creation of new index instances as suggested above, it is probably a good idea to defer the R-Tree build until it is actually needed. If I understand correctly, Geopandas follows a similar approach?
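To make the deferred build concrete, here is a minimal sketch of one way it could be done, using `functools.cached_property`. The `LazyTreeHolder` class is hypothetical and only illustrates the idea; it is not the actual index implementation and not part of the Xarray `Index` API.

```python
from functools import cached_property

import numpy as np
import shapely


class LazyTreeHolder:
    """Hypothetical helper: the STRtree is only built on first access."""

    def __init__(self, geometries):
        # Storing the geometries is cheap; no tree is built here.
        self.geometries = np.asarray(geometries, dtype=object)

    @cached_property
    def tree(self):
        # Built once, on first access, then cached on the instance.
        return shapely.STRtree(self.geometries)

    def query(self, geometry, predicate=None):
        # Accessing self.tree triggers the build if it has not happened yet.
        return self.tree.query(geometry, predicate=predicate)


holder = LazyTreeHolder(shapely.points([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))
built_before = "tree" in holder.__dict__  # no tree yet
hits = np.sort(holder.query(shapely.box(0.5, 0.5, 2.5, 2.5), predicate="intersects"))
built_after = "tree" in holder.__dict__  # tree now cached
```

With this pattern, index instances created automatically during selection would stay cheap unless a spatial query is actually run on them.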
Lazy or direct build could be controlled by an `__init__` option. When setting the index explicitly via `.set_xindex()`, deferring the build is probably not what users want.

Support arbitrary dimensions
This may not be high-priority, but it might be useful to support coordinates with an arbitrary number of dimensions. A few use-cases are detailed here.
I'm not sure how this would work for "orthogonal" indexing (i.e., each dimension indexed separately) since the selection result may be a sparse n-d coordinate, but for "point-wise" advanced indexing (see docs) this could work pretty well. Here is an example implementation for converting the selected indices returned by a flat index back to the original coordinate shape.
Other index operations (alignment, roll, etc.) may be hard to support with n-d coordinates.
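As a rough illustration of the flat-to-n-d conversion mentioned above (a sketch under my own assumptions, not the linked implementation), the tree can be built from the raveled coordinate and the query results mapped back with `np.unravel_index`. The small grid of points is made up for the example.

```python
import numpy as np
import shapely

# A 2 x 3 grid of points standing in for a 2-d geometry coordinate.
x, y = np.meshgrid([0.0, 1.0, 2.0], [0.0, 1.0])
geoms = shapely.points(x, y)  # object array of shape (2, 3)

# The tree itself only sees a flat, 1-d view of the coordinate.
tree = shapely.STRtree(geoms.ravel())

# Flat positions of the points that intersect the query box.
flat = np.sort(tree.query(shapely.box(0.5, -0.5, 2.5, 0.5), predicate="intersects"))

# Convert flat positions back to one index array per original dimension.
rows, cols = np.unravel_index(flat, geoms.shape)
```

The `rows` and `cols` arrays could then be wrapped in DataArray indexers sharing a new dimension and passed to `.isel()` for point-wise selection.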
Support `slice` objects passed to `sel`?

Inspired by GeoSeries.cx. Here it would look like `ds.sel(geom_coord=(xslice, yslice))`. That said, using a `shapely.box` already works great, so slices may not add much value here.

Spatial join and Xarray alignment
Should we leverage the full capabilities of `ShapelySTRTreeIndex` for automatic alignment of Xarray objects with geometry coordinates? It may be tricky:

- `xarray.align()` exposes a `join` option but doesn't support arbitrary options. It would be hard to add such support, as this function is used extensively in Xarray internals for many operations. We would need to find other ways of passing `predicate` to the underlying `shapely.STRtree.query()` (e.g., an index `__init__` option? a context manager?).
- Alignment goes through `Index.join()` and `Index.reindex_like()`, and those two methods are called at different stages of the procedure. Not sure how we could avoid two repetitive calls to `shapely.STRtree.query()`.

Probably it is a bad idea and it'd be better that:

- `ShapelySTRTreeIndex` only supports exact alignment
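For context, here is a small sketch of the two behaviours being contrasted: a predicate-based bulk query (what a spatial join would rely on) and a strict element-wise comparison (one possible reading of "exact alignment", which is my assumption). The geometries are made up for the example.

```python
import numpy as np
import shapely

left = shapely.points([0.0, 1.0, 5.0], [0.0, 1.0, 5.0])
right = np.array(
    [shapely.box(-0.5, -0.5, 1.5, 1.5), shapely.box(4.0, 4.0, 6.0, 6.0)],
    dtype=object,
)

# Spatial join: for an array input, query() returns a (2, n) array where
# row 0 holds positions in `right` and row 1 holds positions in the tree.
tree = shapely.STRtree(left)
pairs = tree.query(right, predicate="contains")
matches = set(zip(pairs[0].tolist(), pairs[1].tolist()))


def exactly_aligned(a, b):
    # Assumed meaning of "exact": same shape and same geometries, in order.
    return a.shape == b.shape and bool(shapely.equals_exact(a, b, tolerance=0.0).all())


same = exactly_aligned(left, left.copy())
shuffled = exactly_aligned(left, left[::-1])
```

The exact check needs no `predicate`, which is why it fits the existing `join` option of `xarray.align()` without extra arguments.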