Linking CRS with geospatial coordinate or data variables #5

Open
benbovy opened this issue Dec 5, 2024 · 2 comments

Comments

@benbovy
Owner

benbovy commented Dec 5, 2024

The main goal of xproj is to provide a reusable way to handle CRS across Xarray geospatial extensions. It does so using scalar coordinate variables with a CRSIndex, together with a .proj accessor.

However, one big challenge with this decoupled approach is: how to “link” the CRS-indexed coordinate(s) with other geospatial coordinates, indexes and/or data variables in an Xarray dataset or dataarray?

Some concrete examples:

  • accessing the CRS information from within an Xarray index such as xvec.GeometryIndex
  • eventually supporting (re-)projection, i.e., coordinate transformation and possibly data variable resampling implemented in third-party Xarray extensions, via something like .proj.to_crs(…)

We can decompose this issue into the following sub-problems:

  1. identify spatially-dependent coordinate and data variables in a dataset or dataarray
  2. in the case of a multi-CRS model (see Single vs. multi CRS datasets #2), link the variables to their corresponding CRS
  3. expose those links transparently to other Xarray extensions via a convenient API
  4. ensure those links are kept in sync when operating on Xarray objects
  5. trigger data transformation and/or resampling logic implemented elsewhere
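
As a rough illustration of sub-problem 1, one possible heuristic (an assumption, not something xproj prescribes) is to treat a variable as spatially dependent when it shares at least one dimension with a known spatial coordinate. The sketch below only relies on objects exposing a `dims` tuple, as Xarray variables do. `Var` and `spatially_dependent` are hypothetical names, and the stand-in class is there only so that the example runs without Xarray.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping


@dataclass
class Var:
    """Hypothetical stand-in for an Xarray variable: only `dims` is used."""

    dims: tuple[str, ...]


def spatially_dependent(
    variables: Mapping[str, Var], spatial_coords: Iterable[str]
) -> list[str]:
    """Return names of variables sharing a dimension with a spatial coordinate."""
    # Collect every dimension used by the declared spatial coordinates.
    spatial_dims = {dim for name in spatial_coords for dim in variables[name].dims}
    return [
        name
        for name, var in variables.items()
        if spatial_dims.intersection(var.dims)
    ]


variables = {
    "x": Var(("x",)),
    "y": Var(("y",)),
    "time": Var(("time",)),
    "temperature": Var(("time", "y", "x")),
    "station_count": Var(("time",)),
    "spatial_ref": Var(()),  # scalar CRS coordinate: no dimensions at all
}

dependent = spatially_dependent(variables, spatial_coords=["x", "y"])
print(dependent)
```

Note that the scalar CRS coordinate has no dimensions, so a purely dimension-based rule cannot tie it to anything. That is why sub-problem 2 needs some explicit link.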

The CF conventions address 1 and 2 with the grid_mapping attribute. Perhaps we could use the same approach here? My main concern is to avoid being too opinionated and going down a rabbit hole by enforcing an overly strict CF data model. A good source of inspiration is probably how rioxarray handles this.
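
For reference, the CF approach can be sketched as follows: a data variable names the variable carrying its CRS in a `grid_mapping` attribute, so resolving the links amounts to reading attributes. The function below only needs a mapping of names to objects with an `attrs` dict (which is what `Dataset.variables` provides in Xarray). `Var` and `crs_links` are hypothetical names, and only the simple form of `grid_mapping` (a single variable name) is handled, not the extended form that CF also allows.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass
class Var:
    """Hypothetical stand-in for an Xarray variable: only `attrs` is used."""

    attrs: dict[str, Any] = field(default_factory=dict)


def crs_links(variables: Mapping[str, Var]) -> dict[str, str]:
    """Map each variable with a `grid_mapping` attribute to the variable it names.

    Assumption: `grid_mapping` holds a single variable name (simple form).
    """
    links = {}
    for name, var in variables.items():
        target = var.attrs.get("grid_mapping")
        if target is None:
            continue  # not linked to any CRS
        if target not in variables:
            raise KeyError(
                f"{name!r} refers to missing grid mapping variable {target!r}"
            )
        links[name] = target
    return links


# A multi-CRS dataset: two data variables, each linked to a different CRS variable.
variables = {
    "temperature": Var({"grid_mapping": "crs_utm"}),
    "elevation": Var({"grid_mapping": "crs_wgs84"}),
    "time": Var(),
    "crs_utm": Var({"grid_mapping_name": "transverse_mercator"}),
    "crs_wgs84": Var({"grid_mapping_name": "latitude_longitude"}),
}

links = crs_links(variables)
print(links)
```

Because the link lives in plain attributes, it is easy to read but also easy to lose or leave dangling, which is what the `KeyError` branch guards against.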

One issue with 4 is that operations on Xarray objects can cause (meta)data loss, although this could probably be solved with the combined use of the .proj accessor and CRSIndex.

Maybe we can define some protocol for 5?
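
To make the idea of a protocol for 5 a bit more concrete, here is a minimal sketch using `typing.Protocol`. It is not the solution implemented in xproj: `SupportsToCRS`, `PointIndex` and `to_crs_all` are hypothetical names, and the toy index only records a CRS label instead of transforming coordinates.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsToCRS(Protocol):
    """Hypothetical protocol: an object able to transform itself to another CRS."""

    def to_crs(self, crs: Any) -> "SupportsToCRS": ...


class PointIndex:
    """Toy index holding a CRS label; no real coordinate transformation is done."""

    def __init__(self, crs: str):
        self.crs = crs

    def to_crs(self, crs: Any) -> "PointIndex":
        # A real implementation would transform its coordinates here.
        return PointIndex(str(crs))


def to_crs_all(indexes: dict[str, object], crs: Any) -> dict[str, object]:
    """Call `to_crs` on every index implementing the protocol; pass others through."""
    return {
        name: idx.to_crs(crs) if isinstance(idx, SupportsToCRS) else idx
        for name, idx in indexes.items()
    }


indexes = {"geometry": PointIndex("EPSG:4326"), "time": object()}
result = to_crs_all(indexes, "EPSG:3857")
print(result["geometry"].crs)
```

With this kind of design, the generic entry point only dispatches, while the transformation or resampling logic stays in the extension that owns the index.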

Thoughts or ideas very much appreciated!

@benbovy
Owner Author

benbovy commented Dec 12, 2024

A solution for 5 has been implemented in #6 (we might want to revisit that later, though).

@benbovy
Owner Author

benbovy commented Dec 12, 2024

Now I'm wondering whether xproj should take care of 1-4, or whether this should rather be the responsibility of other Xarray extensions like rioxarray or xvec.

xproj's scope could simply be that of a "lightweight, Xarray-compatible wrapper around pyproj". While a pyproj.crs.CRS object contains axis information, how those axes translate to Xarray dimensions and coordinates depends heavily on the use case, e.g.,

  • raster data cube (rioxarray): axis information is explicitly translated to x, y coordinates (either 1-dimensional or 2-dimensional)
  • vector data cube (xvec): axis information is encapsulated in the shapely.Geometry objects of 1-dimensional Xarray coordinate(s) (dimension = geometry column of a GeoPandas dataframe).
