Extending xpublish with new routers #50

Closed
jhamman opened this issue Aug 13, 2020 · 18 comments
Labels
enhancement New feature or request

Comments

@jhamman
Contributor

jhamman commented Aug 13, 2020

In #29, @benbovy made it much easier to plug new routers into the Xpublish API. Currently, there are three default routers:

  1. Base - /keys, /dict, etc...
  2. Common - /versions and datasets/endpoints
  3. Zarr - Zarr HTTP Store compatible endpoint

My intent with this issue is to explore what other routers / protocols may be applicable in this framework. Two potential candidates that have been floated before are:

  1. OPeNDAP / ERDDAP
  2. WMS

Curious what others think.

cc @lsetiawan, @benbovy, @ocefpaf, @rsignell-usgs

@jhamman jhamman added the enhancement New feature or request label Aug 13, 2020
@willirath

willirath commented Aug 13, 2020 via email

@benbovy
Contributor

benbovy commented Aug 13, 2020

I think it is a good idea to have built-in support in xpublish of established protocols like WMS, OPeNDAP, etc. alongside new protocols like Zarr. Xpublish is now very flexible so it should be pretty easy to add those protocols.

It looks like Xarray-leaflet provides something close to a WMS, but tightly coupled to ipyleaflet.

@benbovy
Contributor

benbovy commented Aug 13, 2020

> It looks like Xarray-leaflet provides something close to a WMS, but tightly coupled to ipyleaflet.

Mmm, maybe closer to an "XYZ" service. I'm not very familiar with all variants of web mapping services.
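
For readers who, like the comment above, are unsure about the variants: an "XYZ" (slippy map) service addresses fixed Web Mercator tiles by zoom level, column and row, whereas WMS accepts an arbitrary bounding box. Below is a minimal sketch of the standard XYZ tile-index calculation in plain Python. The function name is illustrative only and is not taken from Xarray-leaflet or xpublish.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon_deg: float, lat_deg: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon=180 or latitudes beyond the Mercator limit stay in the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# The tile is then typically requested from a URL template such as /{z}/{x}/{y}.png
x, y = lonlat_to_tile(0.0, 0.0, 1)
```

Because the tile grid is fixed, a client only ever asks for a finite set of tiles per zoom level, which is what makes this kind of service easy to cache.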

@rsignell-usgs
Contributor

rsignell-usgs commented Aug 13, 2020

This is exciting!

OPeNDAP for sure would be awesome.

WMS is a map service that returns JPG, PNG images, not actual data, so is the thought to use Holoviz/Datashader to deliver the WMS images?

That seems like a natural fit, since the "rasterize" function already delivers geoimagery at a specified pixel size! holoviz/datashader#831
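
To make the "images at a specified pixel size" point concrete, here is a small sketch of the request a WMS router would have to answer. It only builds a WMS 1.3.0 GetMap URL with the standard query parameters; the base URL and layer name are hypothetical, and the rendering step (for example with Datashader) is not shown.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL for one layer.

    bbox is (minx, miny, maxx, maxy) in the units of `crs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string asks for the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,    # output image width in pixels
        "HEIGHT": height,  # output image height in pixels
        "FORMAT": fmt,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url(
    "http://localhost:9000/wms",
    "air_temperature",
    (-20037508.34, -20037508.34, 20037508.34, 20037508.34),
    256,
    256,
)
```

The server receives a bounding box plus WIDTH and HEIGHT, so it must resample the data onto exactly that pixel grid, which is the job a rasterizing step would do.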

@ocefpaf

ocefpaf commented Aug 13, 2020

I would start with OPeNDAP because there is a pure Python implementation of the server, so it would be easier. I have no idea what it would take to do the same for ERDDAP; the docs and specs are not easy to navigate.
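
As a rough illustration of what an OPeNDAP router would need to parse, the sketch below builds a DAP2-style constraint expression from Python slices. DAP2 hyperslabs use `[start:stride:stop]` with an inclusive stop, unlike Python. The helper names and the dataset URL are assumptions made for this example and do not come from any existing server implementation.

```python
def slice_to_dap2(s: slice, size: int) -> str:
    """Convert a Python slice over an axis of length `size` to a DAP2 hyperslab."""
    start, stop, step = s.indices(size)
    if step < 1 or stop <= start:
        raise ValueError("this sketch only handles non-empty, forward slices")
    # DAP2 stops are inclusive, Python stops are exclusive.
    return f"[{start}:{step}:{stop - 1}]"


def dap2_data_url(dataset_url: str, var_name: str, slices, shape) -> str:
    """Build a .dods data request for one variable and one hyperslab per axis."""
    hyperslab = "".join(slice_to_dap2(s, n) for s, n in zip(slices, shape))
    return f"{dataset_url}.dods?{var_name}{hyperslab}"


# Hypothetical dataset URL and variable name.
url = dap2_data_url(
    "http://localhost:9000/opendap/dataset",
    "sst",
    (slice(0, 1), slice(10, 21, 2)),
    (12, 180),
)
```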

@benbovy
Contributor

benbovy commented Sep 17, 2020

Other ideas (formats): cf-json, netcdf-ld, covjson.

There's a good discussion + links here: pangeo-data/pangeo-datastore#3

I started playing with covjson and xpublish here: ESM-VFC/esm-vfc-api-demo#11

@benbovy
Contributor

benbovy commented Dec 7, 2020

I just had a look at Titiler, which already has a great set of features for serving geospatial raster data: multiple tile formats (raw data or image), multiple projections, wmts, etc.

It would be great if we could avoid reinventing the wheel here, i.e., depend on Titiler to create web map tiles dynamically from xarray Datasets and serve it via Xpublish!

@vincentsarago @kylebarron -- are Titiler's router factory classes part of the public API? Do you think it would be feasible to subclass and/or adapt it so that we can replace the URL (BaseFactory.path_dependency) by our dependencies to access the xarray Dataset being served? And/or replace COGReader with a custom reader/backend?

@vincentsarago

@benbovy thanks for the interest. To be honest, TiTiler is still in Alpha because it depends on 2 alpha/beta/rc modules: rio-tiler/cogeo-mosaic. I hope to publish the final version of those packages before the end of the year, but for now I just want users to understand this.

> are Titiler's router factory classes part of the public API?

Yes

> Do you think it would be feasible to subclass and/or adapt it so that we can replace the URL (BaseFactory.path_dependency) by our dependencies to access the xarray Dataset being served?

I'm not sure I understand, but yes, we built TiTiler with modularity in mind.

> And/or replace COGReader with a custom reader/backend?

This is the first goal of the TilerFactory https://github.com/developmentseed/titiler/blob/master/titiler/endpoints/stac.py#L73-L78

https://developmentseed.org/titiler/concepts/customization/

@kylebarron

kylebarron commented Dec 7, 2020

We've had discussions somewhere about Zarr vs Numpy for sending uncompressed data back to the client: zarr-developers/community#37. I've thought about making a PR to explicitly add Zarr support to titiler, but haven't had time to pursue that idea.

@benbovy
Contributor

benbovy commented Dec 7, 2020

Thanks for the quick answers!

> TiTiler is still in Alpha

That's perfectly fine! Xpublish is at an early stage of development too, and it is also built with modularity in mind so that we can easily experiment with new functionalities (pluggable routers).

> We've had discussions somewhere about Zarr vs Numpy for sending uncompressed data back to the client. I've thought about making a PR to explicitly add Zarr support to titiler, but haven't had time to pursue that idea

That would be great, although here I'm thinking more about leveraging Titiler in order to extend Xpublish with "web mapping friendly" API endpoints (i.e., OGC-compliant endpoints, morecantile generated tiles, images/colormaps, etc.). Those endpoints would work with any data format that can be loaded with Xarray, and might also co-exist with other, non-geospatial API endpoints (e.g., serving raw multi-dimensional data using various protocols like OPeNDAP, Zarr, etc.).

I need to look at Titiler more in depth. My understanding is that a Titiler router factory relies on a dataset path (PathParams) + reader for all its endpoints, whereas an xpublish router relies on a get_dataset dependency (that directly returns an xarray.Dataset object) for all its endpoints. So I guess I need to figure out how to adapt one to the other.
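
One possible shape for "adapting one to the other" is a thin adapter that gives a no-argument dataset provider the path-based reader signature. The sketch below is library-free on purpose: `PathBasedEndpoints` and `make_reader_from_dependency` are hypothetical stand-ins, not TiTiler or xpublish classes, and a plain dict plays the role of the xarray.Dataset.

```python
from typing import Any, Callable, Dict


class PathBasedEndpoints:
    """Stand-in for a router factory whose endpoints open data from a path."""

    def __init__(self, reader: Callable[[str], Any]):
        self.reader = reader

    def info(self, path: str) -> Dict[str, Any]:
        data = self.reader(path)
        return {"variables": sorted(data)}


def make_reader_from_dependency(get_dataset: Callable[[], Any]) -> Callable[[str], Any]:
    """Adapt a no-argument dataset provider to the path-based reader signature."""

    def reader(_path: str) -> Any:
        # The path is ignored: the dataset being served is already known.
        return get_dataset()

    return reader


def get_dataset() -> Dict[str, list]:
    """Hypothetical provider; a dict stands in for an xarray.Dataset."""
    return {"temp": [1.0, 2.0], "salt": [35.0, 34.9]}


endpoints = PathBasedEndpoints(make_reader_from_dependency(get_dataset))
result = endpoints.info("ignored-path")
```

Whether the real factories can be wrapped this simply depends on how tightly their endpoints are tied to the path parameter, which is the open question in the comment above.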

@rsignell-usgs
Contributor

@kylebarron, dynamic tile services such as titiler make good use of the overviews from COGs, which we don't have in Xarray, right?

I remember hearing something about overviews being discussed for Zarr, but couldn't find that discussion...

@kylebarron

Yes, it's much faster to render lower zoom levels when you have overviews, so that you can read less data instead of downsampling from full-resolution data. I've also seen discussions about multi-resolution Zarr datasets, but I also don't know where that was.
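
To illustrate why overviews help, here is a minimal sketch that precomputes a pyramid of 2x2-averaged copies of a grid, so that a low zoom level can read a small array instead of downsampling the full-resolution one on each request. It uses plain nested lists and simple block averaging, which is an assumption made for brevity; real COG overviews may use other resampling methods.

```python
def downsample_2x(grid):
    """Average non-overlapping 2x2 blocks; a trailing odd row or column is dropped."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_overviews(grid):
    """Return [full resolution, 1/2, 1/4, ...] until an axis can no longer be halved."""
    levels = [grid]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample_2x(levels[-1]))
    return levels


full = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pyramid = build_overviews(full)
```

Each level holds a quarter of the values of the previous one, so the whole pyramid costs roughly one third more storage than the full-resolution grid alone.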

@rsignell-usgs
Contributor

Found a good entry point into the zarr tile-server discussion with this comment from @rabernat: zarr-developers/community#37 (comment)

@benbovy
Contributor

benbovy commented Jun 24, 2021

@TomAugspurger I've just had a look at xstac after reading your post on Pangeo's Discourse. I think it would be great to have a STAC API router built in xpublish! It could be an optional router... You already implemented pydantic models in xstac so it should be pretty straightforward to eventually integrate it in xpublish.

@TomAugspurger

Thanks @benbovy. I've been trying to work out what exactly the relationship between STAC, Zarr, and xpublish would be. My (uninformed) hypothesis was that xpublish could be used to serve STAC Items (a chunk of an Xarray dataset, both the data from the Zarr chunk + the coordinates). If you have any thoughts on what use-cases this would serve and how it could be done I'd love to hear them.

@benbovy
Contributor

benbovy commented Jul 22, 2021

@TomAugspurger I don't have specific use-cases in mind yet; lately I've just been following the development of STAC specs and related tools with great interest.

I think that xpublish could be used to serve an Xarray Dataset using either STAC Item-level or Collection-level asset(s), as you explain on Pangeo's Discourse. Or both?
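
For reference, an Item-level description could look roughly like the sketch below, which builds a minimal STAC Item as a plain dict with one data asset. Only the core Item fields are filled in; the function name, asset key, identifier and href are hypothetical, and this is not how xstac generates its output.

```python
def minimal_stac_item(item_id, bbox, datetime_iso, asset_href):
    """Build a minimal STAC Item dict with a rectangular footprint and one asset."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last positions are identical.
            "coordinates": [[
                [west, south],
                [east, south],
                [east, north],
                [west, north],
                [west, south],
            ]],
        },
        "properties": {"datetime": datetime_iso},
        "links": [],
        "assets": {"data": {"href": asset_href, "roles": ["data"]}},
    }


# Hypothetical identifier and endpoint, for illustration only.
item = minimal_stac_item(
    "example-dataset",
    (-180.0, -90.0, 180.0, 90.0),
    "2021-07-22T00:00:00Z",
    "http://localhost:9000/zarr",
)
```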

I mainly see xpublish as a "Swiss Army Knife", flexible backend solution where any data source supported by Xarray (NetCDF, Zarr, GRIB, etc.) could be dynamically served via one or more standardized APIs (STAC, WMS, Zarr, etc.) "just" by doing

import xarray as xr
import xpublish

ds = xr.load_dataset(...)
ds.rest.serve()

The served data chunks (+ metadata) may also be dynamically generated independently of the original data chunks (if any) so it could fit a broader range of front-end applications, even though this wouldn't be optimal in all cases.
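
As a sketch of what "generated independently of the original data chunks" can mean for a Zarr-style endpoint, the helper below enumerates Zarr v2 chunk keys (dot-separated indices such as `1.0`) for an arbitrary serving chunk shape, together with the array region each key covers. The function name is made up for this example and it is not part of xpublish.

```python
import itertools
import math


def chunk_grid(shape, chunks):
    """Map each Zarr v2 chunk key to the slices of the array it covers."""
    counts = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    grid = {}
    for idx in itertools.product(*(range(n) for n in counts)):
        key = ".".join(str(i) for i in idx)
        # Edge chunks are clipped to the array bounds here.
        grid[key] = tuple(
            slice(i * c, min((i + 1) * c, s))
            for i, c, s in zip(idx, chunks, shape)
        )
    return grid


# A (5, 4) array served in (2, 4) chunks, whatever its on-disk chunking is.
grid = chunk_grid((5, 4), (2, 4))
```

A server could answer a request for key `2.0` by slicing the in-memory dataset with `grid["2.0"]` and encoding the result on the fly. Note that a real Zarr store pads edge chunks to the full chunk shape, which this sketch leaves out.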

This would be a useful tool, complementary to statically generated data catalogs.

@abuddenb

abuddenb commented Dec 9, 2022

@rsignell-usgs I think this is a good idea.

@jhamman
Contributor Author

jhamman commented Jul 30, 2024

This issue has spawned a lot of development, including the App Router Plugin API and a growing list of plugins:

and more in development.

@jhamman jhamman closed this as completed Jul 30, 2024
9 participants