-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Visualizing cloud-optimized geotiffs #370
Comments
@jbednar, what would it take to get this on the docket? |
I'm not sure. From pangeo-data/pangeo-example-notebooks#21 it sounds like there has already been some effort at improving access to COGs for xarray usage? Is any of that simply usable with geoviews already, i.e. to use rasterio/gdal/etc. to make a lazy xarray that GeoViews can use without further work? I haven't looked closely at what's been done and what remains... |
@jbednar and @rsignell-usgs the linked issue above is one of many where we've speculated on the ability to make xarray+dask+holoviews more efficient working with COGs . there hasn't been a lot of concentrated effort. I think a on-the-fly WMTS layer is a special case, and there are some lower-hanging fruit improvements that could be made that i'll try to clarify below. Consider a single COG (maybe we open a separate issue in holoviews/hvplot): import os, rioxarray, hvplot.xarray
os.environ['GDAL_DISABLE_READDIR_ON_OPEN']='EMPTY_DIR'
os.environ['AWS_NO_SIGN_REQUEST']='YES'
url = 'https://sentinel-s1-rtc-indigo.s3.us-west-2.amazonaws.com/tiles/RTC/1/IW/10/T/ET/2020/S1B_20200106_10TET_ASC/Gamma0_VV.tif'
da = rioxarray.open_rasterio(url)
da.hvplot.image(rasterize=True, cmap='gray', clim=(0,0.4)) This is amazing functionality because as we zoom into the plot we get updating resolution on-the-fly. I could be wrong, but I believe the entire .tif is read into memory and then rasterized according to zoom level? So effectively running something like da = rioxarray.open_rasterio(url, overview_level=3)
da.hvplot.image(rasterize=True, cmap='gray', clim=(0,0.4)) It's a complicated stack of pieces coming together here. Ultimately the TIF is read with rasterio.open() and GDAL/libtiff which is clever about only reading from overviews depending on the read request. This code block is equivalent to the above, returning a numpy array. import rasterio
with rasterio.open(url) as src:
numpy_array = src.read(out_shape=(src.height//16, src.width//16)) Information on this particular test file: with rasterio.open(url) as src:
print(src.profile)
overview_factors = [src.overviews(i) for i in src.indexes][0]
overview_levels = list(range(len(overview_factors)))
print('Overview levels: ', overview_levels)
print('Overview factors: ', overview_factors)
In summary, it seems like there is room to improve the rasterize=True efficiency when overviews are present in the underlaying data. I can think of a few approaches:
|
Is your feature request related to a problem? Please describe.
Cloud-Optimized Geotiffs are quickly becoming a community standard for storing geospatial raster data on the Cloud. These COGs store the data in small tiles, and have built-in overviews, which makes extracting data fast (especially when using parallel processing). And the cogeo tools make it easy to create and validate COGs.
There is a AWS-lambda service that can create a WMTS tile service on-the-fly from any COG on object storage, and this works great in geoviews: https://nbviewer.jupyter.org/gist/jsignell/9980e640536180b44fb59e7812208e27/cog_tiling_small_panel.ipynb
But it would be way cooler for geoviews to be able to extract the chunks a render them in parallel using dask, so that on-the-fly service wouldn't even be needed.
Describe the solution you'd like
I'd like some tools to extract and visualize the correct chunks from the most appropriate overviews in a COG, using Dask to parallelize the effort. This would allow us to use geoviews to effectively and interactively browse any COG to full resolution in the browser without additional services.
Describe alternatives you've considered
We can use the "on-the-fly" WMTS service, but someone needs to run this and it also costs $$.
It would be way cooler to have this functionality native to geoviews!
The text was updated successfully, but these errors were encountered: