Proposal: Update rasterio backend to store CRS/nodata information in standard locations. #2308
Comments
@snowman2 This should mean that |
I don't think so. The |
I wouldn't expect it to add |
That is strange. Was the variable put in |
This is the output of using xarray with a standard GOES-16 ABI L1B data file:
In [2]: import xarray as xr
In [3]: nc = xr.open_dataset('OR_ABI-L1b-RadF-M3C01_G16_s20181741200454_e20181741211221_c20181741211264.nc')
In [4]: nc.data_vars['Rad']
Out[4]:
<xarray.DataArray 'Rad' (y: 10848, x: 10848)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
    t        datetime64[ns] ...
  * y        (y) float32 0.151858 0.15183 0.151802 0.151774 0.151746 ...
  * x        (x) float32 -0.151858 -0.15183 -0.151802 -0.151774 -0.151746 ...
    y_image  float32 ...
    x_image  float32 ...
Attributes:
    long_name:              ABI L1b Radiances
    standard_name:          toa_outgoing_radiance_per_unit_wavelength
    sensor_band_bit_depth:  10
    valid_range:            [ 0 1022]
    units:                  W m-2 sr-1 um-1
    resolution:             y: 0.000028 rad x: 0.000028 rad
    grid_mapping:           goes_imager_projection
    cell_methods:           t: point area: point
    ancillary_variables:    DQF
In [5]: nc.data_vars['goes_imager_projection']
Out[5]:
<xarray.DataArray 'goes_imager_projection' ()>
array(-2147483647, dtype=int32)
Coordinates:
    t        datetime64[ns] ...
    y_image  float32 ...
    x_image  float32 ...
Attributes:
    long_name:                       GOES-R ABI fixed grid projection
    grid_mapping_name:               geostationary
    perspective_point_height:        35786023.0
    semi_major_axis:                 6378137.0
    semi_minor_axis:                 6356752.31414
    inverse_flattening:              298.2572221
    latitude_of_projection_origin:   0.0
    longitude_of_projection_origin:  -75.0
    sweep_angle_axis:                x
Again, I'm yet to be convinced that this logic should live in xarray, regardless of whether netCDF or rasterio is used as a backend. The job of xarray is to read the rasterio files in an "xarray way" (e.g. in order to leverage dask) and to expose the file attributes to the user, ideally with semantics/attribute names that are as close as possible to the underlying data model (rasterio). All the logic you describe could live in a dedicated library which would become a wrapper around xarray's |
@fmaussion I completely agree except now that all of this is being brought up I see why it may have been better to put the 'crs' in the coordinates of the DataArray returned by |
@fmaussion, I guess the main reason for the proposed change is to be able to read in the raster using Also, since "xarray.Dataset is an in-memory representation of a netCDF file", to me it makes more sense to store the data in a standard netCDF location that is consistent regardless of the backend. In doing so, other parts of xarray and libraries built on top of xarray won't be required to find the same data in multiple locations. Thanks for taking the time to consider this idea. I have used it quite a bit and it has made the datasets compatible across the various GIS tools I use (rasterio, GDAL, QGIS). It has definitely made my life easier, so I wanted to share what I have learned. But it is just an idea. I really appreciate the work of the xarray developers, and whatever y'all decide is fine with me. |
Feel free to re-open if this becomes something of interest. Thanks! |
Problem description
Currently, when data is read with xarray.open_rasterio, the CRS and nodata information is stored in the attributes of the DataArray. It would be nice to have them stored in standard locations so that other tools (rasterio, QGIS, GDAL) can find the information properly after dumping to a file with to_netcdf().
Proposed solutions
The nodata value should be loaded into _FillValue.
I propose that the CRS information be stored using the CF spatial_ref convention, as it is supported by the main open source GIS tools. To do so, you add the crs coordinate to the dataset/dataarray. Then you add the spatial_ref attribute to the crs coordinate, which stores the CRS as a WKT string. Next, you add the grid_mapping attribute to all associated variables, with the coordinate name crs as its value.
Here is an example of how it would look on a dataset:
Here is how the crs or spatial_ref coordinate variable would look:
And here is how it would look on the variables:
More information about this is in #2288.
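The three steps described above (a crs coordinate, a spatial_ref attribute on it, and a grid_mapping attribute on each data variable, plus _FillValue for nodata) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the issue: the helper name add_cf_crs, the variable name band1, the sample values, and the abbreviated WKT string are all hypothetical. To stay free of I/O and third-party imports, the dataset is modelled as a plain dictionary in the layout that xarray.Dataset.from_dict accepts. Whether xarray itself should keep _FillValue in attributes or in encoding is not settled by the proposal; the sketch simply uses an attribute.

```python
import copy

# Abbreviated WKT used purely as a placeholder; a real CRS string is longer.
PLACEHOLDER_WKT = (
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]]]'
)


def add_cf_crs(ds_dict, crs_wkt, nodata=None, crs_name="crs"):
    """Return a copy of ds_dict with CRS and nodata in CF-style locations.

    ds_dict is a plain dict with "coords" and "data_vars" entries, each
    variable being a dict with "dims", "data" and "attrs" keys.
    (Hypothetical helper, not part of xarray or rasterio.)
    """
    out = copy.deepcopy(ds_dict)

    # Step 1 and 2: a scalar coordinate named `crs_name` whose
    # spatial_ref attribute holds the CRS as a WKT string.
    out.setdefault("coords", {})[crs_name] = {
        "dims": (),
        "data": 0,
        "attrs": {"spatial_ref": crs_wkt},
    }

    # Step 3: every data variable points at that coordinate by name.
    for var in out.get("data_vars", {}).values():
        attrs = var.setdefault("attrs", {})
        attrs["grid_mapping"] = crs_name
        if nodata is not None:
            # The nodata value goes under the standard _FillValue name.
            attrs["_FillValue"] = nodata

    return out


# A tiny 2x2 raster with made-up values.
raster = {
    "coords": {
        "y": {"dims": ("y",), "data": [1.5, 0.5], "attrs": {}},
        "x": {"dims": ("x",), "data": [0.5, 1.5], "attrs": {}},
    },
    "data_vars": {
        "band1": {
            "dims": ("y", "x"),
            "data": [[1.0, 2.0], [3.0, -9999.0]],
            "attrs": {},
        },
    },
    "attrs": {},
}

result = add_cf_crs(raster, PLACEHOLDER_WKT, nodata=-9999.0)
print(result["coords"]["crs"]["attrs"])
print(result["data_vars"]["band1"]["attrs"])
```

If xarray is installed, passing result to xarray.Dataset.from_dict should give a Dataset with the same layout, which could then be written with to_netcdf(). That round trip is not exercised here.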