Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataset Grid not understood #100

Open
eotp opened this issue May 11, 2018 · 27 comments
Open

dataset Grid not understood #100

eotp opened this issue May 11, 2018 · 27 comments

Comments

@eotp
Copy link

eotp commented May 11, 2018

I am trying to apply the .subset function on a xarray.core.dataset.Dataset object. However, I get RuntimeError: dataset Grid not understood. I am able to reproduce the example from the documentation (here), however, when providing another data set I get the error.

Here is a snippet of my data: test_data.zip

Here my code:

import xarray as xr
import salem

ds = xr.open_dataset("./data/test_data.nc")
ds
<xarray.Dataset>
Dimensions:                 (nlat: 85, nlon: 63, time: 3)
Coordinates:
  * nlon                    (nlon) float32 -82.375 -82.125 -81.875 -81.625 ...
  * nlat                    (nlat) float32 -20.125 -19.875 -19.625 -19.375 ...
  * time                    (time) datetime64[ns] 2004-12-31 2005-01-31 ...
Data variables:
    precipitation           (time, nlat, nlon) float32 ...
    relativeError           (time, nlat, nlon) float32 ...
    gaugeRelativeWeighting  (time, nlat, nlon) int32 ...
Attributes:
    Grid.GridHeader:  BinMethod=ARITHMETIC_MEAN;\nRegistration=CENTER;\nLatit...
    FileHeader:       AlgorithmID=3B43;\nAlgorithmVersion=3B43_7.0;\nFileName...
    FileInfo:         DataFormatVersion=m;\nTKCodeBuildVersion=1;\nMetadataVe...
    GridHeader:       BinMethod=ARITHMETIC_MEAN;\nRegistration=CENTER;\nLatit...
    history:          2018-05-08 18:27:22 GMT Hyrax-1.13.4 https://disc2.gesd...
type(ds)
xarray.core.dataset.Dataset
ds_subset = ds.salem.subset(corners=((-80, -15.), (-75., -7.)), crs=salem.wgs84)
RuntimeError: dataset Grid not understood.

Any advice?

@fmaussion
Copy link
Owner

The name of your lon/lat coordinates wasn't in the list of accepted coords.

Can you try latest master and let me know if that works?

pip install git+https://github.com/fmaussion/salem.git

@Irene-GM
Copy link

Hi,

I installed salem with the current master branch as per today, but I am also facing the same Grid not understood error.

To give some context: I have a set of raster files in Lambert Conformal Conic (LCC) coordinate reference system. Since each file is large and has many bands, I am interested in cropping the extent to only my study area to reduce the size of the entire collection. I am really interested in using salem for this task, because it seems that it ships with a subsetting utility salem.subset() which allows to specify the coordinates of the bounding box (and all the transformations/warping that I tried with gdal failed). However, I also hit the same error as the user that opened this issue. I am able to open the file with xarray, and even I can open it with GDAL and plot it with matplotlib, so I was assuming its a well-formed file.

Here I am pasting the netcdf summary after opening it with xarray.

<xarray.Dataset>
Dimensions:            (bnds: 2, time: 745, x: 217, y: 234)
Coordinates:
  * y                  (y) float32 0.0 2500.0 5000.0 7500.0 10000.0 12500.0 ...
  * x                  (x) float32 0.0 2500.0 5000.0 7500.0 10000.0 12500.0 ...
  * time               (time) datetime64[ns] 2013-12-01 2013-12-01T00:30:00 ...
    lon                (y, x) float32 ...
    lat                (y, x) float32 ...
    height             float32 ...
Dimensions without coordinates: bnds
Data variables:
    Lambert_Conformal  |S1 ...
    date               (time) int32 ...
    hms                (time) int32 ...
    time_bnds          (time, bnds) float64 ...
    ugs                (time, y, x) float32 ...
Attributes:
    Conventions:       CF-1.4
    institute_id:      ***** 
    model_id:          ******
    experiment_id:     ******** 
    domain:            NETHERLANDS.NL
    frequency:         
    driving_model_id:  ****
    creation_date:     Tue Feb 13 06:51:45 2018
    title:             *****
    comment:           Created with gl/xtool

Please, is there anything I can do to fix this grid problem?
Thanks!

@fmaussion
Copy link
Owner

Hi @Irene-GM , thanks for the report.

In order for salem to understand the grid of your file, it must be able to parse the projection information out of the file's metadata. Your file seems quite well defined, so it should be possible. Would you mind sharing a small temporal subset out of it with me?

Alternatively, you could define the projection yourself and and give it at runtime as explained in the docs: https://salem.readthedocs.io/en/latest/xarray_acc.html#custom-projection

@fmaussion fmaussion reopened this Sep 27, 2018
@Irene-GM
Copy link

Hi @fmaussion , thanks for your quick response!

Unfortunately, i cannot share the file with you, since it is experimental data. I tried the trick you suggested in the link and it seems it is not working:

ds = xr.open_dataset(path_in_nc)
ds.attrs['pyproj_srs'] = '+proj=lcc +lat_1=51.967 +lat_0=51.967 +lon_0=4.9 +k_0=1 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs'

# Since I have many bands, I am extracting only one
ugs = ds.ugs.isel(time=225) 

# Shows in matplotlib, FYI
ugs.salem.quick_map(varname="ugs")

Note that :

  • This netcdf file does not have by default this "pyproj_srs", that is why I found here the SRS of Lambert Conformal Conic 1SP and added this parameter manually

  • Regarding the "lat_1", "lat_0", and "lon_0" parameters of the LCC-1SP SRS, I checked the metadata in QGIS and updated the parameters accordingly. I hope the bounds are correctly defined, but I am not really sure (if you spot anything poorly defined just let me know)

Lambert_Conformal#grid_mapping_name=lambert_conformal_conic
Lambert_Conformal#latitude_of_projection_origin=51.967
Lambert_Conformal#longitude_of_central_meridian=4.9
Lambert_Conformal#standard_parallel=52.5

So I keep getting the RuntimeError: dataset Grid not understood raised by the program salem/sio.py

Hope this helps,
Thanks!

@fmaussion
Copy link
Owner

What happens if you do:

ds = xr.open_dataset(path_in_nc)

# Since I have many bands, I am extracting only one
ugs = ds.ugs.isel(time=225) 
ugs.attrs['pyproj_srs'] = '+proj=lcc +lat_1=51.967 +lat_0=51.967 +lon_0=4.9 +k_0=1 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs'

# Shows in matplotlib, FYI
ugs.salem.quick_map(varname="ugs")

?

@Irene-GM
Copy link

Yes! It worked!

Now I only need to figure out why the raster is shifted with respect to the outline of Europe. I guess it has to do with the lat/lon parameters specified in the SRS, so I should find the correct coordinates of the bounding box in LCC-1SP (any suggestion is welcome, btw). Then I can try the subsetting.

Thanks a lot for finding this workaround for me, much appreciated :-)

@fmaussion
Copy link
Owner

fmaussion commented Sep 27, 2018

For the attribute setting problem, see: https://salem.readthedocs.io/en/latest/xarray_acc.html#keeping-track-of-attributes Basically you are going to have to keep track of the attributes, a behavior I would like to change in a future version of salem (#62)

Now regarding the shift, there are two things you could look at:

  • the projection parameters themselves
  • the x, y coordinates, which are understood by salem as being easting/northings in the model projection. They look OK in your file, although I am a little surprised they start at 0.

In order to check that salem understands your data well, there is a very simple check:

import numpy as np
salem_lon, salem_lat = ugs.salem.grid.ll_coordinates
# Check that they are the same
np.testing.assert_almost_equal(salem_lon, ugs.lon, atol=1e-4)
np.testing.assert_almost_equal(salem_lat, ugs.lat, atol=1e-4)

The lon/lat coordinates should be similar at the 3rd or 4th decimal

@fmaussion
Copy link
Owner

If you want me to have a look, please share your data (modified if you want). For example:

ds = xr.open_dataset(path_in_nc)
ds = ds[['lon', 'lat']]
ds.to_netcdf('simplified_data.nc')

would be enough for me and won't hold any sensitive data other than the coordinates...

@Craftflow
Copy link

Craftflow commented Jan 4, 2019

Hello

I am also facing the Grid not understood error. I am using a netcdf file of weather forecasts. I subset my file to only precipitation observations and attempt to set the projection as suggested above. However my script is still failing with the error.

I would include the file but even after compression it is 160mb. I would subset and re-save the file but I am uncertain whether my subsetting may be causing the problem.

Additional note, each file I have is only for one hour of forecasts so the time dimension is purposefully absent.

Here is the xarray output:

`
ds = xr.open_dataset(cdf_file, decode_coords=True)

ds
<xarray.DataArray 'TMP_P0_L1_GLC0' (ygrid_0: 1059, xgrid_0: 1799)>
[1905141 values with dtype=float32]
Coordinates:
gridlat_0 (ygrid_0, xgrid_0) float32 ...
gridlon_0 (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: ygrid_0, xgrid_0
Attributes:
initial_time: 09/20/2014 (06:00)
forecast_time_units: hours
forecast_time: 1
level: 0.0
level_type: Ground or water surface
parameter_template_discipline_category_number: [0 0 0 0]
parameter_discipline_and_category: Meteorological products, ...
grid_type: Lambert Conformal can be ...
units: K
long_name: Temperature
production_status: Operational products
center: US National Weather Servi...
pyproj_srs: +proj=lcc +lat_1=38.2 +la...
`

Here is the code that produced the error.

`
ds_srs = '+proj=lcc +lat_1=38.2 +lat_0=38.2 +lon_0=-121 +k_0=1 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs'

ds_sub = ds['TMP_P0_L1_GLC0']
ds_sub.attrs['pyproj_srs'] = ds_srs
ds_sub.salem.quick_map()
`

I would appreciate any insight you may have. Thank you!

@fmaussion
Copy link
Owner

Hi,

one of the reasons salem can't read your data is that the dimension and coordinate names are not standard (gridlat_0, why a 0?), i.e. I would have to hardcode these names in salem so that it can understand it (or you'd have to rename them to more standard names using xarray beforehand). This would be OK, but the actual problem with your data is following:

Dimensions without coordinates: ygrid_0, xgrid_0

Salem works only with rectilinear grids in the projection space, i.e. it needs the northings and eastings of your grid. Currently you only have the 2D lons and lats of your grid -> with these you can plot things on a map with cartopy or resample them with xesmf, but you can't use salem.

I have written more info about these things scattered on github (pangeo-data/pangeo#356 (comment) or geoxarray/geoxarray#3) but I will write them down in the salem documentation for future reference

@Craftflow
Copy link

Thank you for the quick response!

I am also interested in why the grid lat and longs are coded in a non-standard fashion. I have made no changes to the dataset. In fact there appears to be a couple non-standard aspects to this file that are causing me problems. However, with the current government shutdown my source for this information is unavailable.

I am sorry for my lack of experience in working with GIS data, such that I neglected to provide adequate context and information. If my understanding of your response is correct then I do believe this dataset has the northings and eastings, i just failed to include them earlier as they were not understood by xarray and so not reported.

The below information comes from a simple ncdump command output.

'
dimensions:
ygrid_0 = 1059 ;
xgrid_0 = 1799 ;

Corner Coordinates: #from the xgrid and ygrid dimensions.
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 1059.0)
Upper Right ( 1799.0, 0.0)
Lower Right ( 1799.0, 1059.0)
Center ( 899.5, 529.5)

Latitude Longitude of Grid Corners:
gridlat_0:corners = 21.138f, 21.14067f, 47.84238f, 47.83844f
gridlon_0:corners = -122.72f, -72.29019f, -60.91784f, -134.0961f

Geolocation:
LINE_OFFSET=0
LINE_STEP=1
PIXEL_OFFSET=0
PIXEL_STEP=1
SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]
X_BAND=1
X_DATASET=NETCDF:"1426306000001.nc":gridlon_0
Y_BAND=1
Y_DATASET=NETCDF:"1426306000001.nc":gridlat_0
`

Additionally, I know that each gridpoint is exactly 3km by 3km. I also included the geolocation information as I am unsure whether it may be relevant to this discussion.

Given this information does it now make sense to try to utilize Salem?

@hydrogeohc
Copy link

I have the same issue. Could I ask you @fmaussion to check my data?

@sbr-5260
Copy link

I am also having the "dataset grid not understood" error.

I have many time dimensions, and I am trying to create a composite map of the PBLH height, using this code. As far as I can see, the coordinates have remained the same from the original dataset, so I am not sure why I am getting this error.

twentyfour = ds.PBLH.groupby('time.hour').mean().isel(hour = 23)

<xarray.DataArray 'PBLH' (south_north: 5, west_east: 5)>
dask.array<getitem, shape=(5, 5), dtype=float32, chunksize=(5, 5), chunktype=numpy.ndarray>
Coordinates:
lat (south_north, west_east) float32 -37.14 -37.14 ... -37.13
lon (south_north, west_east) float32 174.5 174.5 ... 174.5 174.5

  • west_east (west_east) float64 -2.595e+04 -2.565e+04 ... -2.475e+04
  • south_north (south_north) float64 -2.145e+04 -2.115e+04 ... -2.025e+04
    keep_attrs bool True
    hour int64 23

@fmaussion
Copy link
Owner

@sbr-5260 how did you create ds? it's likely that the ds.PBLH.groupby('time.hour').mean().isel(hour = 23) operation is discarding the attributes that salem needs to construct the grid.

@sbr-5260
Copy link

Hi fmaussion,

Here is my ds dataset:

<xarray.Dataset>
Dimensions: (south_north: 168, west_east: 168, time: 109, bottom_top: 40, soil_layers: 4)
Coordinates:
lat (south_north, west_east) float32 -37.14 -37.14 ... -36.67
lon (south_north, west_east) float32 174.5 174.5 ... 175.1 175.1

  • time (time) datetime64[ns] 2019-06-26T12:00:00 ... 2019-07-01
  • west_east (west_east) float64 -2.595e+04 -2.565e+04 ... 2.415e+04
  • south_north (south_north) float64 -2.145e+04 -2.115e+04 ... 2.865e+04
    keep_attrs bool True
    Dimensions without coordinates: bottom_top, soil_layers
    Data variables: (12/145)
    LU_INDEX (time, south_north, west_east) float32 dask.array<chunksize=(1, 168, 168), meta=np.ndarray>
    ZNU (time, bottom_top) float32 dask.array<chunksize=(1, 40), meta=np.ndarray>
    ZNW (time, bottom_top) float32 dask.array<chunksize=(1, 40), meta=np.ndarray>
    ZS (time, soil_layers) float32 dask.array<chunksize=(1, 4), meta=np.ndarray>
    DZS (time, soil_layers) float32 dask.array<chunksize=(1, 4), meta=np.ndarray>
    VAR_SSO (time, south_north, west_east) float32 dask.array<chunksize=(1, 168, 168), meta=np.ndarray>
    ... ...
    pressure (time, bottom_top, south_north, west_east) float32 dask.array<chunksize=(1, 40, 168, 168), meta=np.ndarray>
    temperature (time, bottom_top, south_north, west_east) float32 dask.array<chunksize=(1, 40, 168, 168), meta=np.ndarray>
    wind_speed (time, bottom_top, south_north, west_east) float32 dask.array<chunksize=(1, 40, 168, 168), meta=np.ndarray>
    wind_speed_10m (time, south_north, west_east) float32 dask.array<chunksize=(1, 168, 168), meta=np.ndarray>
    wind_dir_10m (time, south_north, west_east) float32 dask.array<chunksize=(1, 168, 168), meta=np.ndarray>
    wind_dir (time, bottom_top, south_north, west_east) float32 dask.array<chunksize=(1, 40, 168, 168), meta=np.ndarray>
    Attributes: (12/87)
    TITLE: OUTPUT FROM WRF V4.2.1 MODEL
    START_DATE: 2019-06-25_12:00:00
    SIMULATION_START_DATE: 2019-06-25_12:00:00
    WEST-EAST_GRID_DIMENSION: 169
    SOUTH-NORTH_GRID_DIMENSION: 169
    BOTTOM-TOP_GRID_DIMENSION: 41
    ... ...
    ISICE: 22
    ISURBAN: 19
    ISOILWATER: 14
    HYBRID_OPT: 2
    ETAC: 0.2
    pyproj_srs: +proj=lcc +lat_0=-36.9370002746582 +lon_...

Which of the attributes are needed for constructing the grid? Maybe I could add them back in after the operation.

Or open to alternate methods if you have and suggestions!

Thanks :)

@fmaussion
Copy link
Owner

@sbr-5260 how did you create ds?

@fmaussion
Copy link
Owner

@sbr-5260
Copy link

I am using the salem.open_mf_wrf_dataset to get the dataset, and the dataset "ds" plots fine until my function-- looks like its the lost attributes that are causing it.

Thanks so much! I just need to figure out how to save them or add them back in then.

@fmaussion
Copy link
Owner

pyproj_srs is the attribute you want to keep and/or copy after operations.

@sbr-5260
Copy link

Got it working! Thanks :)

@singledoggy
Copy link

Do you have a plan to understand CF projection data?It seems weird to set the projection in code other than files for me. Or did it already supported but I missed it?

@fmaussion
Copy link
Owner

When salem was developed there was no system to interpret CF conventions, and even less files using them ;-). For example WRF is still not using them AFAIK.

salem is on "maintainance mode" since a few years now and I don't really think there is much reason to develop it further given all the great tools around (metpy, cartopy, etc). But if someone wants to spend some time on it I am all for it! I am still using salem regularly and I think others do.

@weirdoyang721
Copy link

weirdoyang721 commented Oct 11, 2022

My problem is same as that raised by Irene-GM. Here's my data created by salem.open_xr_datasat and pr = ds['variable'].isel(time=0).

<xarray.DataArray 'precip' (height: 1, rlat: 240, rlon: 262)>
array([[[6.673193e-05, 1.738828e-04, ..., 7.269982e-05, 6.908291e-05],
        [2.103231e-04, 2.199079e-04, ..., 7.857729e-05, 7.541250e-05],
        ...,
        [2.441412e-05, 2.839271e-05, ..., 3.255216e-06, 3.436061e-06],
        [2.396200e-05, 2.269609e-05, ..., 3.255216e-06, 3.345638e-06]]],
      dtype=float32)
Coordinates:
    time     datetime64[ns] 1979-01-01
  * height   (height) float64 0.0
  * rlat     (rlat) float64 -30.0 -29.75 -29.5 -29.25 ... 29.0 29.25 29.5 29.75
  * rlon     (rlon) float64 -32.75 -32.5 -32.25 -32.0 ... 31.75 32.0 32.25 32.5
    lat      (rlat, rlon) float64 -46.75 -46.92 -47.09 ... -47.42 -47.24 -47.07
    lon      (rlat, rlon) float64 -126.9 -127.1 -127.3 ... 52.84 53.03 53.23
Attributes:
    standard_name:  precipitation_flux
    long_name:      Total Precipitative Flux
    units:          kg m-2 s-1
    cell_methods:   time: mean (comment: 24-hr averaged values)
    grid_mapping:   rotated_latitude_longitude
    pyproj_srs:     +proj=longlat +datum=WGS84 +no_defs

Context: for the first running, the error "dataset Grid not undersantand" came out, then I add "rlon" and "rlat" to the valid_names list in salem/utils.py. After that, the code works fine.
However, I may have done something wrong so that salem cannot understand the data grid. pr.salem.roi(shape=shp) gives an array full of NaN, and pr.salem.quick_map gives a figure likes this
image
I can see the the outlines of Antarctica (which is the domain of the shapefile), but the extent of the map is created from rlat and rlon.

I also tried this, the result shows that the two are inconsistent.

salem_lon, salem_lat = pr.salem.grid.ll_coordinates
np.testing.assert_almost_equal(salem_lon,pr.lon, atol=1e-4)
np.testing.assert_almost_equal(salem_lat, pr.lat, atol=1e-4)

Any suggestions would be appreciated, thanks!

@fmaussion
Copy link
Owner

@weirdoyang721 your dataset is probably in rotated pole projection. You need to tell salem how to interpret this. One example of implementation for the MetUM model can be found here: https://github.com/fmaussion/salem/blob/master/salem/sio.py#L1042

@weirdoyang721
Copy link

Hi @fmaussion , yes, the dataset is in rotated pole projection. Here's the information of the rotated map:

In   [1]:ds['rotated_latitude_longitude']
Out[1]: 
<xarray.DataArray 'rotated_latitude_longitude' ()>
array(-2147483647)
Attributes:
    grid_mapping_name:          rotated_latitude_longitude
    grid_north_pole_latitude:   -180.0
    grid_north_pole_longitude:  -170.0
    north_pole_grid_longitude:  0.0

I tried to add a function def open_racmo_dataset followed that for MetUM and alter the srs as:

    srs = ('+ellps=WGS84 +proj=ob_tran +o_proj=latlon '
           '+o_lat_p={o_lat_p} +o_lon_p={o_lon_p} +lon_0={lon_0}')

and the params:

    params = {
        'o_lon_p': pole_longitude,
        'o_lat_p': pole_latitude,
        'lon_0': central_rotated_longitude,
    }

comes out pyproj_srs = '+ellps=WGS84 +proj=ob_tran +o_proj=latlon +o_lat_p=-180.0 +o_lon_p=-170.0 +lon_0=0.0'
Run code:

ds = salem.open_racmo_dataset(r"file.nc")
ds_test=ds['precip'].isel(time=0).squeeze()
ds_teat.salem.quick_map()

error dataset grid not understans still prompted. I'm a layman to GIS, I'm not sure the manually added srs is correct. Is the error caused by the undefined projection?

In addition, if runds_test=ds['precip'].isel(time=0).resample(time='1Y').sum(), the attribute pyproj_srs could not be traced. However, if open the data with salem.open_dataset, the attribute is not lost in calculations.

Hope for your help! Thanks :)

@fmaussion
Copy link
Owner

In addition, if runds_test=ds['precip'].isel(time=0).resample(time='1Y').sum(), the attribute pyproj_srs could not be traced. However, if open the data with salem.open_dataset, the attribute is not lost in calculations

Yes, this is a known issue in salem. You need to use keep_attrs=True in xarray operations, until we implement a new data model for the CRS: https://salem.readthedocs.io/en/stable/xarray_acc.html#keeping-track-of-attributes

Regarding your other issue I'm afraid I can't help much. Maybe plotting with cartopy would be an option?

@singledoggy
Copy link

singledoggy commented Dec 15, 2023

It seems that the ds = xr.open_mfdataset(preprocess=fun) and salem.open_mf_wrf_dataset(preprocess=fun) argument also causes the loss of attributes. I didn't realize this until recently. I'm commenting here to prevent others from encountering the same issue and not knowing how to troubleshoot it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants