let's enumerate all the ways to represent a "grid" in python #356

Closed
rabernat opened this issue Aug 14, 2018 · 79 comments

@rabernat
Member

As discussed in the Pangeo meeting, there are currently many different models for a grid in python. Some of these are explicit, some are implicit in how packages work. I am talking about both structured and unstructured grids.

Let's try to put together a list of what these are and how they differ. For now, I will just throw out some links:

What else should be on this list?

cc @dopplershift, @bekozi, @rsignell-usgs

@rabernat
Member Author

@fmaussion
Member

I'll add "my" Grid object to the list. Note that this is not an endorsement ;-)

http://salem.readthedocs.io/en/latest/gis.html

@mnlevy1981

It sounds like you guys want a python class that has some method objects to act on the grid rather than simply an xarray schema to use, so this may not be the most helpful... but CESM wants grids in the SCRIP format: that's a netcdf file with very specific fields. From a very old version of the User Guide (Figure 2.2.2 on page 8)

netcdf remap_grid_T42 {
dimensions:
        grid_size = 8192 ;
        grid_corners = 4 ;
        grid_rank = 2 ;
variables:
        long grid_dims(grid_rank) ;
        double grid_center_lat(grid_size) ;
                grid_center_lat:units = "radians" ;
        double grid_center_lon(grid_size) ;
                grid_center_lon:units = "radians" ;
        long grid_imask(grid_size) ;
                grid_imask:units = "unitless" ;
        double grid_corner_lat(grid_size, grid_corners) ;
                grid_corner_lat:units = "radians" ;
        double grid_corner_lon(grid_size, grid_corners) ;
                grid_corner_lon:units = "radians" ;
// global attributes:
                :title = "T42 Gaussian Grid" ;
}

At some point a grid_area field was added -- ESMF mapping tools can be used with SCRIP grid files, and the area field is necessary for conservative remapping. Anyway, I would posit that xr.open_dataset([SCRIP FILE]) is a way to represent a grid in python.
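As a rough illustration of the SCRIP layout above, the sketch below builds arrays with the same field names in memory for a small regular lat-lon grid. The helper name `scrip_fields_from_rectilinear`, the counterclockwise corner ordering, and the (nlon, nlat) order of `grid_dims` are assumptions made for this example; they are not taken from the SCRIP User Guide.

```python
import numpy as np

def scrip_fields_from_rectilinear(lat_edges_deg, lon_edges_deg):
    """Build SCRIP-style arrays (in radians) from 1D cell-edge coordinates.

    Hypothetical helper for illustration only.
    """
    lat_e = np.deg2rad(np.asarray(lat_edges_deg, dtype=float))
    lon_e = np.deg2rad(np.asarray(lon_edges_deg, dtype=float))
    nlat, nlon = lat_e.size - 1, lon_e.size - 1

    # 2D arrays holding the lower and upper edges of every cell, shape (nlat, nlon)
    south, west = np.meshgrid(lat_e[:-1], lon_e[:-1], indexing="ij")
    north, east = np.meshgrid(lat_e[1:], lon_e[1:], indexing="ij")

    # Corners listed counterclockwise from the south-west corner (assumed ordering)
    corner_lat = np.stack([south, south, north, north], axis=-1)
    corner_lon = np.stack([west, east, east, west], axis=-1)

    return {
        "grid_dims": np.array([nlon, nlat]),  # ordering assumed for this sketch
        "grid_center_lat": (0.5 * (south + north)).reshape(-1),
        "grid_center_lon": (0.5 * (west + east)).reshape(-1),
        "grid_imask": np.ones(nlat * nlon, dtype=np.int32),
        "grid_corner_lat": corner_lat.reshape(-1, 4),
        "grid_corner_lon": corner_lon.reshape(-1, 4),
    }

fields = scrip_fields_from_rectilinear([-90, 0, 90], [0, 120, 240, 360])
print(fields["grid_center_lat"].shape, fields["grid_corner_lon"].shape)
```

The flattened `grid_size` dimension is what makes the format usable for both structured and unstructured grids: every cell is just a row with its own center and corners.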

Interestingly enough, SCRIP was originally a full suite of Fortran-based grid tools (SCRIP stands for Spherical Coordinate Remapping and Interpolation Package) though the package itself is no longer supported... presumably because ESMF does everything it used to do.

@fmaussion
Member

@djhoese
Contributor

djhoese commented Aug 15, 2018

@fmaussion Thanks for the mention. So many good projects listed here that I didn't know about. For everyone else, I am one of the maintainers of the pyresample package, which is very similar to some of these other libraries and focuses on resampling large satellite data arrays to new projections (used by the satpy library).

Pyresample has a couple different geometry objects and some are subclasses of others so I'll just list the two most important and most used:

  • SwathDefinition(lons, lats): Unstructured (non-uniform spacing) lon/lat arrays to represent satellite swath data like those from polar-orbiting satellite instruments (VIIRS, MODIS, etc).
  • AreaDefinition (name, proj4_dict, rows, cols, lower left coordinate, upper right coordinate): A projected and uniformly spaced grid of pixels. The datum and projection information is all specified by a dictionary of PROJ.4 parameters. The location of the grid is specified by the extents (the outer edges of the outer edge pixels) and the resolution is determined by how many pixels are specified with the rows and cols parameters. There are other parameters but they aren't useful here.
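A minimal sketch of the arithmetic behind an extent-based grid description like the one in the second bullet: the resolution follows from the extents and the pixel counts, and pixel centers sit half a pixel inside the outer edges. `pixel_centers` is a hypothetical helper written for this thread, not part of pyresample, and it assumes that row 0 is the top row.

```python
def pixel_centers(extent, rows, cols):
    """Return ((res_x, res_y), x_centers, y_centers) for a uniform projected grid.

    extent is (x_min, y_min, x_max, y_max), measured at the OUTER edges of the
    edge pixels. Hypothetical helper; not pyresample API.
    """
    x_min, y_min, x_max, y_max = extent
    res_x = (x_max - x_min) / cols
    res_y = (y_max - y_min) / rows
    # Centers sit half a pixel inside the outer edges.
    x_centers = [x_min + (j + 0.5) * res_x for j in range(cols)]
    # Row 0 is assumed to be the top (largest y) row, as is common for images.
    y_centers = [y_max - (i + 0.5) * res_y for i in range(rows)]
    return (res_x, res_y), x_centers, y_centers

res, xs, ys = pixel_centers((0.0, 0.0, 4000.0, 2000.0), rows=2, cols=4)
print(res, xs, ys)
```

Storing only the extent, shape, and projection is what keeps such a grid cheap to describe even at 22k x 22k pixels; the coordinate arrays can be generated on demand.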

I should also point out a similar issue to this on the pyresample repository: pytroll/pyresample#57
Where it was discussed how users might want to specify their grids/areas. I have an undergrad student employee working on this currently for in-python specification and on-disk specification in a yaml file. This includes things like specifying them with a center point and a radius for scientific users who are used to specifying regions of interest.

For anyone who hasn't been linked to it already this type of information may be useful to the geoxarray project I'm starting (pydata/xarray#2288, https://github.com/geoxarray/geoxarray). I had been considering duplicating some of the work done by pyresample's AreaDefinition and SwathDefinition objects in geoxarray, but don't have anything implemented yet.

So...what is the end goal of this issue?

@djhoese
Contributor

djhoese commented Aug 15, 2018

cc @mraspaud and @leouieda

@rabernat
Member Author

So...what is the end goal of this issue?

My goal for now is just to understand the extent of duplication of effort in developing grid models and to start a discussion on whether a more unified approach would be worthwhile. Some of the downsides of having many different implementations for very similar concepts are:

  • wasted developer time
  • incomplete feature sets (because each package currently implements only the things it needs for the application it has in mind)
  • reduced interoperability between packages

As a concrete example of the lack of interoperability, xgcm, my package, has its own representation for finite volume grid cells (cell center, cell face, etc.). @JiaweiZhuang's xesmf package can do conservative regridding, for which it needs to know about the cell extents. Lacking a universally accepted way to describe these grids, Jiawei invented a [completely reasonable] new convention for specifying the grid cell bounds. Consequently, despite ostensibly knowing about the same type of object, xgcm and xesmf are not interoperable. (I am not trying to pick on Jiawei--xesmf is amazing! Examples could have been drawn from any combination of two packages from the above list.)

Reviewing the docs from the list, I think there are two basic categories of things called "grids":

  • grid as a geographical projection: this is kind of a GIS-based view. The point of the grid objects is to allow for georeferencing / resampling. Focused on remote sensing. proj4 usually involved. [salem, pyresample, geoxarray (?), metpy]
  • finite volume grid: this is a model (GCM) based view. The grid has "cells" which are polygons (rectangles for structured grids) with finite geographic extents. Variables may live at different points on these cells (e.g. face, center, or corner). [esmf, pycomodo, xesmf, xgcm, ugrid, sgrid, metpy, mpas-analysis]

Just as numeric and numarray decided to merge into numpy, it could be beneficial for our [much smaller] software ecosystem to merge some of the above efforts.

@rabernat
Member Author

p.s. Worth noting that definition 2 (finite volume grid) doesn't have to live on the Earth. This type of grid is useful also for models run in cartesian coordinates.

@fmaussion
Member

there are two basic categories of things called "grids":

Right, sorry for spamming this issue if you were just talking about category 2 in the first place.

@pbranson
Member

pbranson commented Aug 16, 2018

Right, sorry for spamming this issue if you were just talking about category 2 in the first place.

Whilst there may be two different 'world views' on grids (one from the modellers' perspective and one from the earth observation perspective), I feel like some of the important scientific questions will be more easily answered if both didn't also succumb to the interoperability issue that @rabernat outlined (which I think is the main point of this issue).

finite volume grid: this is a model (GCM) based view. The grid has "cells" which are polygons (rectangles for structured grids) with finite geographic extents. Variables may live at different points on these cells (e.g. face, center, or corner). [esmf, pycomodo, xesmf, xgcm, ugrid, sgrid, metpy, mpas-analysis]

This could be generalised even further to say that the variables may also be defined on grids, adding additional dimensions beyond space and time (i.e. wavelengths of light, surface wave direction and frequency energy bins etc etc), where a lot of the same properties relating the conservative up- and down-sampling still also apply

So ideally the specification and implementation allows for a hierarchical definition of a grid from 1D through to nD?

...maybe too far down the rabbit hole, and it reminds me of some of the discussions I have read around xarray and the fundamental challenge of dealing with nD-data resulting from the distinction between dimensions (array indices) and coordinates

@djhoese
Contributor

djhoese commented Aug 16, 2018

@rabernat The xgcm versus xesmf gridded is interesting because looking at the two packages' documentation on grids I wouldn't immediately assume they were solving the same problem. Looking at the xesmf documentation the examples for gridding seem to be based on lon/lat grids. The xgcm seems like it handles a lot more cases but being unfamiliar with the package it isn't clear to me how to apply it to my remote sensing problems. I like the idea of describing the grids as DataArray objects, however, in my data cases I usually deal with 1000s x 1000s pixel grids (22k x 22k on the larger end) which isn't always represented well with xarray coordinates (not sure if this is considered large to the other things discussed here). In my satpy library (which uses pyresample) we use dask underneath xarray to help with the memory and performance issues that come up with processing these larger arrays. If we store things in .coords then xarray currently has to compute these which can take up memory and time to load information from the data files, depending on what the data is.

With the "geoxarray" package I'm starting I was going to add an xarray accessor (http://xarray.pydata.org/en/stable/internals.html#extending-xarray) to provide a simple way to access the coordinate reference system (CRS). It sounds like anything that comes out of this issue could work well with "geoxarray" where you could have a grid description from some gridding package and add CRS information to it in a way that geoxarray could support.

The brainstorm is starting...

@martindurant
Contributor

to provide a simple way to access the coordinate reference system (CRS)

I would request that you also keep in mind astronomical World Coordinate Systems (WCSs), normally stored in the headers of FITS files. I don't know how similar the types of coordinate mappings might be; for astronomy FITS, it is often affine, but could be higher-order polynomials.

@rabernat
Member Author

The xgcm versus xesmf gridded is interesting because looking at the two packages' documentation on grids I wouldn't immediately assume they were solving the same problem. Looking at the xesmf documentation the examples for gridding seem to be based on lon/lat grids. The xgcm seems like it handles a lot more cases

This is kind of my point. I implemented the parts of a "grid" that I cared about immediately for my applications (focused on finite differencing and related operations on the native grid space), but have not yet implemented the more "geographical" aspects of grids.

@rabernat
Member Author

Oh we should also add all the ways that geoviews handles grids:

cc @philippjfr

@djhoese
Contributor

djhoese commented Aug 16, 2018

@martindurant I won't exclude the idea, but I'm also not sure I know the differences needed by the astronomy community. I read a little bit of astropy's wcs documentation, but I'm not sure I have a great grasp on the concepts quite yet.

@rabernat Most of that looks like 2D lon/lat grids with values for each pixel.

I also wanted to point out that in the remote sensing field we typically deal with data as an "area" where one data point represents an area. The other common one is a single point. It is assumed that the coordinate for a pixel represents the center of the pixel/area/cell.

@martindurant
Contributor

The astro WCS is basically an object encoding an (arbitrary) function for transforming pixel coords to logical coords and usually also the inverse. In the simplest case, there will be a few keys in the file metadata like reference logical x/y, reference pix x/y, scale, but xarray doesn't need to know how things are actually stored. As was discussed at scipy, the problem comes when the coords can't be expressed as a simple grid of values at each pix location; I expect you are facing exactly the same problem.
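For the simplest case described above (a reference logical coordinate, a reference pixel, and a scale), the mapping and its inverse can be sketched as follows. `LinearWCS` is a toy class invented for this illustration; it is not astropy's WCS API and it ignores rotation, projection, and higher-order terms.

```python
from dataclasses import dataclass

@dataclass
class LinearWCS:
    """Toy pixel <-> logical mapping with one reference point and a scale per axis."""
    ref_pix: tuple    # reference pixel (x, y)
    ref_world: tuple  # logical coordinate at the reference pixel
    scale: tuple      # logical units per pixel along (x, y)

    def pixel_to_world(self, px, py):
        wx = self.ref_world[0] + (px - self.ref_pix[0]) * self.scale[0]
        wy = self.ref_world[1] + (py - self.ref_pix[1]) * self.scale[1]
        return wx, wy

    def world_to_pixel(self, wx, wy):
        # Exact inverse of pixel_to_world for non-zero scales.
        px = self.ref_pix[0] + (wx - self.ref_world[0]) / self.scale[0]
        py = self.ref_pix[1] + (wy - self.ref_world[1]) / self.scale[1]
        return px, py

wcs = LinearWCS(ref_pix=(10.0, 20.0), ref_world=(150.0, -30.0), scale=(0.5, -0.5))
print(wcs.pixel_to_world(12.0, 20.0))
print(wcs.world_to_pixel(151.0, -30.0))
```

The point of keeping the mapping as an object with two methods is that callers never need a materialized coordinate array, which is exactly what breaks down once the transform is no longer a simple per-pixel grid of values.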

@djhoese
Contributor

djhoese commented Aug 16, 2018

Was this discussed at a BOF? I missed one I wanted to go to the first day because I had to represent vispy at the visualization BOF. When you say "can't be expressed" do you mean the coordinates have to be specified for each pixel (non-uniform spacing)?

@bekozi

bekozi commented Aug 17, 2018

This has been a very interesting discussion so far. The ESMF mesh/grid model can serve as a useful reference point having been built up over years of use cases. This is not to say it is the best user-facing grid model! I expect ESMPy will keep pretty close to ESMF internals with xESMF being the best way to interact with xarray packages. With ESMF, we are working to integrate MOAB as our mesh backend.

With the "geoxarray" package I'm starting I was going to add an xarray accessor to provide a simple way to access the coordinate reference system (CRS).

@djhoese Nice. Are you standardizing on CF convention with PROJ.4 or something? At the workshop, I talked with @dopplershift and @niallrobinson about creating packages similar to cftime addressing the CRS issue and general metadata interpretation. Are you aiming for something similar?

Looking at the xesmf documentation the examples for gridding seem to be based on lon/lat grids.

@djhoese Currently xESMF supports a subset of the features available in ESMPy...

maybe too far down the rabbit hole

@pbranson I don't think so at least. Time-varying grid/meshes have applications in sea level rise modeling for example. It is usually the edge cases that break conventions!

xgcm and xesmf are not interoperable [for conservative regridding]

@rabernat & @JiaweiZhuang, is the xESMF/xGCM conservative regridding compatibility issue related to how corners are handled? I expect xESMF has corners with a (lat+1, lon+1) dimension similar to ESMPy while xGCM maybe does (lat, lon, corner_count). The ESMF corners method was chosen to reduce coordinate duplication, but it is annoying to use. I have some Python functions to move between the two representations that might be helpful. Would xGCM be the right place to add this? Maybe this has long since been addressed...

@djhoese
Contributor

djhoese commented Aug 17, 2018

@bekozi It is probably best to skim through the related xarray issue here: pydata/xarray#2288

But yes, the idea would be that you could do my_data_arr.geo.crs and get a CRS object that is dynamically created from the contents of your DataArray/Dataset object. My hope is that it would handle multiple common ways of describing projections including the CF standard way (grid_mapping, etc) and my usual use case of PROJ.4 strings/dicts. With the optional dependency of rasterio I'm sure it could also convert to WKT and other formats. The library would provide CF NetCDF writing too on top of the existing to_netcdf provided by xarray, at least that's the idea. All the other common grid parameters we would need for describing a data grid are expected in the DataArray metadata (x/y pixel center coordinates and array shape).
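Since PROJ.4 strings and dicts are mentioned as interchangeable inputs, here is a small sketch of converting one into the other. `proj4_str_to_dict` is a hypothetical standalone helper written for this comment; it is not the geoxarray, pyproj, or pyresample API, and it does no validation of the parameters.

```python
def proj4_str_to_dict(proj4_str):
    """Parse '+key=value +flag' tokens into a dict (hypothetical helper)."""
    params = {}
    for token in proj4_str.split():
        token = token.lstrip("+")
        if not token:
            continue
        key, sep, value = token.partition("=")
        if not sep:
            params[key] = True  # bare flags such as +no_defs
            continue
        try:
            params[key] = float(value)  # numeric parameters, e.g. lat_0
        except ValueError:
            params[key] = value         # names, e.g. proj or datum
    return params

print(proj4_str_to_dict("+proj=stere +lat_0=90 +lon_0=-45 +datum=WGS84 +no_defs"))
```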

I'm not sure how much PROJ.4 and CF projection works with astronomy WCS stuff (@martindurant).

Now if only I had the time to write it...

@JiaweiZhuang
Member

is the xESMF/xGCM conservative regridding compatibility issue related to how corners are handled? I expect xESMF has corners with a (lat+1, lon+1) dimension similar to ESMPy while xGCM maybe does (lat, lon, corner_count).

Not sure how xgcm represents corners internally, but (lat+1, lon+1) vs (lat, lon, 4) is indeed an annoying issue (pydata/xarray#1475).

The "incompatibility" is largely because of the totally different purposes of the two packages. xgcm has a strong focus on finite-difference calculations, which requires switching between staggering locations and the connectivity information between (cubed-sphere) panels. It does store corner information nicely, but most of the other features are overkill for xESMF. xESMF just needs a plain (lat+1, lon+1) numpy array to compute cell overlapping for conservative regridding, not much else.
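To make the (lat+1, lon+1) versus (lat, lon, 4) distinction concrete, here is a sketch of converting between the two corner layouts with plain numpy. The function names and the counterclockwise corner order are assumptions for this example; neither function comes from xESMF, ESMPy, or xgcm, and the reverse direction only works when neighbouring cells really share their corners.

```python
import numpy as np

def corners_2d_to_cellwise(corner):
    """(ny+1, nx+1) shared-corner array -> (ny, nx, 4) per-cell corners."""
    corner = np.asarray(corner)
    # Counterclockwise from the lower-left corner (assumed ordering)
    return np.stack(
        [corner[:-1, :-1], corner[:-1, 1:], corner[1:, 1:], corner[1:, :-1]],
        axis=-1,
    )

def cellwise_to_corners_2d(cells):
    """(ny, nx, 4) per-cell corners -> (ny+1, nx+1), assuming shared corners."""
    cells = np.asarray(cells)
    ny, nx, _ = cells.shape
    corner = np.empty((ny + 1, nx + 1), dtype=cells.dtype)
    corner[:-1, :-1] = cells[..., 0]       # lower-left corner of every cell
    corner[:-1, -1] = cells[:, -1, 1]      # lower-right corners of the last column
    corner[-1, -1] = cells[-1, -1, 2]      # upper-right corner of the last cell
    corner[-1, :-1] = cells[-1, :, 3]      # upper-left corners of the last row
    return corner

lon_b = np.arange(12.0).reshape(3, 4)
cells = corners_2d_to_cellwise(lon_b)
print(cells.shape, cells[0, 0])
```

The shared layout stores each corner once, while the per-cell layout repeats interior corners up to four times but also works for grids whose cells do not tile neatly.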

I want to keep the absolutely simple user interface for xESMF and minimize the dependency on advanced data structure (pure numpy arrays should always work). I got many users migrated from MATLAB/NCL/IDL and they just barely know xarray and even numpy. Asking them to learn other fancier packages like xgcm will blow their mind... Maybe the interoperability with other packages can be built on top of the current interface?

@JiaweiZhuang
Member

JiaweiZhuang commented Aug 18, 2018

Given the diversity of user needs and the huge cultural gap between different research communities, I feel that it is almost impossible to invent a one-for-all representation of "grid".

The xgcm vs xESMF issue is a rather small cultural gap: Many xESMF users are from the environment & air quality modeling community (my field of study); they only care about conservatively regridding chemical concentration and emissions fields, and never worry about taking spatial derivatives (vorticity, divergence). So they will have little incentive to learn the additional features that xgcm provides.

A much bigger cultural gap is between GIS and numerical modeling. pydata/xarray#2288 is indeed a great proposal from a GIS perspective, but the modeling people mostly just think of a "grid" as simple 1D/2D arrays of latitude&longitude. GIS terms like "datum", "geodetic" and even simply "coordinate reference system" can feel quite exotic to atmospheric modeling people. The major reason is that atmospheric models always assume a perfect sphere, so people never worry about the choice of geodetic datum as in GIS (although sometimes they probably should; see Monaghan et al. 2013 and Cao et al. 2017).

So, for the comment:

Looking at the xesmf documentation the examples for gridding seem to be based on lon/lat grids.

Exactly, because xESMF (recall the name Earth System Modeling Framework) is written more for the numerical modeling people. On the contrary, GIS people will probably find xESMF inconvenient, because the grid has to be expressed in true lat/lon values on a perfect sphere.

@rabernat
Member Author

Asking them to learn other fancier packages like xgcm will blow their mind

@JiaweiZhuang -- thanks for your response. I guess it was not sufficiently clear from my message that I am not proposing to merge any existing packages tomorrow or force your users to adopt an unfamiliar new tool. xesmf is an amazing piece of software which has already developed a wide following. The last thing I would want would be to undermine this success in any way.

I am mostly concerned here with the internals of these packages, i.e. the code that the developers have to write to represent a grid. So yes, while xgcm and xesmf currently have very different user-facing goals, internally, they both had to create very similar data structures related to the representation of finite volume grid cells. (And so did many of the other packages on my list.)

I agree that the distinction between "GIS grids" and "GCM grids" is an important one. I think this is a useful outcome of this discussion.

@JiaweiZhuang
Member

JiaweiZhuang commented Aug 18, 2018

@rabernat Your clarification is clear and I think we are on the same page (Also, thanks very much for the kind words about xESMF!)

I am mostly concerned here with the internals of these packages

A useful thing would be identifying small, individual, commonly-used functionalities for grid operations, and adding them to xarray's core or a very small extension module. For example:

  1. Validate the continuity of grid coordinate values

Right now it is easy to create an illegal coordinate with discontinuous/non-monotonic values:

import xarray as xr
dr = xr.DataArray([1,1,1,1], coords={'x': [1,2,4,3]}, dims='x')

It will only throw an error at plotting, not at creation.

Seems like xgcm Grid() can also accept this kind of object.

In [21]: ds  # x_c is not monotonic
Out[21]:
<xarray.Dataset>
Dimensions:  (x_c: 9, x_g: 9)
Coordinates:
  * x_c      (x_c) int64 1 2 3 9 8 7 6 5 4
  * x_g      (x_g) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5
Data variables:
    *empty*

In [22]: Grid(ds)
Out[22]: <xgcm.Grid>

Discontinuity in the grid coordinates once led to a bug in xESMF (JiaweiZhuang/xESMF#16). It would be nice to have something like xr.assert_valid_coords().
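A sketch of what such a check could look like for a single 1D coordinate. Note that `xr.assert_valid_coords()` is only a proposal in this comment, not an existing xarray function; `assert_strictly_monotonic` below is a hypothetical standalone helper that works on plain array-likes.

```python
import numpy as np

def assert_strictly_monotonic(values, name="coordinate"):
    """Raise ValueError unless a 1D coordinate is strictly increasing or decreasing."""
    values = np.asarray(values, dtype=float)
    if values.ndim != 1:
        raise ValueError(f"{name} must be 1D, got {values.ndim}D")
    steps = np.diff(values)
    # Repeated values (step == 0) are rejected along with direction changes.
    if not (np.all(steps > 0) or np.all(steps < 0)):
        raise ValueError(f"{name} is not strictly monotonic")

assert_strictly_monotonic([1, 2, 3, 4], name="x")      # passes silently
try:
    assert_strictly_monotonic([1, 2, 4, 3], name="x")  # the example above
except ValueError as err:
    print(err)
```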

  2. Validate the consistency between cell centers and corners

For a grid object passed to xESMF, its center coordinates are totally decoupled from its corner coordinates (they are just separate numpy arrays). A consistency checking ("is the mean of the cell corners equal to the cell centers?") is quite useful for preventing data mismatch and for inferring cell boundaries (JiaweiZhuang/xESMF#5 (comment)). Not sure how xgcm ensures data consistency.
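The "mean of the corners equals the center" test can be sketched as below for a 2D structured grid with (ny, nx) centers and (ny+1, nx+1) corners. `centers_match_corners` is a hypothetical helper, not part of xESMF or xgcm. It uses a naive arithmetic mean, so it would need extra care for longitudes that wrap across the antimeridian or for cells touching a pole.

```python
import numpy as np

def centers_match_corners(center, corner, atol=1e-8):
    """Check that each cell center equals the mean of its four corners.

    center has shape (ny, nx); corner has shape (ny+1, nx+1).
    Hypothetical helper for illustration.
    """
    center = np.asarray(center, dtype=float)
    corner = np.asarray(corner, dtype=float)
    if corner.shape != (center.shape[0] + 1, center.shape[1] + 1):
        return False  # shapes alone already show a mismatch
    corner_mean = 0.25 * (
        corner[:-1, :-1] + corner[:-1, 1:] + corner[1:, 1:] + corner[1:, :-1]
    )
    return bool(np.allclose(center, corner_mean, atol=atol))

lon_b, lat_b = np.meshgrid(np.arange(0.0, 5.0), np.arange(0.0, 4.0))
lon_c, lat_c = np.meshgrid(np.arange(0.5, 4.0), np.arange(0.5, 3.0))
print(centers_match_corners(lon_c, lon_b), centers_match_corners(lon_c + 0.1, lon_b))
```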


Actually, xarray already has _is_monotonic and _infer_interval_breaks in xarray/plot/plot.py that kind of do what I want, although they still need to consider special cases such as curvilinear mesh over the pole. Maybe these internal functions can be polished and made more explicit, like xr.apply_ufunc() that is used both internally and externally.
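For reference, the basic idea behind inferring cell boundaries from centers along one axis is to take midpoints between neighbours and extrapolate half a step at both ends. The sketch below is written from scratch to show that idea; it is not a copy of xarray's private `_infer_interval_breaks`, and like any such inference it is only exact for uniformly spaced centers.

```python
import numpy as np

def infer_breaks_1d(centers):
    """Estimate n+1 cell edges from n cell centers along one axis."""
    centers = np.asarray(centers, dtype=float)
    if centers.size < 2:
        raise ValueError("need at least two centers to infer a spacing")
    mid = 0.5 * (centers[:-1] + centers[1:])    # interior edges are midpoints
    first = centers[0] - (mid[0] - centers[0])  # mirror the first half step outward
    last = centers[-1] + (centers[-1] - mid[-1])
    return np.concatenate([[first], mid, [last]])

print(infer_breaks_1d([0.5, 1.5, 2.5]))
print(infer_breaks_1d([0.0, 1.0, 3.0]))
```

The second call shows why this is an approximation: for non-uniform spacing the true cell edges cannot be recovered from the centers alone, so the result is only one plausible choice.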

Those functionalities are pretty general for all grid objects, regardless of actual data presentation. This doesn't address the issue of diverging data structures, but should avoid some duplicated efforts.

@fmaussion
Member

A much bigger cultural gap is between GIS and numerical modeling. pydata/xarray#2288 is indeed a great proposal from a GIS perspective, but the modeling people mostly just think of a "grid" as simple 1D/2D arrays of latitude&longitude. GIS terms like "datum", "geodetic" and even simply "coordinate reference system" can feel quite exotic to atmospheric modeling people. The major reason is that atmospheric models always assume a perfect sphere, so people never worry about the choice of geodetic datum as in GIS (although sometimes they probably should; see Monaghan et al. 2013 and Cao et al. 2017).

I have to comment on this, because I think this is only partly true. Yes, atmospheric modeling people often do not know what a map projection is, or a "coordinate reference system". But this is causing them a lot of trouble: most (all?) limited area models like WRF do think of the earth as a sphere indeed, but they use a map projection to represent their grid, which is cartesian in eastings-northings (the map projection reference frame) and irregular in lon-lat. People unaware of CRS will deal with the issue as follows:

  • use the 2D lons and lats to plot them on a cartopy map (which is already slightly wrong because it requires using the _infer_interval_breaks trick from xarray, which is only an approximation)
  • regrid them to a regular lon-lat grid (maybe using tools like xesmf) in order to make comparisons with other gridded datasets

The much better way to deal with this would be to stay in the original map projection and regrid the other datasets you want to compare your model to (e.g. satellite products) into the original model projection.

I was surprised to see that a blog post I wrote about WRF map projections is the most visited entry on my webpage. I think that many people in the python/xarray ecosystem would benefit from a unified handling of CRS in a tool which will hide all this complexity from them. Combined with a tool like pyresample or xesmf they would have all they need to work in cartesian grids instead of relying on the approximations of lon/lat grids.

@JiaweiZhuang
Member

JiaweiZhuang commented Aug 22, 2018

@fmaussion Thanks, I really enjoy your blog post. But I have a somewhat opposite opinion on this.

But this is causing them a lot of trouble: most (all?) limited area models like WRF do think of the earth as a sphere indeed, but they use a map projection to represent their grid, which is cartesian in eastings-northings (the map projection reference frame) and irregular in lon-lat.

The much better way to deal with this would be to stay in the original map projection and regrid the other datasets you want to compare your model to (e.g. satellite products) into the original model projection.

Whether to use true lat/lon or the projected coordinates largely depends on the problem a user wants to solve. For common questions like "what's the temperature at this location", using lat/lon is way more convenient/intuitive because the location of a city or an observation site is mostly expressed in lat/lon. More concretely, the xarray.Dataset should have coordinates in true lat-lon values, so users can simply call ds.sel(lat=a, lon=b). For the trouble with WRF, I'd rather blame WRF's confusing output format than blame users.

Directly using projected coordinates seems quite rare in atmospheric modeling. The only use case I am aware of is deriving numerical schemes on curvilinear grids. For example, instead of using lat/lon to describe a cubed-sphere panel, you can use local, orthogonal variables ranging from -1 to 1, to make the math much cleaner (ref: Section 3.1. Gnomonic projection in Finite-volume transport on various cubed-sphere grids). But this notation is never exposed to users -- cubed-sphere models like FV3 write output data with true lat/lon coordinates.

For conservative regridding, it is important to stay in the true lat/lon space. Most projections, except equal-area ones, do not preserve the cell area. A conformal projection preserves local shape but distorts the area ratio between cells. Doing conservative regridding in the projected space does not preserve the area-weighted sum on the true sphere.
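A small worked example of why the area matters, using only the standard library. On a perfect sphere of radius R, a lat-lon box has area R² Δλ (sin φ_north − sin φ_south), so two 10° x 10° cells that look identical on a plain lat-lon plot carry very different weights. The function names and the sample values are made up for this illustration.

```python
import math

def cell_area_on_sphere(lat_s, lat_n, lon_w, lon_e, radius=1.0):
    """Exact area of a lat-lon box on a perfect sphere (inputs in degrees)."""
    dlon = math.radians(lon_e - lon_w)
    dsin = math.sin(math.radians(lat_n)) - math.sin(math.radians(lat_s))
    return radius**2 * dlon * dsin

def area_weighted_mean(values, areas):
    """Mean of cell values weighted by cell area."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)

# Two 10 x 10 degree cells: one at the equator, one at 60-70 N.
areas = [cell_area_on_sphere(0, 10, 0, 10), cell_area_on_sphere(60, 70, 0, 10)]
values = [1.0, 3.0]  # made-up concentrations

print("area ratio:", areas[0] / areas[1])
print("plain mean:", sum(values) / len(values))
print("area-weighted mean:", area_weighted_mean(values, areas))
```

The plain mean treats both cells equally, which is what happens implicitly when cell areas are measured in a space that does not preserve them.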

use the 2D lons and lats to plot them on a cartopy map (which is already slightly wrong because it requires using the _infer_interval_breaks trick from xarray, which is only an approximation)

This plotting issue is because plt.pcolormesh's plotting mechanism uses cell boundaries (MITgcm/xmitgcm#15 (comment)), not because of the use of lat/lon instead of projected coordinates. Even on orthogonal, regular grid, you still need to pass boundaries to plt.pcolormesh.

I think that many people in the python/xarray ecosystem would benefit from a unified handling of CRS in a tool which will hide all this complexity from them.

Instead of a "unified handling" between GIS and modeling, I'd prefer having a convenient converter between the two, while allowing each community to keep their existing culture.

@djhoese
Contributor

djhoese commented Aug 22, 2018

Instead of a "unified handling" between GIS and modeling, I'd prefer having a convenient converter between the two, while allowing each community to keep their existing culture.

I see these as the same thing and that is at least the initial goal for geoxarray. You have lon/lat coordinates, great use them. You have projection information following the CF standard, go for it. You have a PROJ.4 string and X/Y coordinates, you can use that too. So the data is in the format the user is used to, but libraries can use a unified tool for handling the conversion and handling the data.

@rabernat
Member Author

cc @adcroft, who may be interested in the discussion

@stale

stale bot commented Oct 29, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@rabernat
Member Author

Definitely not resolved, stale bot!

@Meteodan

Meteodan commented Jul 2, 2019

Hi all,

I've been reading this thread with great interest, since I am currently in the process of refactoring some old kludgy code from my graduate school days that handles analysis and plotting of high-resolution cloud-resolving model output. The CRMs I work with all have the common thread that they are based on a staggered grid in the horizontal and vertical (specifically an Arakawa C-grid). But many of them are designed to run in "idealized" mode where the grid is not typically georeferenced, by which I mean it does not specify the horizontal coordinates in terms of lat/lon, but rather a more generic physical rectilinear grid. I personally run these and other models both in "idealized" mode and in the "real-data" geo-referenced mode discussed in this thread, and I know this is very common, especially in the high-resolution storm modeling community. So first, I just wanted to bring up this particular use case to add to the very good discussion above, since I think it was mentioned only in passing.

Second, my code does handle locating and plotting the various variables on their native grid, whether it be the cell centers or faces, but it does so in a rather application-specific and kludgy way. I plan on putting it up on GitHub as soon as I can get through this current refactoring session, which is long overdue. Hopefully that won't take too terribly long. More to the point though, when I started the refactoring, I thought, "Surely someone has written a library somewhere that handles all the grid-related manipulations associated with staggered grids already, so I may not have to waste as much time cleaning my own implementation up". That's when I stumbled on @ChrisBarker-NOAA 's gridded package and later this thread. In my admittedly cursory look so far, it's not clear to me whether the "gridded" package can handle the aforementioned case where the grid is not represented in lat/lon coordinates. I could generate "fake" lat/lon coordinates using, say, a flat earth approximation, etc., but I was hoping there would be a more "native" way.

Any insight and discussions would be greatly appreciated, and I'm happy to continue this sort of discussion and help out in the larger efforts being discussed here as much as I can.

@ChrisBarker-NOAA

@Meteodan:

Gridded does currently assume lat-lon coords. But other than the naming, it is actually treating them as orthogonal coordinates. That is, not handling projections or wrapping around the earth, etc.

So it would be easy to adapt to other coordinates. In fact, I’d like to do that anyway to support projected coordinates.

@Meteodan

Meteodan commented Jul 2, 2019

@ChrisBarker-NOAA

Thanks for the reply! So if I'm understanding correctly, I could just pass in my coordinate arrays for the node_lat, node_lon, etc. keyword arguments and it'll just work?

Perhaps this isn't the best place for this, but is there a way to use gridded to compute "corner" points of a staggered grid? I need this for using matplotlib pcolor. Would it be as simple as just passing in the appropriate coordinate arrays for the edges/centers when constructing a Variable object and selecting them appropriately in the call to pcolor?
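For a uniform, non-georeferenced staggered axis, the center and face (corner) coordinates are simple enough to generate directly, independent of any library. The sketch below assumes a uniform spacing in physical units; `c_grid_axes` is a hypothetical helper and says nothing about how gridded itself stores or computes these points.

```python
def c_grid_axes(n_cells, spacing, origin=0.0):
    """1D coordinates for a uniform staggered axis in physical units (e.g. m).

    Returns (centers, faces): n_cells centers and n_cells + 1 faces.
    Hypothetical helper; not part of gridded.
    """
    faces = [origin + i * spacing for i in range(n_cells + 1)]
    centers = [origin + (i + 0.5) * spacing for i in range(n_cells)]
    return centers, faces

x_c, x_f = c_grid_axes(4, 250.0)
y_c, y_f = c_grid_axes(3, 250.0)

# On an Arakawa C-grid in this layout:
#   scalars live at (y_c, x_c), u at (y_c, x_f), v at (y_f, x_c),
#   and the cell corners are at (y_f, x_f).
print(x_c, x_f)
```

For a scalar field of shape (len(y_c), len(x_c)), the face arrays `x_f` and `y_f` are the corner coordinates that matplotlib's pcolor/pcolormesh expect, one element longer than the data along each axis.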

@stale

stale bot commented Aug 31, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 31, 2019
@stale

stale bot commented Sep 7, 2019

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

@stale stale bot closed this as completed Sep 7, 2019
@rabernat
Member Author

I just learned about some new packages for geo-aware grid interpolation.

@rabernat rabernat reopened this Sep 20, 2019
@stale stale bot removed the stale label Sep 20, 2019
@yosoyjay

yosoyjay commented Nov 18, 2019

Summarizing some offline discussion with @rabernat, @kwilcox, @hetland and @ocefpaf on efficient and accurate approaches to interpolation and the requisite identification of a point-in-a-cell in unstructured triangular, curvilinear quads, and mixed types (tris + quads):

cc @rsignell-usgs

@ChrisBarker-NOAA

and suggests using a Cell Tree method (Garth and Joy, 2010) as a solution.

Implementation in 2D here:

https://github.com/NOAA-ORR-ERD/cell_tree2d

(and on pip and conda-forge)

Not all that well documented, but ask if you want to figure out how to use it.

We are using it operationally for triangular and quad grids, via gridded.

The tree itself is built on bounding boxes, so you can use any shape of cell, as long as you provide a way to do a final "point in cell" check.
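To make the two-stage idea concrete, here is a minimal sketch of a bounding-box pre-filter followed by an exact point-in-cell check. It is not the cell_tree2d implementation and does not use its API: it scans cells linearly instead of building a tree, and it assumes convex cells. The function names and the tiny mixed mesh are invented for illustration.

```python
import numpy as np


def point_in_convex_cell(pt, verts):
    """True if pt is inside (or on the edge of) a convex polygon.

    verts: (k, 2) vertices in a consistent order (either winding).
    """
    v = np.asarray(verts, dtype=float)
    edges = np.roll(v, -1, axis=0) - v
    to_pt = np.asarray(pt, dtype=float) - v
    # Sign of the 2D cross product tells which side of each edge pt is on.
    cross = edges[:, 0] * to_pt[:, 1] - edges[:, 1] * to_pt[:, 0]
    return bool(np.all(cross >= 0) or np.all(cross <= 0))


def locate(pt, nodes, cells):
    """Return the index of the cell containing pt, or -1 if none does."""
    pt = np.asarray(pt, dtype=float)
    for i, cell in enumerate(cells):
        verts = nodes[list(cell)]
        lo, hi = verts.min(axis=0), verts.max(axis=0)
        if np.all(pt >= lo) and np.all(pt <= hi):   # cheap bounding-box test
            if point_in_convex_cell(pt, verts):     # exact final check
                return i
    return -1


# Hypothetical mixed mesh: one quad and one triangle sharing an edge.
nodes = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [2, 0.5]], dtype=float)
cells = [(0, 1, 2, 3), (1, 4, 2)]

in_quad = locate((0.5, 0.5), nodes, cells)
in_tri = locate((1.5, 0.5), nodes, cells)
bbox_only = locate((1.9, 0.9), nodes, cells)  # inside the bbox, outside the cell
```

A real cell tree replaces the linear loop with a hierarchy of bounding boxes so that only a handful of candidate cells reach the exact check; the final check is the only part that depends on the cell shape.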

The paper does talk about it being amenable to multiprocessing, but it was not obvious to us how to do that, and we didn't try -- GPU or conventional.

@ChrisBarker-NOAA

@Meteodan:

in reply to the above -- I think so -- but post an issue on the gridded repo to discuss.

https://github.com/NOAA-ORR-ERD/gridded

-CHB

@hetland
Member

hetland commented Nov 20, 2019

I wrote some thoughts on interpolating from a quad grid to random points over on the discourse site.

Many of these topics are already covered here and elsewhere, but I wanted to gather these ideas and put them in one place. Also, since it's just a discussion of algorithms, not specific code or packages, discourse seemed more appropriate. I'd be happy to move the discussion here if that's more appropriate.

@botzill

botzill commented Dec 3, 2019

Hi.

Sorry if this is a basic question; I'm new to this. Can any of the tools listed here help me divide the entire world map into regions (rectangles, or circles with radius R)? I'd like to be able to vary R to get different numbers of regions. In the end I would like to get the coordinates (lat, lon) of these regions, or the corner coordinates of the rectangles (top-left and bottom-right), also as lat, lon. Any pointers in the right direction would be really appreciated.

Thx!
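One rough way to approach the rectangle version of this question is a plain lat/lon tiling, sketched below with NumPy only. It is an approximation, not an equal-area solution: boxes of a fixed size in degrees cover less ground toward the poles. The function name `latlon_boxes` and the `size_deg` parameter are assumptions made for the example, with `size_deg` playing the role of R expressed in degrees.

```python
import numpy as np


def latlon_boxes(size_deg):
    """List (lat_top, lon_left, lat_bottom, lon_right) boxes covering the globe.

    size_deg should divide 180 evenly so that the boxes tile exactly.
    """
    lats = np.arange(-90.0, 90.0, size_deg)     # bottom edge of each row
    lons = np.arange(-180.0, 180.0, size_deg)   # left edge of each column
    return [
        (float(lat + size_deg), float(lon), float(lat), float(lon + size_deg))
        for lat in lats
        for lon in lons
    ]


boxes = latlon_boxes(30.0)   # a smaller size_deg gives more, smaller regions
```

Each tuple holds the top-left and bottom-right corners of one box, so changing `size_deg` directly changes the number of regions.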

@ChrisBarker-NOAA

ChrisBarker-NOAA commented Dec 3, 2019 via email

@botzill

botzill commented Dec 3, 2019

Thx @ChrisBarker-NOAA.

I know it will not be perfect, but that is not an issue: we can tolerate some overlap. I'm not looking for a perfect result; an approximation works as well.

Thx.

@stale

stale bot commented Feb 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 1, 2020
@stale

stale bot commented Feb 8, 2020

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

@bluetyson

Interesting discussion, thanks!

@bluetyson

Came across this while looking for anyone who might have worked on grid merging that deals with edge effects.

@hetland
Member

hetland commented Oct 5, 2020

I'm not sure exactly what you mean. Can you provide an example?

@RichardScottOZ

Things like Grid Merge in Intrepid, but an open-source version, rather than rolling your own mathematically.
