Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add conservative resampler #440

Open
zxdawn opened this issue Jul 3, 2022 · 12 comments
Open

Add conservative resampler #440

zxdawn opened this issue Jul 3, 2022 · 12 comments

Comments

@zxdawn
Copy link
Member

zxdawn commented Jul 3, 2022

The conservative resampler is

The value of a destination cell is calculated as the weighted sum of the values of the source cells that it overlaps.

according to ESMPy.

It's also supported in xesmf.

If we can add it to pyresample, it would be nice.

@djhoese
Copy link
Member

djhoese commented Jul 4, 2022

Would you want to use this in satpy? If satpy could access the implementation in xesmf would that be good enough? Would you be using xarray objects with it?

Also, there are some weighted resampling functions already in numpy-based parts of pyresample. Also EWA, which is in pyresample, is a weighted average but requires the data to be scan based (ex. Polar-orbiters).

@zxdawn
Copy link
Member Author

zxdawn commented Jul 4, 2022

Yes, I wanna use it with satpy. I have run it successfully by converting the Scene to Dataset and regridding it by xESMF. Because xESMF needs specific lon/lat/lon_b/lat_b variables, it's harder to access it in Satpy.

And, I tried to reproduce xESMF results using EWA resampler, but the result looks strange. Here's the comparison notebook

@zxdawn
Copy link
Member Author

zxdawn commented Jul 5, 2022

Here's the concept of conservative resampler which considers the weighted average:

image

@djhoese
Copy link
Member

djhoese commented Jul 5, 2022

For EWA, you'd want to play with these parameters:

weight_distance_max (float):
distance in grid cell units at which to
apply a weight of `weight_min`. Default is
1.0. Must be greater than 0.
weight_delta_max (float):
maximum distance in grid cells in each grid
dimension over which to distribute a single swath cell.
Default is 10.0.

Set weight_distance_max to 2.0 and weight_delta_max to 40.0 as a starting point. That's what I use for VIIRS data.

EWA definitely only uses the "center" points of the pixels. It assumes a pixel size based on the curvature of the earth and the rows per scan. So if tropomi isn't scan based then what you've done with setting rows_per_scan to the total number of rows is "good enough", but also means EWA is less likely to make pretty results compared to other algorithms.

We don't have any resampling algorithms in pyresample that use other information beyond the center of the pixel. I think the best route forward would be to add functionality in satpy to allow for this resampler to be used from xesmf. If you could make it have nice defaults for the bounding coordinate information so that sensors without this information could still use it that'd be really cool, but an exception being raised would be fine with me too.

@zxdawn
Copy link
Member Author

zxdawn commented Jul 5, 2022

Got it. For the new conservative resampler, here's my idea: If lon_b and lat_b are not available, we can create them following my notebook method which uses xgcm. I suppose satpy-supported datasets usually just have center points?

Oh ... Maybe it's better to make xESMF create the lon_b and lat_b automatically if they don't exist. Then, it would be easy to call it in pyresample.

@djhoese
Copy link
Member

djhoese commented Jul 5, 2022

If it makes mathematical sense to compute the bounds from the center points then sure that seems interesting. I see your xesmf issue above. Another option is we could make SwathDefinitions be smarter about additional metadata (like rows_per_scan or bounding information) and have methods on it for returning or generating that information from the information contained in the SwathDefinition.

@ghiggi
Copy link
Contributor

ghiggi commented Jul 6, 2022

I was just working these days on something related to that and I have made the following thoughts.

I wonder if we could add methods to AreaDef and SwathDef to retrieve:

  • cell centers arrays (N, M) --> get_lonlats() [ALREADY AVAILABLE]
  • cell corners arrays (N+1, M+1) --> get_lonlats_corners() [MISSING]
  • cell bounds arrays (N, M, nv) --> get_lonlats_bounds() [MISSING]

In addition to being useful for conservative remapping with xESMF, these methods could be used to develop the following ones:

  • SwathDefinition.upsample
  • SwathDefinition.extend
  • area = area_def.get_cell_area() # swath_def.get_cell_area()
  • perimeter = area_def.get_cell_perimeter() # swath_def.get_cell_perimeter()
  • res_x, res_y = get_cell_resolutions() # swath_def.get_cell_resolutions()

For pyresample structured grids, the corner and bounds would be guessed assuming equal spacing on either side of the coordinate labels. Then the question would be if to provide the option to infer the spacing in

  • lat/lon projection
  • the AreaDef projection
  • In geocentric x,y,z coordinates (which would be the default solution I guess)

What do you think?

@zxdawn
Copy link
Member Author

zxdawn commented Jul 6, 2022

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

@djhoese
Copy link
Member

djhoese commented Jul 6, 2022

@ghiggi When you say "For pyresample structured grids", do you mean an AreaDefinition? Then I think any spacing decisions should be in the area definitions projection space. For SwathDefinitions I think you have to do it in geocentric.

@ghiggi
Copy link
Contributor

ghiggi commented Jul 6, 2022

@djhoese ... it was my natural mental separation from resampling approaches related to spherical unstructured grids.
In such a case the (node) coordinate is 1D, and the simplest way to retrieve "a mesh" is to use SphericalVoronoi.
But depending on the knowledge of the background spherical sampling, the mesh can be inferred for the Healpix, (Reduced) Gaussian Legendre, Icosahedral, Octahedral and Equiangular samplings. I think it goes outside the pyresample scope. If one day you want to include also the unstructured grids, I have some code for this scope.

I can draft the methods described above in the coming days :D

@djhoese
Copy link
Member

djhoese commented Jul 6, 2022

Sure. Yeah. I understood that. 😕 <whoosh emoji>

@jbusecke
Copy link

jbusecke commented Jul 7, 2022

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

I have just mentioned that in xarray-contrib/cf-xarray#71 (which might have some interesting overlap with this disussion? Even though I am not sure CF conventions are general enough for the problems described here), but I think the ability to infer grid cell bounds/corners(nodes) should really never have been in xgcm (blame a green OSS contributor named Julius for not realizing that back then...).

Xgcm should consume fully defined finite grids, not produce/infer them. As I mentioned over in the above issue, I am planning on removing the 'autogenerate' functionality from xgcm, but a general replacement would be very beneficial it seems for the broader community.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants