Weights calculation with parallel=True still performed locally #399
That would be incredible, if the weights computation could be lazy. It would also solve some issues where the source and/or destination grids are so large that weight generation runs out of memory.

The main reason it is not implemented is that the weight generation happens within ESMF: it is non-python and uses MPI for parallelization. Also, from xESMF's perspective, the weight generation is one monolithic black box; it is not easily divided into dask tasks. (Another important reason is that xESMF doesn't have a real team. We are only three maintainers, and these days we don't do much more than maintain...)

You can go look at the PR: it was part of an internship of @charlesgauthier-udm, and maybe he remembers other limitations in the parallelization of ESMF that I haven't noted?

However, in my personal opinion, this might not be the best way to go. Depending on a quite heavy non-python dependency is a burden (ESMF is not available on Windows, nor on PyPI). Even though I haven't experimented much there, I have the feeling that a better path for regridding would be to go through a pythonic geometry module.
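To make the geometry-based idea concrete, here is a minimal, hypothetical sketch (not xESMF or ESMF code): conservative regridding weights are just normalized overlap measures between source and destination cells, which in 2D a library like shapely could compute per chunk. In 1D the idea reduces to interval overlaps:

```python
# Hypothetical sketch of geometry-based conservative regridding weights.
# In 1D, the weight of source cell j for destination cell i is the overlap
# length of the two intervals, normalized by the destination cell length.
# A 2D version would use polygon intersections (e.g. via shapely) and could
# be split into one dask task per block of destination cells.

def overlap(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def conservative_weights_1d(src_edges, dst_edges):
    """Weight matrix W such that dst_means = W @ src_means."""
    weights = []
    for i in range(len(dst_edges) - 1):
        d0, d1 = dst_edges[i], dst_edges[i + 1]
        row = [
            overlap(src_edges[j], src_edges[j + 1], d0, d1) / (d1 - d0)
            for j in range(len(src_edges) - 1)
        ]
        weights.append(row)
    return weights

# Four unit-width source cells remapped onto two double-width destination
# cells: each destination cell draws half its value from two source cells.
w = conservative_weights_1d([0, 1, 2, 3, 4], [0, 2, 4])
```

Because each row of the weight matrix depends only on the source cells that overlap one destination cell, this formulation chunks naturally along the destination grid, unlike a single monolithic ESMF call.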
I think @aulemahal gave a great overview of the limitations of ESMF parallelization. While we were figuring out parallel weight generation, we did consider dask-mpi as a potential way to couple ESMF's MPI implementation with Dask's lazy arrays, but it ended up being out of reach for the internship. Perhaps it could be of use here.
I was looking into the regridder API for some benchmarks that we are running (see here) and noticed that the weights calculation is always performed client-side and is also materialized before it is put into the graph.
This creates a pretty large task graph if you want to send the computation to a remote cluster, and it is also not very efficient, since you are restricted to your local resources.
I'd be happy to take a stab at keeping the weights calculation lazy and only executing it when needed. I just wanted to check in beforehand to see whether there are reasons for the current implementation that I am not aware of.
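A minimal sketch of what "keeping the weights lazy" could look like, assuming the generation step can be wrapped in a single function. `generate_weights` is a hypothetical stand-in for the ESMF call, not xESMF's actual internals:

```python
# Illustrative sketch: instead of materializing the weight matrix on the
# client and embedding it in the task graph, wrap the generation in
# dask.delayed so the graph carries only a small callable node and the
# matrix is built on whichever worker executes the task.
import dask
import numpy as np

def generate_weights(n_in, n_out):
    """Stand-in for expensive weight generation (here: block averaging)."""
    w = np.zeros((n_out, n_in))
    block = n_in // n_out
    for i in range(n_out):
        w[i, i * block:(i + 1) * block] = 1.0 / block
    return w

# Eager (current behaviour): the full matrix lives on the client and is
# serialized into every graph sent to the scheduler.
w_eager = generate_weights(8, 2)

# Lazy (proposed): only a deferred node enters the graph.
w_lazy = dask.delayed(generate_weights)(8, 2)
out = dask.delayed(np.dot)(w_lazy, np.arange(8.0))
result = out.compute()  # weight generation happens only here
```

The trade-off is that error reporting moves from regridder construction time to compute time, which may be worth flagging in the API discussion.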